Studying the Topology of Transportation Systems through Complex Networks: Handle with Care



Introduction
In recent years, the topological structure of different transportation systems has become an important topic of research. This is the result of the convergence of two different lines of work. On one hand, the improvement in computational and data storage resources has allowed the transportation research community to gain access to large amounts of real data, enabling the detailed description of those systems at different time and spatial scales. On the other hand, there has been a great effort from the statistical physics community in analysing the structure and dynamics of both theoretical and real complex networks [1][2][3]. It then became clear that most complex systems, i.e., those composed of multiple interacting elements, cannot be fully understood by a reductionist approach, in which the composing elements are studied independently. In order to understand and predict the collective (or emergent) dynamics, it is instead necessary to include information about how those elements interact with one another, and about how different connectivity patterns influence such dynamics.
The convergence of both research fields has resulted in a paradigm shift in the way transportation systems are conceptualised and analysed. It became clear that these are complex systems and that the focus ought to be moved from one transportation unit (e.g., an aircraft, a car, or a bus) to the global structure of interconnections that those units generate. Consequently, the generation and absorption of delays cease to be local phenomena, i.e., the result of the dynamics of a single aircraft, and become a propagation process conceptually similar to disease spreading. Similarly, the cancellation of a flight or the closure of an airport can be studied for their global consequences, i.e., the changes in the mobility patterns across the whole system, instead of just quantifying the number of directly affected passengers.
Although fruitful, this convergence also hides pitfalls and difficulties. These come from two fronts. Firstly, complex network theory was not developed with a specific application in mind; it is instead a general framework for understanding interacting systems. A statistical physicist must then take into account the fact that not all complex network concepts are applicable to the transportation context and that some adaptation may be required. Secondly, even if prima facie simple, complex network theory is based on a strong mathematical scaffolding that cannot be circumvented. The transportation scientist must then be aware of many theoretical requirements, such as the application of suitable statistical tests, to ensure that the results obtained are meaningful.
Among the hundreds of contributions that have appeared in the last decade about the use of complex networks to understand transportation systems, a significant number present one or more problems that make it difficult to interpret their results. These problems are not limited to trivial research works: on the contrary, they can be found in recent publications and in highly respected journals. In this work we aim at fostering a debate around them, by raising awareness in the scientific community and eventually helping to develop novel solutions. For the sake of compactness, this debate has been focused on the topological properties of transportation systems, as this is the most basic and easily understandable application of complex network theory. These problems have been organised around two major topics: (i) the assessment of topological properties of the networks, including scale-freeness (Section 2) and other basic characteristics (Section 3), and (ii) the study of the robustness and resilience of transportation systems, in terms of both the metrics used (Section 4) and the terminology (Section 5). Six real-world datasets are further used to illustrate these pitfalls. We finally draw conclusions in Section 6.

Assessing the Scale-Freeness of Transport Networks
2.1. Common Pitfalls and Misleading Interpretations. Originally, two types of graphs were extensively studied: regular ones, in which all nodes have the same degree (i.e., number of connections), and random graphs, whose connectivity is completely random and in which node degrees thus follow a Poisson distribution. One of the most important discoveries in complex network theory, and the one that distinguished it from mathematical graph theory, is the realisation that nodes in real-world networks are not homogeneous: on the contrary, they usually display richer connectivity patterns. Specifically, it has been found that many nodes only have a handful of connections, while a few of them (called hubs) may be connected with the majority of their peers. The result is a scale-free distribution of the degree of nodes, which can be approximated by a power law $P(k) \approx k^{-\gamma}$ [4]. Such heterogeneity in the nodes' importance is also present in transport networks. Nodes are not all the same, with some of them being much more important than others. On one hand, this may be due to the way the network is constructed, with some nodes designed to connect different parts of the system. But it can also be the result of economic reasons, as, e.g., in the case of airports serving big cities and thus collecting a larger demand, and of historical reasons, as in the case of ports or of specific maritime routes, which are important because of their past [5]. It is thus natural to pose the question of whether transportation networks are also scale-free.
An open topic of discussion within statistical physics is when we can confidently define a network to be scale-free (see, for instance, [6,7]) and how this can be translated to fields like, for instance, biology [8][9][10]. Historically, such analysis has been performed by plotting the degree distribution on a log-log scale and by verifying that such distribution approximately follows a straight line. This may nevertheless be misleading, as a log-log scale flattens most perturbations, such that many different distributions may appear to be power laws. A more statistically sound analysis instead requires two conditions: a network size large enough to span several orders of magnitude in the node degrees and the execution of a statistical test, as will be discussed in Section 2.2.
With respect to the size requirement, it is easy to see that most air transport networks do not fulfil it, as the number of airports in a country or even in a supranational region seldom reaches the thousands. In spite of this, scale-freeness has been claimed for the Italian [11] (42 airports), the Indian [12] (79 airports), the Brazilian [13] (120 airports), or the Chinese network [14] (128 airports). The situation is even worse in the case of road networks, in which the physical nature of the graph implies that the degree of each node is limited, as, for instance, it is difficult to plan a crossroad where more than six streets converge. In spite of this, [15] compares two fits for the degree distribution, according, respectively, to a power law and an exponential function, even though the maximum degree in the network is 6 and the minimum is 3.
In order to confirm the presence of a scale-free distribution of the degrees, the most common approach has been to resort to a graphical representation. Plenty of examples can be found in the literature, for maritime [16][17][18], road [19][20][21], and rail networks [8,[22][23][24][25][26][27][28][29]. Beyond such graphical fits, some interesting examples may also be highlighted. Specifically, [30], while analysing the evolution of global liner shipping networks between the years 1996 and 2006, reports an exponent $\gamma$ varying from −1.351 to −1.293 without describing how these values were obtained. Reference [31] concludes that the maritime network is scale-free without any calculation at all: "nearly 80% of nodes account for less than 20% of the respective accumulative values of the degree of the nodes, just like scale-free properties". In the analysis of urban street networks, [32] states that "the investigation of how well the fat-tailed distribution can fit power law in comparing with other distributions (e.g., log-normal and exponential) shows that no significant evidence is found for scale-free feature in the dual space"; nevertheless no statistical evidence of any kind is provided. Finally, [33] identifies several street networks as scale-free and reports a goodness-of-fit: yet there is no explanation of how this last metric is computed, making it impossible to reproduce these results. Not all research works suffer from this bias towards scale-freeness, and some noteworthy examples can be found. For instance, [34] correctly discards the scale-free structure in favour of an exponential distribution of degrees for the air transportation network. Reference [35], when analysing the temporal evolution of the Brazilian air network, states that "a reasonable fit is obtained by using a stretched exponential", although no statistical analysis is provided. Finally, [36] correctly recognises that, even though there is a "suggestive scaling behavior" in the distribution of node degrees in maritime networks, "simple models for generating scale-free statistics are not sufficient to describe these empirical networks"; similar careful observations have been made for travel demand networks at the urban scale [37][38][39][40] and location-based analysis of data from social media [41].
It is clear that the claim of the scale-free nature of many transportation networks has not been supported by suitable statistical tests. It is nevertheless undeniable that nodes are not homogeneous and that some of them attract most of the connections and traffic. Thus, even if these networks are not scale-free, they still present a scale-free-like structure and display a long-tailed degree distribution. How does this impact the operational analysis of the system? In other words, how do the conclusions of the previously mentioned papers have to be changed, if the networks are long-tailed instead of scale-free? In simple terms, no effect is to be expected.
In order to understand this point, one has to take into account the fact that scale-free networks are a mathematical simplification, or model, of real-world networks. Defining an exact law for the degree distribution allows finding analytical solutions to problems like the dynamics of diseases [42] or voters [43], through a heterogeneous mean field approximation. These problems can nevertheless still be analysed when networks are not exactly scale-free by means of numerical simulations. Furthermore, as node degrees are indeed heterogeneous and follow a long-tailed distribution, all subsequent conclusions will still hold, like the importance of the central airports for the delay propagation or for the robustness of the system.
In synthesis, assessing the scale-freeness of a transportation network requires a solid statistical validation. If such validation cannot be performed, for instance, because of the limited network size, it is better to avoid any mention of a scale-free topology, as this would largely be irrelevant. Put simply, and in spite of its lure, there is more to life than scale-freeness.

2.2. Recommended Solution.
As previously introduced, there are two problems preventing an easy assessment of the scale-freeness of real-world networks: their limited size and the fact that statistical validations of the fits are seldom performed.
As for the first issue, it has been found that even perfect fits cannot be accepted as statistically significant when the number of samples (in this case, of nodes) is below 100 [44]; and, as a rule of thumb, scale-freeness should be accepted only when the degrees span several orders of magnitude. Therefore, not even the best statistical analysis can support the scale-free hypothesis for the Italian air transport network, composed of 42 nodes [11], nor for a network whose maximum degree is 6 [15].
Regarding the second issue, i.e., the design of a statistical test, we here tackle it through three different techniques. To illustrate, these techniques are applied to the airport and bus networks described in the Appendix. Power law and exponential fits of the degree distribution of both networks are reported in Figure 1, while the values of the statistical tests are reported in Table 1.
First of all, one may be tempted to use the goodness-of-fit $R^2$, a metric which is conceptually simple, easy to compute, and well understood in the case of linear models. It is nevertheless known that the $R^2$ metric is unreliable for nonlinear models, as here it does not hold that the total sum-of-squares is equal to the regression sum-of-squares plus the residual sum-of-squares. Negative results may then appear, indicating that the nonlinear fit is worse than a simple average [45], as is the case in Table 1 for the bus network. Also, the linearisation of the model prior to its evaluation, for instance, by taking the logarithm of the node degree, is not a good solution: the resulting $R^2$ would represent the goodness of the linearised model and not of the original (nonlinear) one [46].
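For reference, the standard definition behind this argument can be written out as follows. The symbols $y_i$ (observed values), $\hat{y}_i$ (fitted values), and $\bar{y}$ (mean of the observations) are notation introduced here for illustration and do not appear in the original text.

```latex
R^2 = 1 - \frac{SS_{\mathrm{res}}}{SS_{\mathrm{tot}}},
\qquad
SS_{\mathrm{res}} = \sum_i \left( y_i - \hat{y}_i \right)^2,
\qquad
SS_{\mathrm{tot}} = \sum_i \left( y_i - \bar{y} \right)^2 .

% Linear least squares with an intercept:
%   SS_tot = SS_reg + SS_res,  with  SS_reg = \sum_i (\hat{y}_i - \bar{y})^2,
%   hence 0 <= R^2 <= 1.
% Nonlinear fit: the decomposition need not hold, so SS_res > SS_tot is possible
% and then R^2 < 0, i.e., the fit is worse than predicting the mean \bar{y}.
```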
A second option entails resorting to the Akaike Information Criterion (AIC), a metric estimating the relative quality of a statistical model given some empirical data [47]. The AIC is based on calculating the Kullback-Leibler Divergence (KLD) between the values yielded by the model and the real available data, and then adjusting the value to compensate for the number of free parameters, in order to avoid overfitting. It is important to note that, while effective, the AIC returns a relative value, i.e., a value that can be used to compare different models (and decide which one is preferable) but not to assess the quality of a single model. Thus AIC can be used to choose between different types of nonlinear fits, but not to assess the statistical significance of one of them.
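As an illustration of how such a comparison might look in practice, the sketch below computes the AIC of a power-law and of an exponential fit and reports which one is preferable. It is a minimal example under stated assumptions, not the procedure used to produce Table 1: degrees are treated as continuous values above a lower bound `k_min`, both models are fitted by maximum likelihood, and all function names are hypothetical.

```python
import math
import random


def aic_power_law(degrees, k_min):
    """AIC of a continuous power law p(k) ~ k^(-gamma), k >= k_min (assumed model)."""
    xs = [k for k in degrees if k >= k_min]
    n = len(xs)
    log_sum = sum(math.log(k / k_min) for k in xs)  # must be > 0: some k > k_min
    gamma = 1.0 + n / log_sum                       # maximum-likelihood exponent
    log_lik = n * math.log((gamma - 1.0) / k_min) - gamma * log_sum
    return 2 * 1 - 2 * log_lik                      # one free parameter (gamma)


def aic_exponential(degrees, k_min):
    """AIC of a shifted exponential p(k) ~ exp(-lam * (k - k_min)), k >= k_min."""
    xs = [k for k in degrees if k >= k_min]
    n = len(xs)
    excess = sum(k - k_min for k in xs)             # must be > 0: some k > k_min
    lam = n / excess                                # maximum-likelihood rate
    log_lik = n * math.log(lam) - lam * excess
    return 2 * 1 - 2 * log_lik                      # one free parameter (lam)


if __name__ == "__main__":
    rng = random.Random(1)
    # Synthetic "degrees" drawn from an exponential law, for illustration only.
    sample = [1.0 + rng.expovariate(0.5) for _ in range(2000)]
    a_pl, a_exp = aic_power_law(sample, 1.0), aic_exponential(sample, 1.0)
    print("preferred model:", "exponential" if a_exp < a_pl else "power law")
```

The lower AIC only says which of the two candidates is relatively better; as stressed above, it gives no indication of whether either of them is an acceptable description of the data.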
The third and best solution requires performing a full statistical test on the model, in order to obtain a $p$ value, which is then used to accept or reject the fit. Let us suppose that a model has already been fitted, such that a function $P(k)$ (yielding the probability of finding a node of degree $k$) is available. As a first step, one needs to define a distance between the fitted model and the real data, i.e., how dissimilar they are; this can be easily done through a Kolmogorov-Smirnov (KS) statistic. Afterwards, it is necessary to generate a large number of synthetic datasets using the fitted model $P(k)$ and, for each of them, calculate the corresponding KS statistic. Finally, one should count the fraction of the time the synthetic statistic is larger than the value obtained for the real dataset: such fraction will be the final $p$ value. As can be seen from Table 1, no considered fit, be it power law or exponential, passes this statistical test, with the obtained $p$ values being in all cases very close to 1.0.

Figure 1: Example of seemingly scale-free networks. The left and right panels, respectively, depict the cumulative degree distribution of the airport and the bus networks, as described in the Appendix. Red and green dashed lines further depict the best power law and exponential fits of the two distributions.
One final note should be added. In the previous analysis, the $p$ value has been obtained supposing that the model $P(k)$ describes the full distribution of possible degrees. This is nevertheless not always the case, as, for instance, the scale-freeness can be detected within a specific range of degrees, or the best model can be a truncated power law. The creation of the synthetic datasets should then be adapted to take this into account. For instance, let us consider the case of a truncated power law, in which the scale-free nature is observed only above $k_{\min}$. The synthetic datasets should account for this fact: below $k_{\min}$ they should mimic the real dataset, while above it they can be created using the fit $P(k)$.
The interested reader may find an excellent review of this third solution, along with some practical examples, in [44].
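The three steps of this test (distance to the fitted model, synthetic datasets, fraction of larger distances) can be sketched as follows. This is a simplified illustration, not the code behind Table 1: it assumes a continuous power law above a known `k_min`, refits the exponent on each synthetic dataset before measuring its KS distance (a common variant of the procedure), and uses hypothetical function names throughout.

```python
import math
import random


def fit_gamma(xs, k_min):
    """Maximum-likelihood exponent of a continuous power law on xs >= k_min."""
    return 1.0 + len(xs) / sum(math.log(x / k_min) for x in xs)


def ks_distance(xs, gamma, k_min):
    """Largest gap between the empirical CDF of xs and the power-law CDF."""
    xs = sorted(xs)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        cdf = 1.0 - (x / k_min) ** (1.0 - gamma)
        d = max(d, abs(cdf - i / n), abs(cdf - (i + 1) / n))
    return d


def sample_power_law(n, gamma, k_min, rng):
    """Inverse-transform sampling from the fitted power law."""
    return [k_min * (1.0 - rng.random()) ** (-1.0 / (gamma - 1.0)) for _ in range(n)]


def ks_p_value(xs, k_min, n_synthetic=200, seed=0):
    """Fraction of synthetic datasets whose KS distance exceeds the real one."""
    rng = random.Random(seed)
    gamma = fit_gamma(xs, k_min)
    d_real = ks_distance(xs, gamma, k_min)
    larger = 0
    for _ in range(n_synthetic):
        synth = sample_power_law(len(xs), gamma, k_min, rng)
        if ks_distance(synth, fit_gamma(synth, k_min), k_min) > d_real:
            larger += 1
    return larger / n_synthetic


if __name__ == "__main__":
    rng = random.Random(42)
    # Exponentially distributed values: a power law should be a poor description.
    data = [1.0 + rng.expovariate(0.5) for _ in range(500)]
    print("p value of the power-law fit:", ks_p_value(data, k_min=1.0))
```

For real degree sequences, which are discrete, the sampling and CDF functions would need to be replaced by their discrete counterparts, and the restriction to degrees above $k_{\min}$ handled as described in the final note above.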

3.1. Common Pitfalls and Misleading Interpretations.
Once a network is obtained, the next logical step is to calculate a set of topological metrics to assess specific aspects of the structure, including the presence of triangles (i.e., transitivity or clustering coefficient), connectivity, and so forth. It is nevertheless important to understand how these values are affected by the network size, especially when one needs to compare multiple systems.
Let us explore this issue through a simple example. An important metric for a transportation system is the efficiency, defined as the inverse of the harmonic mean of the geodesic distance between nodes [48]: $E = \frac{1}{N(N-1)} \sum_{i \neq j} \frac{1}{d_{ij}}$, $d_{ij}$ being the distance between nodes $i$ and $j$ and $N$ the total number of nodes in the network. The efficiency measures how fast information (or any other element) can be transmitted in a network; thus, a value close to one indicates that most passengers can move between two nodes by means of direct connections.
It is straightforward to see how this metric is influenced by the number of links present in the network. Increasing the number of flights in an air transport network would also increase the number of passengers able to reach their destination directly. In the limit of all airports being connected with all other ones, $E$ will become one, indicating a perfect transport efficiency. It is important to note that a given value of $E$ is the result of the interaction between two aspects: the internal structure of the network and its link density. Therefore, a high value of $E$ does not imply an efficient network design.
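The dependence on link density can be checked directly with a few lines of code. The sketch below computes the efficiency of small unweighted, undirected toy graphs stored as adjacency dictionaries; the representation and the function names are choices made for this example, and unreachable pairs are assumed to contribute zero.

```python
from collections import deque


def bfs_distances(adj, source):
    """Hop distances from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def efficiency(adj):
    """E = 1/(N(N-1)) * sum over ordered pairs i != j of 1/d_ij."""
    n = len(adj)
    if n < 2:
        return 0.0
    total = 0.0
    for i in adj:
        for j, d in bfs_distances(adj, i).items():
            if j != i:
                total += 1.0 / d  # unreachable pairs never appear: they add 0
    return total / (n * (n - 1))


if __name__ == "__main__":
    star = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}}        # 3 links
    complete = {i: {j for j in range(4) if j != i} for i in range(4)}  # 6 links
    print(efficiency(star))      # 0.75
    print(efficiency(complete))  # 1.0
```

Both graphs have four nodes; the only reason the second one reaches $E = 1$ is that it has twice as many links, which is exactly why raw values cannot be compared across networks of different density.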
While most topological metrics suffer from this dependency on the number of links composing the network, some of them are also defined as a function of the number of nodes. This is the case, for instance, of the diameter, defined as the shortest distance between the two most distant nodes in the network, and of the average path length [49]. Clearly, larger networks will, in principle, have larger diameters and path lengths than smaller ones. How does this map to the problem of analysing a transportation network? First of all, conclusions cannot be drawn from the values of the topological metrics unless these are properly normalised, i.e., transformed to account for the number of nodes and links in the network. Secondly, comparisons can only be made on normalised values. To illustrate this point, we once again rely on three of the networks described in the Appendix, specifically, the light rail, subway, and tram networks. Note that these have been chosen because of their comparable characteristics and sizes. Table 2 reports the values of several topological characteristics, both before and after a normalisation using random equivalent networks (i.e., composed of the same number of nodes and links, as further described in Section 3.2).
Several interesting observations can be made. First of all, the efficiency seems to be substantially higher (to be precise, 48% higher) in the light rail (0.0979) than in the tram network (0.0659). This nevertheless does not account for the higher link density of the former: once normalised, the latter network appears more efficient than the former (−25.46 versus −28.20). Note that the highly negative values of the normalised metric indicate that these networks have not been optimised for direct connections, a message that is difficult to extract from the raw values. The opposite situation can be found in the modularity, i.e., a metric characterising the presence of communities: while the tram network seems to be more modular than the light rail, the situation is reversed once the values are properly normalised.
In synthesis, the values of topological metrics are seldom relevant per se; instead, they need to be normalised, both to simplify their interpretation and to enable comparisons between different networks. It is worth noting that many works published in the transportation context have omitted this step and have thus made significant errors of interpretation.
For instance, [34] states that "the average path length of 2.23 in the [air transportation network of China] is very similar to that of India's air transport system (2.26) and slightly above that of Italy's (1.98-2.14), but larger than that of the US (ranging from 1.84 to 1.93)". Yet, these four networks are completely heterogeneous, in terms of both number of nodes (from 50 for Italy to 272 for the US) and link densities (from 0.124 for Italy to 0.0729 for India). Obtaining similar average path lengths for China and India, when the latter has almost double the link density, actually indicates that their structure is substantially different. Other nonnormalised comparisons have been reported in [50][51][52]. A synthesis of this problem in the case of air transport can be found in Table 1 of [53]: among the 14 surveyed papers, only six normalised the average path length, and nine the clustering coefficient. It is also noteworthy that two works normalised the first metric, but not the second, even though the problem described here applies to both of them [54,55].
Moving to street networks, [33] compares two topological metrics (clustering coefficient and average path length) for six cities and three other networks, in spite of their very heterogeneous link densities (from $2.47 \cdot 10^{-4}$ to $4.22 \cdot 10^{-3}$) and even in spite of their being conceptually different networks (representing streets, proteins, or the Internet). A similar problem can be found in [56].
Reference [36] compares the global cargo-ship network to the worldwide air transportation network, by considering the unnormalised version of metrics like the diameter or the clustering coefficient. The similar values obtained in the last case (0.57 versus 0.55) lead the authors to highlight "a surprising degree of similarity of both networks", in spite of the latter having a link density one order of magnitude higher than the former. Subsequent works based on similar data, as, for instance, [57,58], did not solve the problem.

3.2. Recommended Solution.
To a first approximation, normalising a topological metric is not a complex task. In synthesis, one needs to generate a large set of networks (called the null model) that lack the topological structure to be tested, and then see how the real network deviates from this set.
Let us suppose that we calculated a topological metric $m$ over a network $G$, composed of $N$ nodes and $L$ links, obtaining the value $m(G)$. A simple normalisation can be obtained through a Z-Score, defined as $Z = \frac{m(G) - \langle m(G_r) \rangle}{\sigma(m(G_r))}$, where $m(G_r)$ represents the values yielded by the metric $m$ on a large set of null model networks $G_r$, and $\langle \cdot \rangle$ and $\sigma(\cdot)$ are the average and standard deviation operators.
How should this null model then be defined? As the standard objective is to compare the real network against something that has no clear structure, the simplest solution entails using random Erdős-Rényi networks, with the same number of nodes and links. This may nevertheless yield biased results. To illustrate, suppose one is studying a street network, which is by definition planar; in other words, when two streets intersect, a link between them is necessarily created. Additionally, let us suppose that streets are built at random. Would this result in a lack of structure? Surprisingly, no: triangles would be very common, as any triplet of long streets, not parallel to one another, would sooner or later intersect and form a triangle. If random networks are then used to normalise the transitivity metric, the result would be a very high Z-Score. Additionally, let us consider airport networks. While they lack the planar property, their construction is still guided by some principles that should be included in the null model: for instance, the fact that airports closer than 300 km are seldom connected by a direct flight. Once again, the use of a set of completely random networks may yield biased results.
In spite of the clear shortcomings associated with the use of an Erdős-Rényi model, no accepted alternative is available for transportation systems, and the topic is still a matter of debate in other scientific disciplines [59,60].
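The Z-Score normalisation with the simplest null model can be sketched as follows. The example uses transitivity as the metric and random networks with the same number of nodes and links as the null model, i.e., precisely the choice whose limitations are discussed above; the graph representation, the function names, and the ring-lattice test graph are assumptions made for illustration.

```python
import itertools
import random
import statistics


def transitivity(adj):
    """Fraction of connected triples that are closed (3 * triangles / triples)."""
    closed, triples = 0, 0
    for nbrs in adj.values():
        k = len(nbrs)
        triples += k * (k - 1) // 2
        for v, w in itertools.combinations(nbrs, 2):
            if w in adj[v]:
                closed += 1
    return closed / triples if triples else 0.0


def random_graph(n, n_links, rng):
    """Random graph with exactly n nodes and n_links links, chosen uniformly."""
    adj = {i: set() for i in range(n)}
    all_pairs = list(itertools.combinations(range(n), 2))
    for u, v in rng.sample(all_pairs, n_links):
        adj[u].add(v)
        adj[v].add(u)
    return adj


def z_score(adj, metric, n_null=200, seed=0):
    """Z = (m(G) - <m(G_r)>) / sigma(m(G_r)) over n_null random equivalents."""
    rng = random.Random(seed)
    n = len(adj)
    n_links = sum(len(nbrs) for nbrs in adj.values()) // 2
    null = [metric(random_graph(n, n_links, rng)) for _ in range(n_null)]
    # Note: fails with a division by zero if all null values coincide.
    return (metric(adj) - statistics.mean(null)) / statistics.stdev(null)


if __name__ == "__main__":
    n = 30
    # Ring lattice: every node linked to its two nearest neighbours on each side.
    ring = {i: {(i + s) % n for s in (-2, -1, 1, 2)} for i in range(n)}
    print("raw transitivity:", transitivity(ring))   # 0.5
    print("Z-Score:", z_score(ring, transitivity))
```

Replacing `random_graph` with a generator that respects planarity or a minimum flight distance would be the natural place to encode the domain-specific constraints mentioned above.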

Identifying Node Importance by Arbitrarily Chosen Network Metrics

4.1. Common Pitfalls and Misleading Interpretations.
Since the release of the ground-breaking studies on complex networks and their properties, it has often been found that the failure of a small fraction of elements in these networks might lead to a cascade effect which, when related to critical infrastructure, would result in major disruptions in our society. A few examples of such extensive, wide-ranging network failures include large-scale power outages in the United States [61], cross-continental supply-chain shortages in the aftermath of the 2011 tsunami in Japan [62], or the disruption of the European airspace after computer failures at Eurocontrol in April 2018. In all these events, the affected regions suffered extremely high economic costs [50,63,64]. Moreover, as infrastructure systems are becoming densely connected and dependent, the potential impact of failures is increasing to an unprecedented level. Therefore, analysing the robustness of networks and their interactions is of tremendous importance. The robustness of a network is usually estimated based on the critical fraction of all nodes that, once removed, will cause a sudden disintegration [65]. From the perspective of statistical physics, this process is rather well-investigated for random network models [66][67][68][69][70]. Yet, when it comes to the analysis of real-world network instances, it becomes more complicated. The major reason is that, for real-world networks, not all nodes and links perfectly fit into a predefined network model. Hence, when estimating the node importance for robustness, the statistical measures can go wrong. Over the years, multiple methods have been proposed for measuring the disintegration of a network over time. Perhaps the most frequently used method is to measure the relative reduction in the size of the giant component (or largest connected component) of the network, the rationale being that the functionality of a network strongly correlates with the number of connected nodes. We highlight an example in Figure 2.
Since one is often interested in a single quantification measure for the robustness of a network, most related works use the robustness measure $R$ [71]. Given a network composed of $N$ nodes, the value of $R$ is defined as $R = \frac{1}{N} \sum_{Q=1}^{N} s(Q)$, where $s(Q)$ is the size of the giant component after removing $Q$ nodes. Essentially, this procedure assesses how many nodes are contained in the giant component once a node is deleted from the network, while iterating over all nodes in the network.
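A minimal sketch of this computation is given below. It takes the removal sequence as an explicit input, and it assumes that the giant component is measured as a fraction of the $N$ original nodes, so that the resulting value is bounded; this normalisation, the adjacency-dictionary representation, and the function names are assumptions of the example.

```python
def giant_component_size(adj, removed):
    """Number of nodes in the largest connected component, ignoring `removed`."""
    seen = set(removed)
    best = 0
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        stack, size = [start], 0
        while stack:                      # depth-first traversal of one component
            u = stack.pop()
            size += 1
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
        best = max(best, size)
    return best


def robustness(adj, order):
    """R = (1/N) * sum_{Q=1..N} s(Q), with s(Q) the relative giant component size
    after removing the first Q nodes of `order` (assumed normalisation)."""
    n = len(adj)
    removed = set()
    total = 0.0
    for node in order:
        removed.add(node)
        total += giant_component_size(adj, removed) / n
    return total / n


if __name__ == "__main__":
    # Toy star network: hub 0 connected to four leaves.
    star = {0: {1, 2, 3, 4}, 1: {0}, 2: {0}, 3: {0}, 4: {0}}
    print(robustness(star, [0, 1, 2, 3, 4]))  # hub removed first
    print(robustness(star, [1, 2, 3, 4, 0]))  # hub removed last
```

The two calls operate on the same network yet return different values (about 0.16 and 0.4), which already shows that the result depends on the removal sequence as much as on the network.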
While trying to quantify the robustness of a network, it is critical to understand that there exists no single robustness value $R$. For the computation of $R$, we need a node ranking as input, defining the sequence in which nodes are removed from the network. Different sequences, in general, induce different network disintegration patterns. Thus, an inappropriate choice of node sequence leads to unfounded conclusions regarding the actual robustness of a network. The design of such an order is far from trivial, given the large number of possible node orderings in real-world networks, i.e., a network with $N$ nodes has $N!$ different node orderings. Therefore, existing studies often choose node sequences based on heuristics. Perhaps best known outside the core complex network research area are the so-called network metrics, which assign scores to nodes, depending on properties derived at the micro/meso/macroscale. All nodes are ranked in order of the metric values (usually in decreasing order of importance). We discuss a few of these network metrics.
DEG attacks the nodes in order of their decreasing degree, i.e., the number of direct neighbors. The degree is only recorded once at the beginning and not updated during the disruption process.
BETW (betweenness centrality [72]) measures the number of times a node appears on the shortest path between all pairs of nodes in the network. Nodes are removed with decreasing centrality scores.
CLOS (closeness centrality) measures the average shortest path distance of a node to all other nodes in the network. Nodes are removed with increasing centrality scores, given that a smaller closeness value indicates a closer relationship to all nodes in the network.
EIG (eigenvector centrality) measures the centrality of a node based on the centrality of its neighbors (see [73] for discussion on the concept). Nodes are removed with decreasing centrality scores.
PR (Pagerank [74]) was originally designed as an algorithm to rank websites based on the link structure. In our experiments we use a variant on undirected networks. Nodes are removed with increasing centrality scores.
KATZ (Katz centrality [75]) measures the centrality of a node based on the relative influence of nodes regarding direct neighbors and also all other nodes in the network that connect to the node through these direct neighbors. Nodes are removed with increasing centrality scores.
These metrics (and similar ones) have been used in many existing studies in order to analyse the robustness of transportation networks. In Figure 3, we show an example of a DEG-based attack on a network. Some of the studies [76,77] only evaluate the DEG of nodes for designing a targeted attack, e.g., stating that "for the selective attack strategy, we remove some nodes with higher degrees according to their degree order from high to low" [77]. Others compare a small set of static network metrics, without giving sufficient consideration to interactive/iterative/dynamic metrics, which yield much stronger attacks [26,[78][79][80][81][82] (see Section 4.2 for further discussion). In a few studies, the authors do not reveal which kind of targeted attack they use: "We have simulated an attack on every network in our database by blocking travel through targeted stations" [83]. There are only a few notable exceptions, which correctly use interactive betweenness as a reference for network disruption simulation, e.g., [84][85][86][87]. Papers published in transportation journals rarely consider the advanced network dismantling methods that have emerged over the last 2-3 years. It is interesting, however, that the papers introducing novel network dismantling methods, which tend to appear in the complex network community, often take the worldwide airport network as a real-world case study [88,89].

4.2. Recommended Solution.
The previous metrics are based on an initial estimation of node importance in the original network. Yet, throughout the dismantling process, the roles of nodes in a network can change significantly. With the elimination of a (critical) node from the network, shortest paths between other nodes often change completely. Therefore, it is recommended to recompute a network metric throughout the dismantling process. In the literature, this process is referred to as interactive/dynamic attack generation. In Figure 4, we visualise the process of attacking the tram network based on interactive BETW (BETWI); i.e., the values of BETW are recomputed after each node removal. BETWI always attacks the largest remaining giant component (GC) and also chooses very vulnerable nodes in each step, making the attack rather disruptive to the network. Beyond recomputed metrics, the complex network community has recently started to address the dismantling problem more rigorously, by designing specific dismantling methods. We introduce a few of the relevant methods below.
CI (collective influence [90]) can be seen as an extension of the degree-based attack, taking into account the so-called ball, i.e., the neighbors which are k steps away. Originally designed for efficiently attacking hierarchical networks, CI has now been used in several research studies on general graphs.
KSHELL (K-shell iteration factor [91]) is based on the coreness of nodes in a network [92]. A large value indicates that the node has a strong ability to spread information. The algorithm combines shell decomposition and iterative node removal.
CHD (CoreHD attacks [93]) combines interactive degree and k-core [92] to achieve a decycling of networks. It iteratively removes the highest-degree node within the 2-core of the network until no 2-core remains, and then treats the remaining part through tree-breaking.
APTA [88] finds articulation points (or cut vertices) in a network. In each step, the articulation points with the highest estimated impact are attacked first, based on an estimation of the largest size of the giant component after an attack. This process is repeated until the whole network is dismantled. During that process, if a network has no articulation point, the node with the highest degree is removed.
GND (generalised network dismantling [89]) computes a node sequence based on spectral properties of a novel node-weighted Laplacian operator. It also supports non-unit costs for node weights.
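The interactive recomputation described above can be sketched in a few lines of standard-library Python. This is only an illustrative sketch, not the implementation used for the figures: the function names, the toy graph, and the tie-breaking rule are assumptions, and betweenness is computed with a plain Brandes-style accumulation for unweighted, undirected graphs.

```python
from collections import deque

def betweenness(adj):
    """Unnormalised betweenness (Brandes-style) for an unweighted graph."""
    score = {v: 0.0 for v in adj}
    for s in adj:
        order, preds = [], {v: [] for v in adj}
        sigma = {v: 0 for v in adj}
        sigma[s] = 1
        dist = {s: 0}
        queue = deque([s])
        while queue:                      # BFS counting shortest paths
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):         # back-propagate dependencies
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    return score

def largest_component(adj):
    """Node set of the largest connected component (empty set if no nodes)."""
    seen, best = set(), set()
    for s in adj:
        if s in seen:
            continue
        comp, stack = {s}, [s]
        while stack:
            v = stack.pop()
            for w in adj[v]:
                if w not in comp:
                    comp.add(w)
                    stack.append(w)
        seen |= comp
        if len(comp) > len(best):
            best = comp
    return best

def interactive_betweenness_attack(adj):
    """Remove nodes one by one, recomputing betweenness after each removal."""
    adj = {v: set(ns) for v, ns in adj.items()}   # work on a copy
    order, gc_sizes = [], []
    while adj:
        gc = largest_component(adj)
        scores = betweenness(adj)
        # Assumption: pick the top-scoring node inside the current GC;
        # ties are broken by node label to keep the run deterministic.
        target = max(gc, key=lambda v: (scores[v], str(v)))
        for w in adj.pop(target):
            adj[w].discard(target)
        order.append(target)
        gc_sizes.append(len(largest_component(adj)))
    return order, gc_sizes

# Hypothetical toy graph: two triangles joined through bridge node 3.
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (4, 5), (4, 6), (5, 6)]
toy = {}
for u, v in edges:
    toy.setdefault(u, set()).add(v)
    toy.setdefault(v, set()).add(u)
order, gc_sizes = interactive_betweenness_attack(toy)
```

On this toy graph the bridge node is removed first, which splits the network into its two triangles. Recomputing all scores after every removal is what makes the approach expensive on larger networks.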
In Figure 5, we compare the robustness curves of the introduced network metrics and dismantling methods, grouped by dataset. The curves deviate considerably from one another, particularly for the tram and the bus network. In all cases, BETWI identifies the best attack, while EIG is usually the worst strategy. In order to further compare the quality and applicability of methods, Figure 6 reports the obtained R values and run times for all network metrics and dismantling methods in this study, grouped by dataset. We find that BETWI is always the method with the smallest R value but also takes the longest time to compute (note that the y-axis is log-scaled). Interestingly, APTA is often 2-3 orders of magnitude faster than BETWI but still identifies attacks with quite good R values. GND is often much slower than APTA but yields a smaller R value, with the apparent exception of the logistics network. This example highlights that, apart from BETWI, no single method is consistently the best. Therefore, it should be understood that designing an effective attack for a network is a trade-off between expected quality and computation time. If the network is small, BETWI is still the best one can get. With an increasing size of the network, one should preferably select specific dismantling methods, such as APTA and GND.
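To make the comparison of R values more concrete, the following sketch shows how an R value could be computed from a given attack order. It assumes the common definition of R as the relative GC size averaged over all node removals; the function names and the toy star graph are hypothetical and only serve as an illustration.

```python
def gc_size(adj, removed):
    """Size of the largest connected component once `removed` nodes are gone."""
    seen, best = set(removed), 0
    for s in adj:
        if s in seen:
            continue
        seen.add(s)
        stack, size = [s], 0
        while stack:
            v = stack.pop()
            size += 1
            for w in adj[v]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        best = max(best, size)
    return best

def r_value(adj, order):
    """Assumed definition: mean relative GC size over the removal sequence."""
    n = len(adj)
    removed, total = set(), 0
    for v in order:
        removed.add(v)
        total += gc_size(adj, removed)
    return total / (n * n)

# Hypothetical star network: hub 0 connected to four leaves.
star = {0: {1, 2, 3, 4}, 1: {0}, 2: {0}, 3: {0}, 4: {0}}
r_hub_first = r_value(star, [0, 1, 2, 3, 4])     # hub attacked first
r_leaves_first = r_value(star, [1, 2, 3, 4, 0])  # hub attacked last
```

Under this definition, attacking the hub first gives a much smaller R than attacking it last, which reflects the convention used in the text that a smaller R corresponds to a more damaging attack.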

Networks Are Robust against Random Failures but Vulnerable to Targeted Attacks

Common Pitfalls and Misleading Interpretations.
Since the choice of a node sequence significantly affects the level of disruption to a network, it is common to distinguish two classes of disruptions: random failures and targeted attacks. While the former have no driving force controlling the node sequence (which is thus completely random), the latter are specifically tuned to create the maximum damage to a network. Existing studies often conclude with statements that the network is rather resilient to random failures, but more vulnerable to targeted attacks. These claims can be found for all kinds of transportation networks, including air transportation [94–96], railway-based systems [26,79,83,85,97], and others [98–100]. We only highlight two representative statements here; others follow very similar structures: "This scale-free structure has proved to be robust to random failure but vulnerable to targeted attack" [94]. "This study indicates that the subway network is robust against random attacks but fragile for malicious attacks" [79].
The general conclusion that random failures are less hazardous than targeted attacks is inherent to the definition of both node orderings, given that targeted attacks are specifically designed for the network at hand. Conversely, if a targeted attack, for instance one induced by a specific network metric, is less damaging than a random failure strategy, this simply means that the metric does not represent node importance very well for that specific network.

Recommended Solution.
The mere statement that a network is more vulnerable to targeted attacks does not provide real, novel insights. A more interesting question is how much more vulnerable a network is to a targeted attack than to a set of random failures. One way to measure this difference in vulnerability is to consider representative attacks obtained from an envelope of random attacks [101]. Essentially, the idea is not to restate the (rather obvious) fact but to quantify the difference in attack efficiency. In general, one can start with the R value $R_{\mathrm{target}}$ of the best targeted attack and compare it to the R value $R_{\mathrm{random}}$ of the representative random attack. The larger $R_{\mathrm{random}}$ is compared to $R_{\mathrm{target}}$, the stronger the effect of using targeted attacks. Formally, we can introduce a measure defined as $Q = R_{\mathrm{target}} / R_{\mathrm{random}}$, so that smaller values of $Q$ indicate a stronger effect of targeted attacks. Moreover, it can be insightful to take the width of a random attack envelope into account, since random attacks on their own can still have a rather large variation in their induced R values.
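As a small worked illustration of this quotient, the sketch below divides the R value of a targeted attack by the R value of a representative random attack. Two points are assumptions: the direction of the quotient (targeted over random, as the label TargetRandomQuotient in Figure 8 suggests) and the use of the median random R value as the representative one. The numbers are invented purely for illustration.

```python
from statistics import median

def target_random_quotient(r_target, random_r_values):
    """Return (Q, envelope width) for one targeted and several random attacks.

    Assumption: the representative random attack is the one with the
    median R value; the width is the spread of the random R values.
    """
    r_random = median(random_r_values)
    q = r_target / r_random
    width = max(random_r_values) - min(random_r_values)
    return q, width

# Hypothetical R values, not taken from the study.
q, width = target_random_quotient(0.05, [0.30, 0.25, 0.35, 0.28, 0.32])
```

With these made-up values the targeted attack reaches one sixth of the representative random R value, while the random attacks themselves spread over a width of about 0.10, which is why the envelope width is worth reporting alongside the quotient.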
In Figure 7, we visualise a set of random attacks for the six transportation networks in our study, as described in the Appendix. For each network, we generated 50 random attacks. Given these random attacks and their robustness curves, we compute the robustness envelope as follows: a minimum, maximum, and median curve are derived by computing the corresponding aggregation function over all GC sizes at a given fraction of disrupted nodes. In addition, we plot the robustness curve obtained by BETWI, the best known attacking strategy. In Figure 8, we show the results of comparing Q of real-world networks and their equivalent ER random networks with the same number of nodes and links. The obtained value of Q can be found to the left of the random network peak, which can be explained by the fact that random network instances have no (topological) properties which can be exploited by targeted attacks and, thus, targeted attacks in the real network are usually stronger. Yet, the distance between real-world network values and random network values varies significantly between the types. The logistics network, for instance, is much more vulnerable to targeted attacks than its random counterpart. Intuitively, the presence of a few hubs makes the network much more vulnerable to hub-targeted attacks, compared to random network instances. For the subway network, on the other hand, targeted attacks are almost as strong as in the random counterparts. This means that the subway network does not have an inherent structural property that can be further exploited by targeted attacks.
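The envelope construction described above can be sketched as follows. This is a minimal illustration under stated assumptions: the function names, the fixed random seed, and the toy path graph are hypothetical, and GC sizes are reported as absolute node counts rather than fractions.

```python
import random
from statistics import median

def gc_size(adj, removed):
    """Size of the largest connected component once `removed` nodes are gone."""
    seen, best = set(removed), 0
    for s in adj:
        if s in seen:
            continue
        seen.add(s)
        stack, size = [s], 0
        while stack:
            v = stack.pop()
            size += 1
            for w in adj[v]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        best = max(best, size)
    return best

def gc_curve(adj, order):
    """GC size after each successive removal in `order`."""
    removed, curve = set(), []
    for v in order:
        removed.add(v)
        curve.append(gc_size(adj, removed))
    return curve

def random_envelope(adj, n_attacks=50, seed=0):
    """Pointwise minimum, median, and maximum over random attack curves."""
    rng = random.Random(seed)
    nodes = list(adj)
    curves = []
    for _ in range(n_attacks):
        order = nodes[:]
        rng.shuffle(order)            # one random failure sequence
        curves.append(gc_curve(adj, order))
    steps = list(zip(*curves))        # all curves at the same removal step
    return ([min(s) for s in steps],
            [median(s) for s in steps],
            [max(s) for s in steps])

# Hypothetical toy network: a path of five stations.
path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
lower, mid, upper = random_envelope(path)
```

The three returned lists correspond to the lower boundary, the median curve, and the upper boundary of the envelope; the curve of a targeted attack can then be plotted against them.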

Discussion and Conclusions
In this work we have revisited some common problems found in papers that apply complex network theory to the study of the topology of transportation systems, analysed their impact in terms of how they can mislead our understanding of the underlying system, and presented a set of solutions. Four specific topics have been covered:
(1) One of the most important topological properties of networks is scale-freeness, i.e., the fact that the degree distribution of nodes follows a power law. This theoretical model has been the foundation of many studies in complex network theory, and there has been a lot of interest in assessing whether real-world networks, including transportation ones, actually follow it. Yet, assessing scale-freeness is not a trivial task, as it requires both large enough networks and the application of suitable statistical tests. We presented a review of some common errors and some potential solutions, including an analysis of which statistical tests are actually tailored to this problem.
(2) Beyond scale-freeness, the first step in the analysis of a complex network is usually its description through a series of topological metrics, i.e., metrics assessing some aspects of its structure. An important problem stems from the fact that such topological metrics are usually influenced by the number of nodes and links in the network; if these are not taken into account, comparing different networks may yield unreliable results. We presented some examples of misinterpretations of network metrics and suggested a simple solution based on the creation of null models.
(3) We would like to increase awareness of the fact that network metrics do not lead to optimal attacks. In fact, there is no single metric which always outranks [...] distribution of degrees. At the same time, we reported that theoretical dismantling strategies, developed on the scale-free model, may not work efficiently on real networks. One may thus ask what the effect is of not following a perfect scale-free distribution, or, in other words, what the consequences are of having real, as opposed to theoretical, networks.
(2) Metrics normalisation requires the development of suitable null models, able to create networks without any specific structure, but still constrained by the characteristics of the system under study. A completely random network may not be a good null model for the airport network, as very short flights make no economic sense. This has been partially solved in other scientific fields, for instance, for protein networks [103], and should probably also be tackled for transportation systems.
(3) Most transportation studies on complex network robustness are performed on undirected, unweighted networks with unit costs for dismantling nodes/links. Clearly, all these assumptions are simplifications made in order to keep computation feasible and to cope with the limited amount of available data. We foresee the need for a generalised transportation network robustness framework which, given a variable set of data (passenger data, schedules, etc.), computes a realistic measure of robustness for a transportation system. While there exist a number of studies tailored specifically to regional transportation systems at a high level of detail, there is no agreement on a common model for transportation network robustness. Such a benchmark model would help to push our understanding of network robustness further and eventually improve our critical transportation infrastructure.
As a final note, we would like to highlight that the same caution that one should devote to the previously discussed pitfalls should also be applied to avoid misleading generalisations. Any network method applied to a transportation problem depends heavily on the available data and the problem at hand. Just as one should carefully investigate the applicability of previously published methods, rather than simply borrowing them from other disciplines, the solutions proposed here should similarly be judged according to the context. To illustrate, some theoretical models may require an exact scale-free distribution to yield meaningful results, and the characteristics of a null model should be consistent with (and adapted to) the system under analysis. In synthesis, it is important to keep in mind that "one size does not fit all".
data represent the future planning of the service and are therefore valid from February 2018 to the end of 2018. The file provides information on stop locations, connections, and schedules for all transportation modes in Berlin. We have extracted all bus routes (GTFS code route type 700) and created a complex network with stations being nodes and two nodes being connected if there is at least one bus service between them. The obtained network consists of 12,272 nodes and E=19,584 links; we have converted it to an undirected network for the analysis in our study. The network is visualised in Figure 9(b).
Light rail network: the third case study represents the light rail network in the greater area of Berlin. The data were downloaded from the official website of operator VBB (see above). We have extracted all light rail routes (GTFS code route type 109) and created a complex network with stations being nodes and two nodes being connected if there is at least one light rail service between them. The obtained network consists of 166 nodes and 184 links; we have converted it to an undirected network for the analysis in our study. The network is visualised in Figure 9(c).
Logistics network: the Australia Post problem (http://people.brunel.ac.uk/~mastjjb/jeb/orlib/files/phub1.txt) is a standard dataset for testing the efficiency of hub location solvers. We have downloaded the dataset and computed an optimal assignment for the incomplete hub location problem with 5 hubs, fixed costs of [1000, 1000, 1000, 1000] and variable costs of [0.10, 0.04, 0.02, 0.04], using an enhanced Benders decomposition [104]. The result is an optimal assignment of hub links, access links, and direct links, minimising the transportation costs in the network. The network is visualised in Figure 9(d).
Subway network: the subway network in the greater area of Berlin was downloaded from the official website of operator VBB (see above). We have extracted all subway routes (GTFS code route type 400) and created a network in the customary way. The obtained network consists of 163 nodes and 165 links; we have converted it to an undirected network for the analysis in our study. The network is visualised in Figure 9(e).
Tram network: the tram network in the greater area of Berlin was downloaded from the official website of operator VBB (see above). We have extracted all tram routes (GTFS code route type 900) and constructed the network accordingly. The obtained network consists of 420 nodes and 489 links; we have converted it to an undirected network for the analysis in our study. The network is visualised in Figure 9(f).
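The network construction used for the bus, light rail, subway, and tram datasets can be sketched as follows. This is a simplified illustration rather than the actual processing pipeline: it assumes that each service has already been reduced to an ordered list of stop identifiers and that "connected by a service" means being consecutive stops on at least one such list; parsing of the GTFS files is not shown, and all names and stop labels are hypothetical.

```python
def build_undirected_network(trips):
    """Build an undirected adjacency map from ordered stop sequences.

    Assumption: two stations are linked if they are consecutive stops
    on at least one trip; repeated links are stored only once.
    """
    adj = {}
    for stops in trips:
        for s in stops:
            adj.setdefault(s, set())
        for a, b in zip(stops, stops[1:]):
            if a != b:                 # ignore accidental self-loops
                adj[a].add(b)
                adj[b].add(a)          # undirected: store both directions
    return adj

# Hypothetical stop sequences for two services sharing the link B-C.
trips = [["A", "B", "C"], ["C", "B", "D"]]
network = build_undirected_network(trips)
n_nodes = len(network)
n_links = sum(len(ns) for ns in network.values()) // 2
```

Storing neighbours in sets means that a link served in both directions, or by several services, is counted once, which matches the conversion to an undirected, unweighted network described above.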

Figure 2: An example of a network attack. The process starts with a target network (a), where we want to attack two nodes. In (b) we show an optimal disruption when being allowed to disrupt two nodes. The choice of  and  reduces the GC size from 13 to 5. In (c) we show a disruption induced by node failures of  and . In (d) we show a disruption induced by node failures of  and ; the majority of the network is still functional, as the GC size is reduced from 13 to 11 only.

Figure 3: Evolution of an attack generation on the tram network with DEG. The giant component is highlighted in red and bold. When attacking the 8th node, the size of the giant component is hardly changed.

Figure 4: Evolution of an attack generation on the tram network with BETWI. The giant component is highlighted in red and bold. When attacking the 8th node, the size of the giant component is already reduced to less than 20% of the original size.

Figure 5: Comparison of robustness curves for different attack heuristics on six transportation networks, as described in the Appendix.

Figure 6: Comparison of attack heuristics regarding R value (x-axis) and run time (y-axis) on six transportation networks, as described in the Appendix.

Figure 8: Comparison of difference between targeted/random attacks in real networks and their random counterparts (each of the 50 networks with the same number of nodes and links).The blue dashed curve is the kernel density estimation of TargetRandomQuotient in the random networks and the red vertical line indicates the TargetRandomQuotient for the real network.

Figure 9: Graphical representations of the six transportation networks considered in this study, including both node-based (airport and logistics) and link-based networks (bus, light rail, subway, and tram).

Table 1: Metrics of goodness-of-fit for the degree distributions of the two networks depicted in Figure 1.

Table 2: Examples of topological metrics calculated over three networks described in the Appendix. The three rightmost columns report, respectively, the raw metric value (i.e., as calculated on the original network), the average and standard deviation of the metric obtained in random equivalent networks, and the final Z-score.