Weighted Complex Network Analysis of Shanghai Rail Transit System

With increasing passenger flows and construction scale, the Shanghai rail transit system (RTS) has entered a new era of networked operation. The structure and properties of the RTS network have great implications for urban traffic planning, design, and management, so it is necessary to understand these network properties and their impacts. In this paper, the Shanghai RTS and its passenger flows are investigated using complex network theory. Both the topological and dynamic properties of the RTS network are analyzed, and the largest connected cluster is introduced to assess the reliability and robustness of the network. Simulation results show that the distribution of node strength exhibits power-law behavior and that the Shanghai RTS network shows a strong weighted rich-club effect. This study also indicates that intentional attacks are more detrimental to the RTS network than to the random weighted network, whereas random attacks cause slightly more damage to the random weighted network than to the RTS network. Our results provide a richer view of complex weighted networks in the real world and open possibilities for risk analysis and policy decisions by the RTS operation department.


Introduction
There is a rapidly growing literature on the complex networks present in transport systems, and complex network analysis is a useful method for analyzing the structure of such systems. For example, studies of worldwide airport networks have shown the small-world property and heavy-tailed power-law distributions [1]. For the public transportation systems in Poland, various network properties, such as the distribution of degree and the clustering coefficient, have been analyzed [2]. Further, the national highway network of Pakistan has been investigated with a weighted complex network analysis of travel routes on the network [3].
Rail transit systems are in essence physical networks composed of stations or stops linked by rails. Latora and Marchiori [4] found that the Boston public transportation system exhibits small-world behavior. Angeloudis and Fisk [5] studied the world's largest subway systems and found that systems with substantial shared track are less robust than dedicated line systems of similar size. Lee et al. [6] analyzed statistical properties and topological consequences of the Seoul subway system and found that the flow weight distribution exhibits power-law behavior. Soh et al. [7] contributed a complex weighted network analysis of travel routes on the Singapore rail and bus transportation systems. Zhang et al. [8] summarized the universal characteristics of urban rail transit networks.
Besides the topological characteristics of networks, the reliability and robustness of metro networks have also been widely studied. By looking at 33 metro systems in the world, Derrible and Kennedy [9] analyzed the complexity and robustness of metro systems and provided recommendations for increasing the robustness of metro networks. Based on complex network theory, Zhang et al. [10] studied the connectivity, robustness, and reliability of the Shanghai subway network. De-Los-Santos et al. [11] provided passenger robustness measures for a rail transit network.
Nevertheless, these studies, taking a complex network perspective, focus more on topological features than on dynamic traffic flow. The quantity of traffic in large transport infrastructures is fundamental for a full description of these networks [1]. Therefore, this paper aims to provide a richer and novel view of the statistical properties of weighted complex networks. Both the topological and dynamic characteristics are investigated using complex network theory. In addition, the largest connected cluster is introduced to assess the reliability and robustness of weighted complex networks. With proper methods, it is also possible to explore the correlation between passenger flows and the topological structure of the rail transit network, thereby providing theoretical guidance for urban rail transit planning, design, and management.

Weighted Networks Data
The Shanghai RTS network consists of $N = 286$ nodes denoting stations and $E = 317$ edges, each representing a link between two adjacent stations. The average degree of the network is $\langle k \rangle = 2E/N \approx 2.22$, while the maximal degree is 8. As already observed in previous studies [8,10], the topology of the network exhibits both scale-free and small-world properties. Datasets provided by the Shanghai Shentong Metro Company list the hourly inbound and outbound passenger flows for each RTS station and the passenger flows between adjacent stations. In this study, passenger flows during the morning peak hours from 7:00 to 9:00 on a typical weekday are analyzed, which is when the highest weekday volume is observed.
Because of the way the data were captured, this paper investigates not only the topological and functional properties of the RTS network but also the passenger flows between stations. In this work, travel is assumed to be bidirectional; hence the weight $w_{ij}$ of the edge between a pair of nodes (stations) $i$ and $j$ is defined as the sum of the passenger flows in both directions, so that $w_{ij} = w_{ji}$.
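The symmetric weight definition can be illustrated with a minimal Python sketch. The function name `symmetric_weights`, the station labels, and the flow values are hypothetical and chosen only for illustration; they are not taken from the Shentong dataset.

```python
from collections import defaultdict

def symmetric_weights(directed_flows):
    """Sum the flows of both directions so that w[(i, j)] == w[(j, i)]."""
    totals = defaultdict(float)
    for (origin, dest), flow in directed_flows.items():
        key = tuple(sorted((origin, dest)))  # one key per undirected edge
        totals[key] += flow
    weights = {}
    for (i, j), total in totals.items():
        weights[(i, j)] = total
        weights[(j, i)] = total
    return weights

# Hypothetical morning-peak flows between adjacent stations (passengers).
flows = {("A", "B"): 1200, ("B", "A"): 300, ("B", "C"): 800}
w = symmetric_weights(flows)
```

An edge with a flow recorded in only one direction (here B to C) simply keeps that flow as its weight.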

Weighted Network Analysis of Shanghai Rail Transit Network
In this section, we present a topological and dynamical analysis of the Shanghai RTS network.
When measuring the real network efficiency, it is more appropriate to use the physical distance $d_{ij}$ between stations than the network distance $l_{ij}$, which only counts links. In terms of the physical distance, the characteristic path length and diameter of the weighted network are $L = 24.49$ km and $D = 116.20$ km, respectively, and the network efficiency is 0.0352.
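As a rough illustration of how such distance-based quantities might be computed, the sketch below runs Dijkstra's algorithm over edge lengths in kilometres. It assumes the common definitions (mean shortest-path distance over ordered node pairs, the largest shortest-path distance, and the mean inverse shortest-path distance); the paper does not spell out its formulas, so these definitions, the function names, and the toy line are assumptions.

```python
import heapq

def shortest_distances(adj, source):
    """Dijkstra's algorithm; adj[u] is a dict {v: edge length in km}."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, length in adj[u].items():
            nd = d + length
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

def path_statistics(adj):
    """Return (characteristic path length, diameter, efficiency).

    Assumes a connected network, so every ordered pair has a finite distance.
    """
    nodes = list(adj)
    total, inverse_total, diameter = 0.0, 0.0, 0.0
    for u in nodes:
        dist = shortest_distances(adj, u)
        for v in nodes:
            if v != u:
                total += dist[v]
                inverse_total += 1.0 / dist[v]
                diameter = max(diameter, dist[v])
    pairs = len(nodes) * (len(nodes) - 1)
    return total / pairs, diameter, inverse_total / pairs

# Hypothetical three-station line: A-B is 1 km, B-C is 2 km.
adj = {"A": {"B": 1.0}, "B": {"A": 1.0, "C": 2.0}, "C": {"B": 2.0}}
path_length, diameter, efficiency = path_statistics(adj)
```

Replacing every edge length with 1 would give the corresponding topological (link-count) quantities.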

Degree and Strength Distribution.
In a topological network, the degree of a given node $i$ is the number of edges it shares with other nodes and is defined as $k_i = \sum_j a_{ij}$, where $a_{ij}$ is the adjacency matrix element (1 if nodes $i$ and $j$ are connected and 0 otherwise). In a weighted network, a more meaningful measure of the network properties is obtained by introducing the strength $s_i$, defined as
$$s_i = \sum_j a_{ij} w_{ij}.$$
This quantity combines node degree with edge weight and reflects the centrality of node $i$ in the weighted network.
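The two definitions can be computed side by side. The following is a minimal sketch, assuming the weights are stored as a dictionary that holds both directions of every edge; the function name and the toy values are hypothetical.

```python
def degree_and_strength(weights):
    """weights: dict {(i, j): w_ij} containing both (i, j) and (j, i)."""
    degree, strength = {}, {}
    for (i, _j), w in weights.items():
        degree[i] = degree.get(i, 0) + 1           # k_i counts incident edges
        strength[i] = strength.get(i, 0.0) + w     # s_i sums their weights
    return degree, strength

# Hypothetical symmetric weights on a three-station line A-B-C.
weights = {
    ("A", "B"): 1500.0, ("B", "A"): 1500.0,
    ("B", "C"): 800.0, ("C", "B"): 800.0,
}
k, s = degree_and_strength(weights)
```

Here station B has degree 2 and strength 2300, while A and C both have degree 1 but different strengths, which is the kind of difference the strength measure is meant to expose.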
The probability distribution of strength $P(s)$ exhibits heavy-tailed behavior and shows similarities with the degree distribution $P(k)$ (see Figure 1). Meanwhile, the strengths of stations appear scale-free (indicating the existence of hub nodes with very high traffic) and follow a power-law distribution with an exponent of approximately 0.487.
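One simple way to estimate a power-law exponent from binned data is an ordinary least-squares fit in log-log space. The paper does not state which fitting procedure was used, so the sketch below is only an assumption, and log-log regression is known to be a crude estimator for heavy-tailed data. The function name and the synthetic data are hypothetical.

```python
import math

def fit_power_law(xs, ys):
    """Fit y = c * x**alpha by least squares on (log x, log y); return (alpha, c)."""
    log_x = [math.log(x) for x in xs]
    log_y = [math.log(y) for y in ys]
    n = len(log_x)
    mean_x = sum(log_x) / n
    mean_y = sum(log_y) / n
    sxx = sum((a - mean_x) ** 2 for a in log_x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(log_x, log_y))
    alpha = sxy / sxx                         # slope in log-log space
    c = math.exp(mean_y - alpha * mean_x)     # prefactor from the intercept
    return alpha, c

# Synthetic data that follow an exact power law, used only to check the fit.
xs = [1, 2, 3, 4, 5, 6, 8]
ys = [3.0 * x ** 1.1 for x in xs]
alpha, c = fit_power_law(xs, ys)
```

The same helper could be applied to a binned strength distribution or to average strength as a function of degree, with all positive values.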
When each edge in the network is assigned a weight, a natural question is how weight correlates with the topological structure of the network. To shed more light on the relation between node strength and degree, the dependence of the strength $s_i$ on the degree $k_i$ is investigated. As can be seen in Figure 2, the average strength $s(k)$ of nodes with degree $k$ is well approximated by a power law, $s(k) \sim k^{\beta}$, with exponent $\beta \approx 1.1$. This reveals that node strength is positively associated with degree, consistent with the fact that the more connections a station has, the more traffic it handles. Because the exponent exceeds 1, it also implies that rail transit traffic grows faster than the number of connections. It should be noted, however, that in the Shanghai RTS network the node with the largest degree is Century Avenue, while the node with the highest strength is People's Square (see Figure 3). Moreover, although many nodes share a similar degree, the traffic handled by each station may differ significantly. Consequently, the dynamical properties of a network may differ significantly from its topological properties. To summarize, node strength and degree indicate the importance or connectivity of a node from different angles, and node strength is more appropriate for describing the real RTS network. In addition, the strength of nodes during peak hours provides a measure of the characteristic capacity of the facilities near a station.

Topological and Dynamical Clustering Coefficient.
The clustering coefficient measures the local cohesiveness of the network. It is defined for any node $i$ as the fraction of connected neighbors and can be expressed by the following equation:
$$C_i = \frac{2E_i}{k_i(k_i - 1)},$$
where $E_i$ refers to the number of edges between the neighbor nodes of node $i$. The average clustering coefficient $\langle C \rangle$ is the mean of the clustering coefficients of all nodes in the network and can be expressed by
$$\langle C \rangle = \frac{1}{N} \sum_i C_i.$$
It is important to note that $C_i$ is defined solely on topological grounds. Edge weights and their correlations may change our view of the hierarchical and structural organization of the network (see Figure 4). To resolve these incongruities, Barrat et al. [1] introduced the weighted clustering coefficient $C_i^{w}$, which takes into account the weights of edges:
$$C_i^{w} = \frac{1}{s_i(k_i - 1)} \sum_{j,h} \frac{w_{ij} + w_{ih}}{2} a_{ij} a_{ih} a_{jh}.$$
Similarly, the average weighted clustering coefficient is given by
$$\langle C^{w} \rangle = \frac{1}{N} \sum_i C_i^{w}.$$
If the weights are completely uncorrelated in the network, $C^{w} = C$ and $C^{w}(k) = C(k)$ [13]. However, weights in real-world networks are often correlated, leading to two possible situations. When $\langle C^{w} \rangle$ is larger than $\langle C \rangle$, triangles in the network are more likely to be formed by edges with larger weights. Conversely, when $\langle C^{w} \rangle$ is smaller than $\langle C \rangle$, the topological clustering is generated by edges with smaller weights.
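The topological and weighted clustering coefficients can be compared on a small example. The sketch below follows the weighted definition of Barrat et al. [1]; the function name, data layout, and toy network are hypothetical. Because the double sum over ordered neighbor pairs counts every triangle twice, the code visits each unordered pair once and adds $w_{ij} + w_{ih}$.

```python
from itertools import combinations

def clustering(neighbors, weight):
    """neighbors: {i: set of adjacent nodes}; weight: symmetric {(i, j): w_ij}."""
    c, cw = {}, {}
    for i, nbrs in neighbors.items():
        k = len(nbrs)
        if k < 2:
            c[i] = cw[i] = 0.0    # no pair of neighbors, so no triangle
            continue
        s = sum(weight[(i, j)] for j in nbrs)
        triangles, weighted = 0, 0.0
        for j, h in combinations(nbrs, 2):
            if h in neighbors[j]:                 # neighbors j and h are linked
                triangles += 1
                weighted += weight[(i, j)] + weight[(i, h)]
        c[i] = 2 * triangles / (k * (k - 1))
        cw[i] = weighted / (s * (k - 1))
    return c, cw

# Hypothetical network: triangle A-B-C with heavy edges plus a light spur A-D.
edge_list = [("A", "B", 5.0), ("A", "C", 5.0), ("A", "D", 1.0), ("B", "C", 2.0)]
neighbors, weight = {}, {}
for u, v, w in edge_list:
    neighbors.setdefault(u, set()).add(v)
    neighbors.setdefault(v, set()).add(u)
    weight[(u, v)] = weight[(v, u)] = w

c, cw = clustering(neighbors, weight)
mean_c = sum(c.values()) / len(c)
mean_cw = sum(cw.values()) / len(cw)
```

For node A the weighted value exceeds the topological one because its only triangle is carried by its two heaviest edges, which is the situation described for $\langle C^{w} \rangle > \langle C \rangle$.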
In this case, the weighted clustering coefficient of the Shanghai RTS network is $\langle C^{w} \rangle = 0.0024$, approximately twice as large as its topological counterpart $\langle C \rangle$ (see Table 1). As $\langle C^{w} \rangle$ is larger than $\langle C \rangle$, it can be concluded that triangles are more likely to be formed by edges with larger weights. However, although the Shanghai RTS network has the longest mileage in the world, its $\langle C \rangle$ and $\langle C^{w} \rangle$ are significantly smaller than those of other international metropolises such as Tokyo, New York, and London [8]. This result suggests that the connectivity of the network should also be considered when RTS networks are planned and designed.

Degree-Degree Correlation.
Degree-degree correlation describes how a node's degree relates to the average degree of its neighbors and reflects the node's connection preference [14]. If high-degree nodes in the network tend to link with each other, the network is considered to be assortative. On the contrary, if high-degree nodes tend to connect with low-degree nodes, the network is disassortative. The average nearest neighbors degree is one of the common indexes for measuring degree-degree correlation and is defined as
$$k_{nn,i} = \frac{1}{k_i} \sum_j a_{ij} k_j.$$
Further information can be gathered by inspecting the function $k_{nn}(k)$, which represents the average degree of the neighbors of all nodes with degree $k$ and can be expressed as
$$k_{nn}(k) = \frac{1}{N_k} \sum_{i:\, k_i = k} k_{nn,i},$$
where $N_k$ is the number of nodes with degree $k$. If $k_{nn}(k)$ is an increasing function of $k$, the network is assortative. If $k_{nn}(k)$ is a decreasing function of $k$, the network is disassortative. In weighted networks, the weighted average nearest neighbors degree is defined by
$$k_{nn,i}^{w} = \frac{1}{s_i} \sum_j a_{ij} w_{ij} k_j.$$
This quantity measures the affinity of a node to connect to high- or low-degree neighbors, according to the weights of the edges between the node and its neighbors. If $k_{nn,i}^{w}$ is larger than $k_{nn,i}$, the heavily weighted edges connect to the large-degree neighbors. Conversely, when $k_{nn,i}^{w} < k_{nn,i}$, the edges with smaller weights point to the neighbors with lower degree.
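The nearest-neighbor degree measures can be sketched in a few lines of Python. The function names and the toy network are hypothetical, and the code assumes every node has at least one neighbor and positive strength.

```python
def nearest_neighbor_degree(neighbors, weight):
    """Return (k_nn, k_nn_w) per node for a symmetric weight dict {(i, j): w_ij}."""
    knn, knn_w = {}, {}
    for i, nbrs in neighbors.items():
        k_i = len(nbrs)
        s_i = sum(weight[(i, j)] for j in nbrs)
        knn[i] = sum(len(neighbors[j]) for j in nbrs) / k_i
        knn_w[i] = sum(weight[(i, j)] * len(neighbors[j]) for j in nbrs) / s_i
    return knn, knn_w

def average_by_degree(neighbors, values):
    """Average a per-node quantity over all nodes that share the same degree."""
    grouped = {}
    for i, nbrs in neighbors.items():
        grouped.setdefault(len(nbrs), []).append(values[i])
    return {k: sum(v) / len(v) for k, v in grouped.items()}

# Hypothetical network: triangle A-B-C with heavy edges plus a light spur A-D.
edge_list = [("A", "B", 5.0), ("A", "C", 5.0), ("A", "D", 1.0), ("B", "C", 2.0)]
neighbors, weight = {}, {}
for u, v, w in edge_list:
    neighbors.setdefault(u, set()).add(v)
    neighbors.setdefault(v, set()).add(u)
    weight[(u, v)] = weight[(v, u)] = w

knn, knn_w = nearest_neighbor_degree(neighbors, weight)
knn_of_k = average_by_degree(neighbors, knn)
```

For node A the weighted value is larger than the unweighted one because its heavy edges lead to its better-connected neighbors B and C rather than to the leaf D.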
Returning to the Shanghai RTS network, Figure 5 shows that both the weighted network and the unweighted network are clearly assortative, at least up to $k = 6$. Surprisingly, both $k_{nn}^{w}(k)$ and $k_{nn}(k)$ decrease at $k = 8$. On closer inspection, the network has one and only one node with degree $k = 8$. Therefore, judging from the overall trend, the Shanghai RTS network can be considered assortative. This finding is in line with the findings of other studies mentioned above [1]. Also, since $k_{nn}^{w}(k) > k_{nn}(k)$ over the whole $k$ spectrum, the heavily weighted edges are linked to the large-degree neighbors; in other words, edges with larger passenger flows pass through better-connected locations.

Figure 6: Schematic representation of a weighted network [12].

Topological and Dynamical Rich-Club Coefficient.
The rich-club coefficient (RCC) is an important quantity introduced as a measure of the interconnectivity between hub nodes. For the set of $N_{>k}$ rich nodes whose degree is larger than $k$, the RCC is defined as
$$\phi(k) = \frac{2E_{>k}}{N_{>k}(N_{>k} - 1)},$$
where $E_{>k}$ represents the number of edges between the $N_{>k}$ rich nodes in the club. In other words, the RCC $\phi(k)$ measures the extent to which connections are established among rich nodes. If there are no edges between rich nodes, then $\phi(k) = 0$, whereas $\phi(k)$ reaches the value of 1 when all the possible edges are present. However, even without any correlation, an edge is more likely to be shared between two rich nodes than between two low-degree nodes. In particular, Colizza et al. [15] found that in an uncorrelated graph the RCC increases as $k^2$ for large $k$. To overcome this problem, they compared $\phi(k)$ measured on the real network with the corresponding $\phi_{\mathrm{null}}(k)$ obtained from an appropriate null model and proposed the ratio
$$\rho(k) = \frac{\phi(k)}{\phi_{\mathrm{null}}(k)}.$$
Similarly, Opsahl et al. [12] extended the definitions of $\phi(k)$ and $\rho(k)$ to weighted networks. They defined a rich-club coefficient for the weighted case as
$$\phi^{w}(r) = \frac{W_{>r}}{\sum_{l=1}^{E_{>r}} w_l^{\mathrm{rank}}},$$
where $r$ is a richness parameter, such as strength; $W_{>r}$ is the sum of the weights of the edges between nodes whose richness is larger than $r$; $E_{>r}$ is the number of edges between those nodes; and $w_l^{\mathrm{rank}}$ is the $l$th weight among the $E_{>r}$ strongest links within the whole network. For example, Figure 6 gives a schematic representation of a weighted network [12]. The weighted rich-club ratio is then
$$\rho^{w}(r) = \frac{\phi^{w}(r)}{\phi^{w}_{\mathrm{null}}(r)},$$
where $\phi^{w}_{\mathrm{null}}(r)$ refers to the weighted rich-club effect assessed on the proper null model. Weight reshuffle is used to produce the null model in this paper; thus the topology of the null model remains intact, and weights are randomly and globally redistributed over the links of the network [12]. When $\rho^{w}(r) > 1$, the original network has a positive weighted rich-club effect, with rich nodes being intertwined with other rich nodes more tightly than randomly expected. In contrast, if $\rho^{w}(r) < 1$, the links among rich nodes are weaker than expected from randomness.
In this case, the richness parameter $r$ is defined as the strength of nodes in order to examine whether the active nodes control the exchange of passenger flows. As can be seen in Figure 7, the weighted rich-club ratio $\rho^{w}$ grows remarkably as a function of the strength of stations. Therefore, the RTS network shows a strong weighted rich-club effect. This finding agrees with previous studies that reported the weighted rich-club effect in the worldwide airport network [12]. Active nodes that handle heavy passenger flows preferentially direct their efforts towards one another, and this tendency becomes more pronounced as the strength of nodes in the network increases. Connections among hub stations in the RTS network are characterized by large passenger flows.
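The weighted rich-club ratio with a weight-reshuffle null model can be sketched as follows. Two choices here are assumptions, not statements of the authors' procedure: the rich set is determined once from the observed strengths and kept fixed while the weights are reshuffled, and the null value is the mean over a fixed number of reshuffles. All function names, the number of trials, and the toy network are hypothetical.

```python
import random

def node_strengths(edges):
    """edges: {(i, j): w} with every undirected edge listed once."""
    strength = {}
    for (i, j), w in edges.items():
        strength[i] = strength.get(i, 0) + w
        strength[j] = strength.get(j, 0) + w
    return strength

def weighted_rich_club(edges, rich):
    """Weight among rich nodes divided by the weight of the equally many strongest edges."""
    club_weights = [w for (i, j), w in edges.items() if i in rich and j in rich]
    if not club_weights:
        return 0.0
    strongest = sorted(edges.values(), reverse=True)[:len(club_weights)]
    return sum(club_weights) / sum(strongest)

def rich_club_ratio(edges, r, trials=500, seed=1):
    """Observed coefficient over its mean under global weight reshuffling."""
    strength = node_strengths(edges)
    rich = {node for node, s in strength.items() if s > r}  # assumption: fixed rich set
    observed = weighted_rich_club(edges, rich)
    rng = random.Random(seed)
    keys, values = list(edges), list(edges.values())
    null_total = 0.0
    for _ in range(trials):
        rng.shuffle(values)                    # topology intact, weights permuted
        null_total += weighted_rich_club(dict(zip(keys, values)), rich)
    null_mean = null_total / trials
    return observed / null_mean if null_mean > 0 else float("nan")

# Hypothetical network: heavy triangle A-B-C with a light tail C-D-E.
edges = {("A", "B"): 10, ("B", "C"): 8, ("A", "C"): 9, ("C", "D"): 1, ("D", "E"): 2}
rich = {n for n, s in node_strengths(edges).items() if s > 10}
phi = weighted_rich_club(edges, rich)
rho = rich_club_ratio(edges, r=10)
```

In this toy case the three strongest edges all lie inside the rich set, so the observed coefficient is 1 and the ratio cannot fall below 1. Recomputing the rich set after each reshuffle would be an alternative design and may give different ratios.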

Reliability Analysis of the Shanghai RTS Network.
The reliability of the Shanghai RTS network is investigated in this section. Several topological parameters can be applied to evaluate the reliability and vulnerability of the RTS network, such as the largest connected cluster, network efficiency, and network size. In this paper, the largest connected cluster (LCC), which indicates the connectivity of the network, is adopted to illustrate the performance changes of the RTS network [16,17]. In unweighted networks, the largest connected cluster is defined as follows:
$$\mathrm{LCC} = \frac{N'}{N_0},$$
where $N'$ is the number of nodes in the largest connected subgraph after attacks and $N_0$ is the number of nodes in the largest connected subgraph of the initial network. To assess the reliability and robustness of weighted networks, the strength $s$ is integrated with the largest connected cluster, and $\mathrm{LCC}^{w}$ is defined by the following equation:
$$\mathrm{LCC}^{w} = \frac{S'}{S_0},$$
where $S'$ is the sum of the strengths of the nodes in the largest connected subgraph after attacks and $S_0$ is the sum of the strengths of the nodes in the largest connected subgraph of the initial network.
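Both cluster measures can be computed with a simple graph traversal. The sketch below assumes that "largest" refers to the component with the most nodes and that surviving nodes keep their original strengths after an attack; the text does not state either point explicitly. Function names and the toy line are hypothetical.

```python
def largest_component(neighbors, removed=frozenset()):
    """Node set of the largest connected component once `removed` nodes are deleted."""
    seen = set(removed)
    best = set()
    for start in neighbors:
        if start in seen:
            continue
        seen.add(start)
        component, stack = {start}, [start]
        while stack:                       # depth-first traversal
            u = stack.pop()
            for v in neighbors[u]:
                if v not in seen:
                    seen.add(v)
                    component.add(v)
                    stack.append(v)
        if len(component) > len(best):
            best = component
    return best

def lcc_measures(neighbors, strength, removed):
    """Return (LCC, weighted LCC) relative to the intact network."""
    before = largest_component(neighbors)
    after = largest_component(neighbors, removed)
    lcc = len(after) / len(before)
    lcc_w = sum(strength[n] for n in after) / sum(strength[n] for n in before)
    return lcc, lcc_w

# Hypothetical line A-B-C-D-E with edge weights 5, 4, 3, 1.
neighbors = {"A": {"B"}, "B": {"A", "C"}, "C": {"B", "D"}, "D": {"C", "E"}, "E": {"D"}}
strength = {"A": 5.0, "B": 9.0, "C": 7.0, "D": 4.0, "E": 1.0}
lcc, lcc_w = lcc_measures(neighbors, strength, removed={"C"})
```

Removing the middle station C leaves two components of equal size; the traversal keeps the first one it finds (A and B), which carries more than half of the original strength although it contains only 40% of the nodes.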
It is well known that when the RTS network is subjected to intentional or random attacks, some nodes and their connections fail, and the network may be compromised or even break down. If the failure ratio of nodes exceeds a critical threshold, the network disintegrates into smaller subnetworks and becomes disconnected. Cohen et al. [13,18] discussed the critical fraction $f_c$ of the Internet when subjected to random removal and intentional removal, respectively. Zhang et al. [10] carried out similar research for the subway network. Combining the previous literature with the characteristics of RTS network failure, the critical threshold $f_c$ of the fraction of removed nodes is defined as follows. When nodes are removed from the network one by one and the removed fraction $f < f_c$, the network is impaired but does not collapse; when $f \geq f_c$, the network essentially breaks down, meaning that the largest connected cluster for the weighted network is less than 0.05. In this section, the weighted largest connected cluster $\mathrm{LCC}^{w}$ is considered as a function of the fraction of removed nodes $f$.
In this paper, the critical fraction $f_c$ of the Shanghai RTS network is discussed and compared with that of a random weighted network produced by weight and link reshuffle (i.e., the ties of the network, with their attached weights, are reshuffled in such a way that the degree distribution $P(k)$ is preserved [19]). Figure 8 presents the performance changes of the largest connected cluster of the Shanghai RTS network under different attack strategies. Here, WR represents random attacks on the RTS network, WI represents intentional attacks on the RTS network, RWR denotes random attacks on the random weighted network, and RWI denotes intentional attacks on the random weighted network. Intentional attacks remove the nodes with the largest strength from the network one by one. From Figure 8, the performance and critical thresholds of the RTS network under the different attack strategies can be obtained. The critical thresholds are $f_c(\mathrm{WR}) = 0.5664$, $f_c(\mathrm{WI}) = 0.2203$, $f_c(\mathrm{RWR}) = 0.5210$, and $f_c(\mathrm{RWI}) = 0.2517$. Since a smaller critical threshold means that fewer removals are needed to break the network down, these values show that intentional attacks cause more damage than random attacks to both the RTS network and the random weighted network. Furthermore, intentional attacks are more detrimental to the RTS network than to the random weighted network, whereas random attacks cause slightly more damage to the random weighted network than to the RTS network.
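The attack experiment can be sketched as a loop that removes nodes in a given order and reports the removed fraction at which the weighted largest connected cluster first drops below 0.05. The assumptions are that the intentional order is fixed by the initial strengths, that surviving nodes keep their original strengths, and that the largest component is chosen by node count. The function names and the toy star network are hypothetical.

```python
import random

def largest_component_strength(neighbors, strength, removed):
    """Total strength of the largest (by node count) surviving component."""
    seen = set(removed)
    best = []
    for start in neighbors:
        if start in seen:
            continue
        seen.add(start)
        component, stack = [start], [start]
        while stack:
            u = stack.pop()
            for v in neighbors[u]:
                if v not in seen:
                    seen.add(v)
                    component.append(v)
                    stack.append(v)
        if len(component) > len(best):
            best = component
    return sum(strength[n] for n in best)

def critical_fraction(neighbors, strength, order, threshold=0.05):
    """Smallest removed fraction at which the weighted LCC falls below `threshold`."""
    s0 = largest_component_strength(neighbors, strength, set())
    removed = set()
    for count, node in enumerate(order, start=1):
        removed.add(node)
        remaining = largest_component_strength(neighbors, strength, removed)
        if remaining / s0 < threshold:
            return count / len(neighbors)
    return 1.0

# Hypothetical star network: one hub H linked to 25 leaves by unit-weight edges.
leaves = [f"L{n}" for n in range(25)]
neighbors = {"H": set(leaves)}
neighbors.update({leaf: {"H"} for leaf in leaves})
strength = {"H": 25.0}
strength.update({leaf: 1.0 for leaf in leaves})

intentional = sorted(neighbors, key=strength.get, reverse=True)  # strongest first
random_order = list(neighbors)
random.Random(0).shuffle(random_order)

fc_intentional = critical_fraction(neighbors, strength, intentional)
fc_random = critical_fraction(neighbors, strength, random_order)
```

In this toy star the intentional attack removes the hub first and breaks the network after a single removal, while a random order only does so once the hub happens to be drawn. A real comparison would average the random case over many shuffles.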

Conclusions and Future Research
In this paper, the Shanghai RTS network in China is investigated using complex network theory. The topological properties and the connectivity, robustness, and reliability of the RTS network are studied from a complex weighted network perspective.
Simulation results show that node strength follows a power-law distribution and exhibits heavy-tailed behavior; that is, the network possesses hub nodes with high traffic. Moreover, the dynamical properties of a weighted network may differ significantly from its topological properties. Although many nodes possess the same degree, the traffic handled by each rail station may differ significantly.
The weighted clustering coefficient is a measure of local cohesiveness and, when compared with the topological clustering coefficient, provides global information on the correlation between weights and topology. As $\langle C^{w} \rangle$ is larger than $\langle C \rangle$, it can be concluded that triangles are more likely to be formed by edges with larger weights. However, compared with other international metropolises, the local connectivity of the RTS network is very poor. Along with the weighted clustering coefficient, the weighted average nearest neighbors degree is also introduced in this paper, and this study finds that the heavily weighted edges are linked to the large-degree neighbors.
Consistent with previous research on the worldwide airport network, the RTS network has a positive weighted rich-club effect, indicating that active nodes handling heavy passenger flows are more likely to direct their efforts towards one another. Furthermore, this tendency becomes more pronounced as the strength of nodes in the network increases.
The reliability of the Shanghai RTS network is investigated in the last section, and the findings show that intentional attacks cause more damage than random attacks to both the RTS network and the random weighted network. Moreover, intentional attacks cause more damage to the RTS network than to the random weighted network, whereas the reliability of the Shanghai RTS network is slightly higher than that of the random weighted network when subjected to random attacks.
Complex weighted network analysis is a powerful tool for understanding the architecture of real weighted networks. Indeed, the analysis of weighted quantities and network reliability provides possibilities for risk analysis and policy decisions by revealing the structural organization of the network. From the previous literature, it is known that there are many other attack strategies and topological parameters for evaluating the reliability and robustness of networks. It would be interesting to study the performance changes of the weighted network under these other attack strategies. Another possible extension would be a comparative analysis of the topological and dynamic properties of the weekend and weekday RTS networks.

Figure 1: Degree and strength distributions for the RTS networks.

Figure 3: Distribution of strength as a function of the degree.

Figure 4: Examples of local configurations whose topological and weighted quantities are different [1].

Figure 5: Degree-degree correlation of the RTS network.

Figure 7: Weighted rich-club effect in the RTS network.

Figure 8: Performance changes of the largest connected cluster under different attack strategies.

Table 1: Statistical properties of the Shanghai RTS network.

The characteristic path length $L$ of the unweighted network is defined in terms of the network distance between two nodes $v_i$ and $v_j$, which is the minimum number $l_{ij}$ of links necessary to go from node $v_i$ to node $v_j$.