Robustness Measure of China’s Railway Network Topology Using Relative Entropy

This study examines the topology of China’s railway network from the perspective of robustness measurement. Relative entropy is used as the measure of robustness of the railway network topology. The entropy-based measure is found to provide more informative analyses than a traditional graph measure. The results indicate that the railway network in the 12th five-year plan has improved robustness compared to 2008 with respect to deliberate and random attacks.


Introduction
Since China's first railway was built in 1865, there have been tremendous achievements in its development over 145 years. Many new stations and lines have been constructed, and a huge network has been formed. The planning of a railway network should not only be evaluated by external indexes, but also analyzed in terms of its overall performance. Research on railway network topology treats the railway network as a complex system and studies its topological structure using statistical methods, in order to provide a theoretical basis for network planning, construction, and scientific research. The robustness of network topology is one of the most important and basic features of complex systems.
Research on robustness is based primarily on graph theory measures, which include average shortest path, node degree, clustering coefficient, and the number of interfaces. But these measures of robustness do not offer adequate amounts of information. This article uses relative entropy as the robustness measure of China's railway network topology. It also analyzes the relationship between relative entropy and ordering, employing information theory methods from a systems view. The robustness of China's railway network in 2008 and in the 12th five-year plan is analyzed.
At present, in robustness research on network topology, scholars usually take definitions from graph theory as measures of robustness. Albert et al. [1] compared the connectivity of random networks and scale-free networks (a typical complex network) and further pointed out that the robustness of the two networks is significantly different under random attacks and malicious attacks. Watts [2] and Strogatz [3] put forward the concept of small-world networks. Cohen et al. [4] studied the robustness of the Internet under random attacks and malicious attacks based on percolation theory. Bollobas and Riordan [5] made a mathematical analysis of the robustness of complex networks based on random graph theory. Bars et al. [6] proposed a robustness measure in accordance with the Gastinel-Kahan theorem for investigating stability conditions of discrete perturbed closed-loop systems. Rossi [7] presented new results for assessing the robustness of a configuration for multipurpose machines. The robustness measure of a configuration is obtained by assessing the minimum magnitude of disturbances affecting the forecast demand that may lead to breaking the deadline provided by the decision maker. Border et al. [8] found that the macroscopic structure of the Web is considerably more intricate than suggested by earlier experiments on a smaller scale.
The following research used graph theory and measured the invulnerability of the network topological structure from the angle of connectivity. Holme et al. [9] used the overall efficiency and the largest connected subgraph to measure network performance after an attack. Moreno et al. [10] found that the test of strength for complex networks such as the Internet was more stringent than other recent tests, such as the random removal of nodes. Xiao and Dong [11] defined a connectivity measure that represents the average number of communication paths between all nodes of the group, which is used to measure and evaluate the survivability of network topology. Lu and Dong [12] defined the invulnerability of the network topology as the total number of paths that may be established within a network group divided by the number of pathways between the groups. Criado et al. [13] considered it necessary for the network survivability index to have some important characteristics of the definition of network survivability. They then gave the invulnerability index a function definition and two kinds of survivability functions.
Railway networks have been a popular subject of system studies and measurement methods. Banik and Dasgupta [14] dealt with the use of Petri nets in modeling railway networks and designing appropriate control logic to avoid collisions. Feng et al. [15] used a spline interpolation method, a five-point numerical differentiation formula, and the method of least squares to solve for a synergistic coefficient. Tomoeda et al. [16] presented a rescheduling method of homogenization to alleviate the congestion of crowded trains. Khemakhem et al. [17] provided an accurate estimate of schedule robustness and introduced surrogate measures. Santiago et al. [18] found through empirical analysis that the SDH network operated by Telefónica in Spain shares remarkable topological properties with other real complex networks, such as the Internet. Wang et al. [19,20] analyzed methods for the safety of traffic systems.
For years, scholars have focused on using entropy to measure complex systems. Xu and Hu [21] proposed a new degree dependence entropy (DDE) descriptor to describe the degree dependence relationship and corresponding characteristics. Anand and Bianconi [22] defined the Shannon entropy of a network ensemble and proposed that it relates to the Gibbs and von Neumann entropies of network ensembles.
This paper concentrates on proposing a new robustness approach for measuring China's railway network. A model based on a simulated China's railway network is described. Results from an implementation with networks under attack are reported.

Model of Railway Geography Network
From a geographical point of view, the main elements which constitute a railway network are the railway station and the railway line. Therefore, the actual railway network can be abstracted as a geography network by viewing each railway station as a node and the railway line between two nodes as a side. With regard to the construction of the railway geography network, the following assumptions are needed: (1) the railway line is bidirectional; the railway geography network is undirected, as the directions of its sides are not taken into account; (2) the importance of the railway station and the railway line is relative; that is to say, the railway geography network is an unweighted network; (3) the carrying capacity of the railway line is not taken into consideration; (4) multiple stations in the same city are regarded as a single one (e.g., Beijing railway station, Beijing west railway station, Beijing south railway station, and Beijing north railway station in Beijing are all seen as "Beijing").
According to these assumptions, we can build a railway geography network. The network is an undirected and unweighted graph formed by the railway stations and the railway lines, which can be described as $G = (V, E)$, where $V$ represents the collection of railway stations and $E$ represents the collection of railway lines connecting the railway stations.
The following takes Urumqi station and its nearby stations as an example to illustrate this railway geographic network model. Figure 1 is the actual railway network. The railway geographic network can be worked out after it is abstracted, and a sketch can be derived. That is Figure 2, where $G = (V, E)$. From Figure 2, we can further obtain its adjacency matrix, shown in Figure 3.
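The abstraction from stations and lines to an adjacency matrix can be illustrated with a minimal sketch. The station labels and lines here are hypothetical placeholders, not the Urumqi data from the figures, and the function names are chosen only for illustration.

```python
def adjacency_matrix(stations, lines):
    """Build the symmetric 0/1 adjacency matrix of an undirected, unweighted network."""
    index = {name: i for i, name in enumerate(stations)}
    n = len(stations)
    matrix = [[0] * n for _ in range(n)]
    for u, v in lines:
        i, j = index[u], index[v]
        matrix[i][j] = 1  # a line is bidirectional, so both entries are set
        matrix[j][i] = 1
    return matrix


def node_degrees(matrix):
    """Degree of each node is the row sum of the adjacency matrix."""
    return [sum(row) for row in matrix]


# Hypothetical toy network: station "B" is a hub joined to "A", "C", and "D".
stations = ["A", "B", "C", "D"]
lines = [("A", "B"), ("B", "C"), ("B", "D")]
A = adjacency_matrix(stations, lines)
degrees = node_degrees(A)  # [1, 3, 1, 1]
```

Because the network is undirected and unweighted, the matrix is symmetric and contains only 0 and 1, so row sums and column sums give the same degrees.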
As there are about five thousand stations of different levels across the country, it is hard to obtain so much data. This paper studies some of the main stations according to China's railway line diagrams published by the Ministry of Railways in December 2008 and in the 12th five-year plan released in October 2012. These main stations are used to establish the 2008 railway geography network and the 12th five-year plan railway geography network.

Definition of Relative Entropy.
Entropy is a measure of the uncertainty of random variables, and it is also a measure of the amount of information required to describe random variables in an ordinary sense.
Relative entropy is a measure of the distance between two probability distributions. In statistics, it corresponds to the expected logarithm of the likelihood ratio. The relative entropy or Kullback-Leibler distance of two probability density functions $p(x)$ and $q(x)$ is defined as follows:
$$D(p \,\|\, q) = \sum_{x} p(x) \log \frac{p(x)}{q(x)}. \tag{1}$$
In accordance with the intrinsic meaning of relative entropy, relative entropy can be viewed as the ordering distance between the railway network and a standard network. Therefore, as long as the completely disordered network is determined, the ordering distance between the railway network in 2008 and a completely disordered network can be obtained by relative entropy. Then we can get quantitative data on the railway network robustness. Therefore, the relative entropy can be seen as a measure of the ordering of the railway network in 2008. The robustness of the railway network can be obtained by analyzing the variation of relative entropy when the railway network is attacked.

Railway Network with a Completely Disordered Distribution-Maximum Information Entropy.
Next, the completely disordered distribution of the railway network will be determined. Information theory only gives the function for solving the maximum entropy. This paper takes China's railway network as a complex system. Topology information entropy is at its maximum when the railway network has a uniform structure, and at that moment the ordering of the system is worst; when the railway network is configured with a star structure, topology information entropy is at its minimum, and the ordering of the system is best. The construction of the railway network is a process that moves from disorder to order. Therefore, the condition in which ordering is at its worst is the worst case for the railway network. This paper selects the distribution with maximum information entropy as the completely disordered distribution of the railway network.
The relative entropy is the ordering distance between the railway network being studied and the railway network with a completely disordered distribution.This means that the larger relative entropy is, the more ordered the railway network is.
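When the railway network is completely disordered, all nodes have the same degree distribution $1/N$, where $N$ is the number of nodes; that is to say, $p(v_i) = 1/N$ for any node $v_i$ in the node set $V$, $i = 1, 2, \ldots, N$. At this moment the maximum information entropy is the following:
$$H_{\max} = -\sum_{i=1}^{N} \frac{1}{N} \log \frac{1}{N} = \log N.$$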
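A short worked step may help connect the maximum entropy to the relative entropy. As a sketch, assume a discrete distribution $p$ over $N$ outcomes and a uniform reference distribution $u(x) = 1/N$; the symbols $N$, $u$, and $H$ are notation chosen here for illustration and are not taken from the source. Substituting $u$ into the Kullback-Leibler definition gives:

```latex
\begin{aligned}
D(p \,\|\, u) &= \sum_{x} p(x) \log \frac{p(x)}{1/N} \\
              &= \log N \sum_{x} p(x) + \sum_{x} p(x) \log p(x) \\
              &= \log N - H(p), \qquad H(p) = -\sum_{x} p(x) \log p(x).
\end{aligned}
```

Under this assumption the relative entropy is the gap between the maximum entropy $\log N$ and the entropy of $p$: it is 0 when $p$ is itself uniform and grows as $p$ becomes more concentrated, which matches the reading that a larger relative entropy means a more ordered network.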

Measure of Railway Geography Network Robustness.
Accepting the definition of relative entropy as a measure of robustness requires a distribution as a basis. This paper employs the node degree distribution. Degree is defined as the number of sides adjacent to a node; namely, $k_i = \sum_j a_{ij} = \sum_j a_{ji}$, where $a_{ij}$ is an entry of the adjacency matrix, which represents the total number of sides connected to node $i$. The degree distribution function $p(k)$ represents the probability that any node's degree is $k$. In other words, the degree distribution is the proportion of nodes with degree $k$ among the total number of nodes in the network. As an important geometrical property of the network, the degree distribution can describe some characteristics of the network.
If the degree distribution density function of the railway network is expressed as $p(k)$, then the robustness measure $R$ can be obtained from expression (1) as follows:
$$R = D(p \,\|\, q) = \sum_{k} p(k) \log \frac{p(k)}{q(k)},$$
where $q$ is the completely disordered distribution.
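As a minimal sketch of how such a measure might be computed, the following code builds the empirical degree distribution and takes its relative entropy against a uniform reference. The choice of support for the uniform reference (every integer degree from the smallest to the largest observed value) is an assumption made for illustration, as are the function names; the source does not specify these details.

```python
import math
from collections import Counter


def degree_distribution(degrees):
    """Return {k: fraction of nodes whose degree is k}."""
    n = len(degrees)
    return {k: count / n for k, count in Counter(degrees).items()}


def relative_entropy(p, q):
    """D(p||q) = sum_k p(k) * ln(p(k) / q(k)); terms with p(k) = 0 contribute 0."""
    return sum(pk * math.log(pk / q[k]) for k, pk in p.items() if pk > 0)


def robustness(degrees):
    """Relative entropy of the degree distribution against a uniform reference."""
    p = degree_distribution(degrees)
    # Assumption: the disordered reference is uniform over the observed degree range.
    support = range(min(degrees), max(degrees) + 1)
    q = {k: 1.0 / len(support) for k in support}
    return relative_entropy(p, q)


# Hypothetical degree sequence of a four-node toy network (one hub, three leaves).
toy_degrees = [1, 3, 1, 1]
r = robustness(toy_degrees)
```

With a uniform reference over three degree values, the result for the toy sequence equals ln 3 minus the Shannon entropy of the degree distribution, so the value depends directly on how the reference support is chosen.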

Proof of Measure Testability of China Railway Geography Network
To define a new measure, it is necessary to analyze whether or not it is a measurable function. Next, we analyze whether the relative entropy function is a measurable function by using Lebesgue measure theory from the theory of functions of a real variable.

Definition and Theorem.
First, a few basic definitions and theorems.
Definition 1 (measurable set). Let $E \subset \mathbb{R}$; if for any set $T \subseteq \mathbb{R}$ we have
$$m^*(T) = m^*(T \cap E) + m^*(T \cap E^c),$$
then $E$ is Lebesgue measurable, or measurable for short. Here $m^*$ is the outer measure, $E^c$ is the complement of $E$, and $\mathcal{M}$ is the collection of all measurable sets.
Definition 2 (measurable function). Let $E$ be a measurable set and let $f$ be a function from $E$ to $\mathbb{R}$. If for any interval $I \subset \mathbb{R}$ we have
$$f^{-1}(I) = \{x \in E : f(x) \in I\} \in \mathcal{M},$$
then $f$ is Lebesgue measurable, or measurable for short.
Definition 3 (measure). A measure is a set function $\mu(\cdot)$ defined on a collection of subsets of a given set $\Omega$, taking values in the extended real numbers $\mathbb{R} \cup \{\pm\infty\}$ and meeting certain conditions. These conditions should be consistent with the following properties.
(3) Countable additivity: for any sequence of pairwise disjoint sets $A_1, A_2, \ldots$, $\mu\left(\bigcup_{n} A_n\right) = \sum_{n} \mu(A_n)$.
Theorem 4. Let $(X, \mathcal{F})$ be a measurable space, and let $f : X \to \mathbb{R}^*$ be a function defined on the measurable space. Then $f$ is a measurable function if and only if $f^{-1}(B) \in \mathcal{F}$ for any $B \in \mathcal{B}(\mathbb{R}^1)$.

Construction of Measurable Sets
Step 1. Since the relative entropy is actually a function of the probability distribution of a set of random variables, a probability space $(\Omega, \mathcal{F}, P)$ is constructed first. When it comes to the Chinese railway network topology, the probability space can be built based on the node degree distribution. Assuming the selection of a degree is the event $\omega_i$, $i = 1, 2, \ldots, n$, where $n$ is the maximum degree of the nodes, the sample space is as follows:
$$\Omega = \{\omega_1, \omega_2, \ldots, \omega_n\}.$$
The value of the degree is derived from a function $X(\omega)$ on the sample space. From this, we can use the subsets of the sample space to make a $\sigma$-field $\mathcal{F}$. The proportion of degree $k_i$ is the probability $p(k_i)$ that a node's degree is $k_i$. For this reason, set $P(\{\omega_i\}) = p_i$, $i = 1, 2, \ldots, n$; if $A \in \mathcal{F}$ ($A \subset \Omega$), then $P(A) = \sum_{\omega_i \in A} P(\{\omega_i\})$. So a probability space $(\Omega, \mathcal{F}, P)$ is set for the random phenomenon of the degree distribution of the Chinese railway network topology.
Step 2. According to the probability distribution and the definition of probability measure, a new probability space $(\mathbb{R}, \mathcal{B}, P_X)$ is induced by the above random variable, where for $B \in \mathcal{B}$ ($\mathcal{B}$ refers to the Borel collection) we have $P_X(B) = P(X^{-1}(B))$. The function $P_X(\cdot)$ is the probability distribution of the random variable $X$, or the distribution for short. According to the theory of functions of a real variable, this set function $P_X(\cdot)$ is a probability measure on $\mathcal{B}$.
Here, note that $A$ and $B$ are both sets of events, and the difference between them is that $A$ refers to a set of a few values of node degrees, while $B$ refers to the case in which the values of node degrees are real numbers. In other words, $A$ is a simple event and $B$ belongs to the set of real numbers. However, all real numbers are basic events, and all Borel sets are events; hence, $B$ is also an event.
Thus, a new probability space $(\mathbb{R}, \mathcal{B}, P_X)$ is induced from the probability space $(\Omega, \mathcal{F}, P)$. $P_X(\cdot)$ represents a probability measure of event $B$.
Step 3. As the relative entropy is a function of the probability distribution of the random variable, the relative entropy function $H = \sum_{x \in X} p(x) \log(p(x))$ can be written as $H = f(P_X)$.

Proof of Testability.
Next, we will prove that the relative entropy is a measurable function.
At this point, it is proven that the relative entropy is a measurable function.

Analysis of Railway Network Robustness in ''The 12th Five-Year Plan''
To test and verify the utility of the new robustness measure, two networks were selected. The first one is a line diagram of China's railway network in 2008, in which 313 nodes and 472 sides were selected; the other is a line diagram of China's railway network in "the 12th five-year plan", in which 420 nodes and 753 sides were selected (Figure 4). By comparing the relative entropy changes of the railway network in "the 12th five-year plan" and the railway network in 2008 under two different types of attacks, we can learn each configuration's robustness and its pros and cons. The superiority of the relative entropy measure can be shown by comparing relative entropy with the clustering coefficient of graph theory. The clustering coefficient is defined as follows: if node $i$ is connected to other nodes by $k_i$ sides, then there are at most $k_i(k_i - 1)/2$ sides between these $k_i$ neighboring nodes. So, the clustering coefficient $C_i$ is the ratio of the actual number of sides $E_i$ between these $k_i$ nodes to the maximum number of possible sides; that is, $C_i = 2E_i / (k_i(k_i - 1))$.
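The clustering coefficient definition can be sketched in a few lines. The network here is stored as a dictionary mapping each node to the set of its neighbors, and nodes with fewer than two neighbors are given a coefficient of 0; both the data layout and that convention are assumptions made for illustration, since the source does not say how such nodes are handled.

```python
def clustering_coefficient(adj, node):
    """C_i = 2 * E_i / (k_i * (k_i - 1)), with E_i the number of sides among neighbors."""
    neighbors = adj[node]
    k = len(neighbors)
    if k < 2:
        return 0.0  # assumed convention: undefined ratio is treated as 0
    # Count each side between two neighbors once by requiring u < v.
    links = sum(1 for u in neighbors for v in neighbors if u < v and v in adj[u])
    return 2 * links / (k * (k - 1))


def average_clustering(adj):
    """Mean of the clustering coefficients over all nodes."""
    return sum(clustering_coefficient(adj, n) for n in adj) / len(adj)


# Hypothetical toy network: a triangle A-B-C with an extra node D attached to A.
toy = {
    "A": {"B", "C", "D"},
    "B": {"A", "C"},
    "C": {"A", "B"},
    "D": {"A"},
}
avg = average_clustering(toy)
```

In the toy network, node A has three neighbors with one side among them, so its coefficient is 1/3, while B and C each score 1 and D scores 0, giving an average of 7/12.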

Initial State.
By inputting the initial data of the railway geography network in 2008 and in "the 12th five-year plan" into expression (6), we can determine that the relative entropy of China's railway network in 2008 is 0.6501 and in "the 12th five-year plan" is 0.6931. The calculated average clustering coefficient in 2008 is 0.0699 and in "the 12th five-year plan" is 0.0918. From this, the ordering and the degree of aggregation of nodes in "the 12th five-year plan" are higher than in 2008. The conclusions derived from relative entropy and from the average clustering coefficient are consistent.

Deliberate Attack.
In this part, we will analyze the robustness of the railway geography network in 2008 and in "the 12th five-year plan" under deliberate attack. Deliberate attack refers to attacks made on important stations on purpose. In the geography network model, it takes the form of deleting nodes in order of degree, from largest to smallest. The result is shown in Figure 5.
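A deliberate attack of this kind can be sketched as repeated removal of the highest-degree node. Whether degrees are recomputed after each removal is not stated in the source; this sketch assumes they are, and the function names, the `fraction` parameter, and the toy network are hypothetical.

```python
def remove_node(adj, node):
    """Delete a node and every side attached to it (modifies adj in place)."""
    for neighbor in adj.pop(node):
        adj[neighbor].discard(node)


def deliberate_attack(adj, fraction):
    """Remove the given fraction of nodes, always taking the current highest degree."""
    attacked = {n: set(nbs) for n, nbs in adj.items()}  # work on a copy
    n_remove = int(fraction * len(attacked))
    removed = []
    for _ in range(n_remove):
        # Assumption: degrees are re-evaluated after every removal.
        # Ties are broken by dictionary order, since max returns the first maximum.
        target = max(attacked, key=lambda n: len(attacked[n]))
        remove_node(attacked, target)
        removed.append(target)
    return attacked, removed


# Hypothetical toy network: hub "H" joined to four stations, plus one side a-b.
toy_net = {
    "H": {"a", "b", "c", "d"},
    "a": {"H", "b"},
    "b": {"H", "a"},
    "c": {"H"},
    "d": {"H"},
}
remaining, removed = deliberate_attack(toy_net, 0.4)  # removes "H", then "a"
```

After each removal, a robustness measure such as relative entropy or the average clustering coefficient can be recomputed on the remaining network to trace a curve against attack strength.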
From the point of view of relative entropy, similarities of the two railway networks under deliberate attack are as follows.
(1) Along with increasing attack strength, the relative entropy of the railway network declines, and the ordering gets lower and lower. (2) The relative entropy fluctuates because the deliberate attack proceeds from nodes of larger degree to nodes of smaller degree. When nodes of large degree are deleted, the railway network is divided into several smaller networks. The relative entropy is a superposition of the ordering of these small networks, so there will be a rebound.
(3) When deliberate attacks increase to a certain extent, the railway network will be completely disordered. When every node degree is 1, each node has equal status, and the network then remains in a completely disordered state until the attack is over.
Compared to the railway network in 2008, the railway network in "the 12th five-year plan" has the following advantages. (1) During the early stage, when the deliberate attack is about 10 percent, the relative entropy decreases more slowly, and the railway network in "the 12th five-year plan" retains higher ordering and stronger robustness while the deliberate attack is relatively small. (2) The change in relative entropy of the railway network in "the 12th five-year plan" is more stable than that in 2008. (3) The railway network in "the 12th five-year plan" becomes quickly disordered when the deliberate attack is at 70 percent, whereas the railway network in 2008 becomes quickly disordered when the deliberate attack is at 50 percent. (4) The railway network in "the 12th five-year plan" becomes completely disordered when the deliberate attack is at 90 percent, whereas the railway network in 2008 becomes completely disordered when the deliberate attack is at 80 percent.
From the point of view of the average degree of aggregation, the average degree of aggregation in both 2008 and "the 12th five-year plan" declines under deliberate attack. When attacks reach a certain extent, the clustering coefficient is 0. The decreasing trend of the average degree of aggregation in "the 12th five-year plan" is more stable than that in 2008, and its clustering coefficient persists longer than that in 2008.

Random Attack.
Following the analysis of deliberate attack, the robustness of the railway geography network in 2008 and in "the 12th five-year plan" under random attack is analyzed. Random attack refers to aimless attacks on stations. In the geography network model, a random attack deletes nodes at random. The result is shown in Figure 6.
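A random attack can be sketched as deleting a uniformly chosen subset of nodes. The `seed` parameter, the function name, and the toy network are hypothetical and are included only so that a run can be repeated; the source does not describe how its random removals were drawn.

```python
import random


def random_attack(adj, fraction, seed=None):
    """Remove the given fraction of nodes, chosen uniformly at random."""
    rng = random.Random(seed)  # local generator, so global random state is untouched
    attacked = {n: set(nbs) for n, nbs in adj.items()}  # work on a copy
    n_remove = int(fraction * len(attacked))
    # Sorting first makes the draw reproducible for a fixed seed.
    targets = rng.sample(sorted(attacked), n_remove)
    for node in targets:
        for neighbor in attacked.pop(node):
            attacked[neighbor].discard(node)
    return attacked, targets


# Hypothetical toy network: a ring of six stations labeled s0..s5.
ring = {f"s{i}": {f"s{(i - 1) % 6}", f"s{(i + 1) % 6}"} for i in range(6)}
survivors, hit = random_attack(ring, 0.5, seed=1)
```

Because the outcome of a single random draw varies, averaging a robustness measure over several seeds at each attack strength is a common way to obtain a smoother curve.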
From the point of view of relative entropy, both the railway networks in 2008 and "the 12th five-year plan" have common downward trends and volatility changes and end in complete disorder, just the same as with those under deliberate attack.
In the meantime, the railway network in "the 12th five-year plan" under random attacks still has the following advantages. (1) When a random attack is less than 10 percent, the ordering of the railway network in "the 12th five-year plan" is higher than that in 2008. (2) The railway network in "the 12th five-year plan" is more stable than that in 2008, with stronger robustness. (3) The railway network in "the 12th five-year plan" becomes quickly disordered when random attacks reach 90 percent, whereas the railway network in 2008 becomes quickly disordered when random attacks are at 80 percent or so. (4) The railway network in "the 12th five-year plan" arrives at complete disorder when the random attack is at 98 percent, whereas the railway network in 2008 becomes completely disordered when the random attack is at 95 percent.
From the point of view of the average degree of aggregation, the average degree of aggregation in 2008 and "the 12th five-year plan" declines under random attack.The clustering coefficient will be 0 in the end, which is the same as that under deliberate attack.
The decreasing trend of average degree of aggregation in "the 12th five-year plan" is more stable than that in 2008.The clustering coefficient becomes 0 later than that in 2008.
By comparing the conclusions above, we can see that, on the one hand, relative entropy as a robustness measure yields conclusions similar to those of the graph theory measure. On the other hand, the relative entropy measure can provide the following information. (1) The overall trend of networks under attack. (2) The rate of change of the measure at the initial attack stage. (3) When networks become rapidly disordered. (4) When networks reach a state of almost complete disorder at the last stage. The common graph theory measure can only show the first and the fourth items. So, taking relative entropy into account can provide more accurate figures and a greater amount of information.
Furthermore, since it represents the order of networks, relative entropy can be used to analyze robustness from a systems perspective, which means that relative entropy has a measuring function beyond graph theory.

Conclusions
This paper deals with a robustness measure approach for railway networks. First, the entropy-based robustness measure approach is established. It was found that the entropy-based method provides greater amounts of information than the graph theory measure. Based on the method proposed in this paper, stronger robustness was found for the 12th five-year plan railway network as compared to that of the 2008 network, with respect to deliberate attack. Furthermore, the 12th five-year plan railway network has greater stability and ordering than the 2008 network.
Further study can be conducted to compare the relative entropy measure to other graph theory measures and existing research on the robustness of railway traffic networks and transport networks in China.
Network robustness is not unique to railway networks; the relative entropy measure can also be used with other types of complex networks, such as biological networks, internet networks, and social networks. This method of assessing the relative entropy is also universal.

Figure 4: Circuit diagram of China's railway network in 2008 and in the "12th five-year Plan."