Controllability of Train Service Network

A train service network is the network form of a train service plan. The controllability of the train service plan determines whether the plan can be recovered in emergencies. We first build a small-world model for the train service network and analyze its scale-free character. Then, based on linear network controllability theory, we discuss the adaptability of the LB model to train service network controllability analysis. We improve the LB model, construct the train service network, and define the meaning of the driver nodes based on immune propagation and cascading failure in the train service network. An algorithm that searches for the driver nodes by turning the train service network into a bipartite graph is proposed and applied to the train service network. We analyze the controllability of the train service network of China with this method, and the results of the computing case show its feasibility.


Introduction
As emergencies become more frequent, line planning in emergencies is becoming an increasingly important topic. A line plan is a relatively stable operating technical document that does not need revising for a year or even longer. But when a serious emergency occurs, the railway suffers greatly and the line plan must be adjusted. The line plan can be turned into a train service network, and the service network can be studied with emerging network theories. The characteristics of the train service network, such as brittleness, robustness, and controllability, must be studied before designing the line plan with network theories.
Railway networks first attracted researchers' attention through studies of the complex characteristics of the railway geographical network and the service network, such as the fractal dimension, the average network distance, and the average clustering coefficient. Benguigui and Daoud studied the fractal characteristics of the railway network structure and discussed the relation between the railway fractal dimension and the development of city size [1]. Sen et al. studied the structural properties of the Indian railway network in the light of recent investigations of the scaling properties of different complex networks. They analyzed the existing data rigorously and found that the Indian railway displayed small-world properties, with an average path length of 2.16 and an average clustering coefficient of 0.69 [2]. Latora and Marchiori proposed a more refined analysis of the subway network and gave valuable insights into the general characteristics of real transportation networks, eventually providing a picture in which the small-world property returns as the underlying construction principle [3]. Seaton and Hackett calculated the clustering coefficient, path length, and average vertex degree of two urban train line networks and compared the results with theoretical predictions for appropriate random bipartite graphs [4]. Zhao et al. proved the train service network of China to be a small-world network with an average path length of 3.27 and an average clustering coefficient of 0.83 [5]. Liu and Song built a topology model for the Guangzhou subway network and calculated its complexity indices. They concluded that the network displayed stochastic characteristics and analyzed its reliability [6]. We proved that the node degrees of the China railway network follow a power-law distribution and analyzed the dynamic characteristics of the train service network [7], improved the networked timetable stability with bilevel programming [8], and evaluated timetable stability with information entropy theory [9]. Wang et al. analyzed the characteristics of the train service plan and constructed a two-layer optimization model to design the line plan of the trains [10]. Wang et al. presented the definition of the invulnerability of the railway network and constructed an evaluation model with two indices: network accessibility and local accessibility. They analyzed the effect of snow disaster on the railway network [11]. We built a capacity-load model to simulate the evolution process of the railway service network under deliberate attack and random failure, respectively, and analyzed the brittleness of the train service network [12].
All of these references concern characteristics of the railway geographical network or the train service network, but none of them addresses the controllability of the train service network. Control technology can optimize a complex network and improve its performance. For example, it can improve the robustness and the stability of the train service network.
Generally, the goal of controlling a complex network is to make the network reach an expected state by inputting signals into selected control nodes. There are two basic problems: the feasibility and the availability of complex network control [13]. Feasibility refers to whether the complex network can be controlled. Availability concerns reducing the cost of control. It is generally believed that the research achievements of control theory should be introduced into the control of complex networks.
Liu et al. built a theoretical model for the control of complex networks [14], which is called the LB model. With it, we can find the minimum set of driver nodes for any linear time-invariant complex network. Two conclusions have been drawn: (a) the size of the minimum set of driver nodes depends greatly on the degree distribution; (b) the driver nodes tend not to be the high-degree nodes, called hubs. The second conclusion runs against common sense [15], for it is usually taken for granted that the highly influential nodes should be the hubs. Kitsak et al. pointed out that the hub nodes are not the most influential nodes in network dissemination [16].
The LB model is designed for the controllability study of time-invariant complex networks, and its assumption about signal transmission is too simple. It takes for granted that the signals received by any node are transferred to the connected nodes. However, many nodes block signal transmission in real networks, such as the nodes in a disease spreading network [17], the disabled nodes in a power grid, and the nodes in a behavior communication network [18]. Such nodes are called immune nodes (IND for short). In other words, immune nodes are the nodes that can block control signals and other signals or information in the network.
Lü et al. refined the complex network control model based on propagation immunization. They adopted four methods, belonging to the random immunization strategy and the targeted immunization strategy, to determine the immune nodes and analyzed the controllability of 14 real networks [19]. This paper constructs the train service network and studies its controllability. We define the controllability of the train service network and propose several indices to measure it. A controllability model for the train service network is built based on the constructed immune nodes. Then the Beijing-Shanghai data are listed and we analyze the controllability of the train service network.

Train Service Network
The network of train flow in China is a complex network in which the stations are the nodes. We draw an arc between two stations if the same train passes through both of them. We introduce weights on the arcs between the station nodes, which denote the number of times the two stations are connected by the relevant trains.
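The construction rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the train names and stop lists are hypothetical, arcs are taken as directed from an earlier stop to a later stop, and the function name `build_service_network` is not from the paper.

```python
from collections import defaultdict
from itertools import combinations


def build_service_network(train_stops):
    """Return {(a, b): weight}: one arc per station pair served by the same train.

    Assumption: arcs are directed from an earlier stop to a later stop, and the
    weight counts how many trains connect the two stations.
    """
    weights = defaultdict(int)
    for stops in train_stops.values():
        # combinations() keeps list order, so a always precedes b on the route.
        for a, b in combinations(stops, 2):
            weights[(a, b)] += 1
    return dict(weights)


# Hypothetical stop lists, for illustration only.
trains = {
    "T1": ["A", "C", "D"],
    "T2": ["A", "B", "C"],
}
network = build_service_network(trains)
for arc, weight in sorted(network.items()):
    print(arc, weight)
```

Stations A and C are served by both hypothetical trains, so the arc between them gets weight 2, while every other arc gets weight 1.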
In [5], the authors pointed out that the train flow network has a short average path length and a large cluster coefficient. The relationship between the average distance of the network $\langle d \rangle$ and the number of nodes $N$ is $\langle d \rangle \propto \ln(N)$. This shows that the train flow network is a small-world network with the scale-free characteristic. That work does not address the robustness and vulnerability of the train flow network.

Small-World Model for Train Service Network.
There are three basic statistical characteristics of a complex network according to complex network theory: the average path length of the network ($L$), the cluster coefficient ($C$), and the degree with its distribution $P(k)$. The distance $d_{ij}$ between two nodes $i$ and $j$ is defined as the number of arcs on the shortest path between the two nodes. The average path length of the network $L$ is defined as follows:
\[ L = \frac{2}{N(N-1)} \sum_{i > j} d_{ij}, \]
where $N$ is the number of nodes.
Many networks show a clustering effect that reflects the clustering degree of the nodes in the network. Consider a node $i$ that is connected with $k_i$ arcs. The $k_i$ nodes at the other ends of these arcs are called the neighbors of node $i$. There are at most $k_i(k_i - 1)/2$ arcs among the $k_i$ nodes. Let $E_i$ be the real number of arcs among the $k_i$ nodes. The ratio of $E_i$ to $k_i(k_i - 1)/2$ is called the cluster coefficient $C_i$. That is as follows:
\[ C_i = \frac{2E_i}{k_i(k_i - 1)}. \]
The cluster coefficient of the whole network is the average of the cluster coefficients of all the nodes in the network. That is as follows:
\[ C = \frac{1}{N} \sum_{i=1}^{N} C_i. \]
We can see that the relation between node degree and number of nodes shows the power-law distribution characteristic; see Figure 1. The train service network in China is a typical scale-free network.
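As a small illustration of the two small-world statistics, the following Python sketch computes the average path length and the cluster coefficient of a toy undirected network. The four-node graph and the function names are assumptions made for this example; the average is taken over connected node pairs.

```python
from collections import deque


def bfs_distances(adj, source):
    """Number of arcs on the shortest path from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def average_path_length(adj):
    """Mean shortest-path distance over all connected pairs of distinct nodes."""
    total, pairs = 0, 0
    for s in adj:
        dist = bfs_distances(adj, s)
        for t in adj:
            if t != s and t in dist:
                total += dist[t]
                pairs += 1
    return total / pairs


def cluster_coefficient(adj, i):
    """Arcs that exist among the neighbors of i, divided by k_i (k_i - 1) / 2."""
    nbrs = adj[i]
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for u in nbrs for v in nbrs if u < v and v in adj[u])
    return 2 * links / (k * (k - 1))


# Hypothetical toy network: a triangle A-B-C with a pendant node D attached to C.
adj = {"A": {"B", "C"}, "B": {"A", "C"}, "C": {"A", "B", "D"}, "D": {"C"}}
L = average_path_length(adj)
C = sum(cluster_coefficient(adj, i) for i in adj) / len(adj)
print(L, C)
```

For this toy graph the six node pairs have distances 1, 1, 2, 1, 2, and 1, so the average path length is 8/6, and the node cluster coefficients are 1, 1, 1/3, and 0, so the network cluster coefficient is 7/12.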

Scale-Free
In the China railway network, 90% of the nodes have a load of more than 10. The nodes with a load of more than $10^6$ make up less than one $10^6$th of all nodes. Nodes with a load of more than $10^7$ are rare; see Figure 2. This implies that the train service network in China can meet the requirements of most passengers. The majority of passengers can reach more than 10 stations without any transfer. In addition, there are a certain number of large stations with heavy load, which means that passengers can reach many stations from these stations without transfer and have many trains to choose from when travelling from them. This is the basis for studying the controllability of the train service network.

Linear Network Controllability
To steer a network toward a certain state, we must control the states of all the nodes in the network. It is necessary to input control signals to some nodes. The nodes that need input signals are the driver nodes. The set that contains the minimum number of driver nodes needed to control the network is the minimum driver node set (MDNS). Once we obtain the MDNS, we have solved the most important problems in network control: feasibility and availability. Liu et al. proposed a method to find the MDNS [14], based on the structural controllability theorem [20] and the minimum inputs theorem [14].
Liu et al. proved that the controllability problem can be solved by searching for a maximum matching of the directed graph [14]. We transform a directed graph $G(A)$ into a bipartite graph $H(A) = (V_A^{+} \cup V_A^{-}, \Gamma)$, where $\Gamma$ is the set of the arcs. Then we search for a maximum matching of this bipartite graph. The nodes in $V_A^{-}$ that are pointed at by an arc of the maximum matching are the matched nodes. The remaining nodes are the unmatched nodes. If the number of unmatched nodes is not 0, the driver nodes are the unmatched nodes; otherwise any single node in the network can serve as the driver node, and the number of driver nodes is 1.
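The search for driver nodes through a maximum matching can be sketched as follows. This is a minimal Python illustration, not the implementation used in this paper: it uses the simple augmenting-path method for bipartite matching, and the six-node arc list is hypothetical and is not taken from the figures.

```python
def find_driver_nodes(nodes, arcs):
    """Return (driver nodes, matching) for a directed graph.

    Each arc (u, v) links the "+" copy of u to the "-" copy of v in the
    bipartite graph; nodes whose "-" copy stays unmatched are driver nodes.
    """
    out_nbrs = {u: [] for u in nodes}
    for u, v in arcs:
        out_nbrs[u].append(v)

    match_of = {}  # "-" copy of v  ->  "+" copy of u, for matched arcs (u, v)

    def try_augment(u, visited):
        # Search for an augmenting path that starts at the "+" copy of u.
        for v in out_nbrs[u]:
            if v in visited:
                continue
            visited.add(v)
            if v not in match_of or try_augment(match_of[v], visited):
                match_of[v] = u
                return True
        return False

    for u in nodes:
        try_augment(u, set())

    unmatched = [v for v in nodes if v not in match_of]
    # With a perfect matching, any single node can serve as the driver node.
    drivers = unmatched if unmatched else [nodes[0]]
    return drivers, match_of


# Hypothetical directed network with six nodes, for illustration only.
nodes = ["v1", "v2", "v3", "v4", "v5", "v6"]
arcs = [("v1", "v2"), ("v1", "v6"), ("v2", "v5"), ("v3", "v4"),
        ("v3", "v6"), ("v4", "v6"), ("v5", "v6")]
drivers, matching = find_driver_nodes(nodes, arcs)
print(drivers, len(matching))
```

In this hypothetical network v1 and v3 have no incoming arc, so they can never be matched, while v2, v4, v5, and v6 are all matched by a matching of size 4. The recursive search is adequate for small examples; a network with thousands of stations would call for an iterative matching algorithm such as Hopcroft-Karp.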

LB Model Adaptability in Train Service Network Controllability Analysis.
The LB model is the most valuable model for analyzing the controllability of complex networks. Some of its conclusions help us understand the control of complex networks, such as the finding that driver nodes tend not to be the hubs. The LB model must be studied further before it is applied to the controllability of real complex networks, for real complex networks have their own special characteristics [18]. So the model must be revised when it is used to analyze the controllability of a specific complex network. Moreover, the LB model is designed for linear time-invariant complex networks, while real networks are usually nonlinear and time-variant. The train service network is a time-variant complex network, which cannot be analyzed with the LB model directly. So the LB model is improved in this paper to adapt it to the train service network.

A Simple Computing Example Based on the LB Model.
The stations on the railway network are taken as the nodes of the train service network. The edges of the network are defined as follows: if a train passes through two stations, there is an edge between the two station nodes. Thus the train service network can be built.
The nodes of the train service network are the stations Beijingnan, Dezhoudong, Jinanxi, Xuzhoudong, Nanjingnan, and Shanghaihongqiao. According to the construction method, train G1 causes three arcs to be added to the train service network. As further arcs are added, the train service network becomes more complicated, as shown in Figure 4.
Then the arcs generated by train G113 are added to the train service network. The resulting train service network is shown in Figure 5. So the train service network controllability problem can be transformed into the problem of searching for the matched nodes and the unmatched nodes. That is to say, the key is to search for a maximum matching in the bipartite graph derived from the train service network.
The train service network in Figure 5 is turned into a bipartite graph; see Figure 6. $V_1^{+}$ and $V_1^{-}$ are for Beijingnan, $V_2^{+}$ and $V_2^{-}$ denote Dezhoudong, $V_3^{+}$ and $V_3^{-}$ stand for Jinanxi, $V_4^{+}$ and $V_4^{-}$ are for Xuzhoudong, $V_5^{+}$ and $V_5^{-}$ denote Nanjingnan, and $V_6^{+}$ and $V_6^{-}$ represent Shanghaihongqiao. The set of red edges in Figure 7 is a maximum matching of the bipartite graph for the sample train service network. We can see that $V_2$, $V_4$, $V_5$, and $V_6$ are the matched nodes of the network in this case. So the driver nodes of the sample train service network are $V_1$ and $V_3$. Another maximum matching is shown in Figure 8. There, too, $V_1$ and $V_3$ are the unmatched nodes and therefore the driver nodes.

Improved LB Model for Train Service Network Controllability Analysis

Definition of Immune Nodes.
To adapt the LB model to the analysis of train service network controllability, we improve it. In emergencies we take measures to control the train service network and try to keep the edges in the service network, for they imply that passengers can reach the destination station without any transfer. For example, we try to keep the edge from Xuzhoudong to Shanghaihongqiao when an emergency occurs at Xuzhoudong station or on the railway section from Xuzhoudong to Shanghaihongqiao; see Figure 6. We take technical measures to make the train finish its journey as planned. But it is possible that the train is terminated at Xuzhoudong station. Then there is a blocking-up phenomenon in the train service network, and Xuzhoudong station is called an immune node.
Although blocking-up phenomena are very common in the train service network, it is very difficult to tell which nodes are the immune nodes, for we cannot forecast where emergencies will occur and which stations and sections they will affect. A popular way to describe the possibility of an emergency is to set an occurrence probability. The typical LB model then cannot describe the train service network controllability, because there is uncertainty in the maintenance of the edges. Let $p_{ij}$ be the probability that the edge from node $i$ to node $j$ is retained.

Cascading Failure.
Emergencies may reduce the capacities of the stations and sections in the railway network. The load on the affected nodes must be redistributed to other nodes to meet the passengers' requirements. Cascading failure may occur when the loads are redistributed. The quality of the network is measured by the average efficiency of the train service network.
Definition. The signals that disable nodes when they are input into the nodes are called disabling signals. They are denoted as $u^{-}(t) = (u_1^{-}(t), u_2^{-}(t), \ldots, u_M^{-}(t))^{T}$. A disabled node is sure to affect its connected edges and its connected nodes. Some nodes may be disabled because of the load transferred from already disabled nodes. The disabling process may spread in the network, causing cascading failure. When the load of each node is below its capacity, the network reaches a stable state.
When a node is disabled, it becomes unreachable and cannot transfer the control signals any further. It is then necessary to input control signals into the disabled nodes.
Let $e_{ij}$ be the weight of the edge from node $i$ to node $j$, with $e_{ij} \in [0, 1]$. The larger $e_{ij}$ is, the higher the transfer efficiency of the control signals will be. The matrix $\{e_{ij}\}$, an $N \times N$ matrix, is the efficiency value matrix. The initial value of every matrix element is 1. Let $L_i(t)$ be the load of node $i$ at time $t$. It is the number of efficiency-optimal paths through the node at time $t$. An efficiency-optimal path is a path that has the largest $\varepsilon^{*} = \left(\sum 1/e_{mn}\right)^{-1}$ among all the paths that run from node $i$ to node $j$, where the sum runs over the edges $(m, n)$ on the path. The load allowance is $C_i = \alpha L_i(0)$ ($\alpha \ge 1$), where $\alpha$ is an allowance parameter. To simulate the dynamic evolution process of the train service network and analyze the controllability, the evolution equation is defined as follows:
\[ e_{ij}(t+1) = \begin{cases} e_{ij}(0)\,\dfrac{C_i}{L_i(t)}, & L_i(t) > C_i, \\ e_{ij}(0), & L_i(t) \le C_i. \end{cases} \]
When a node is deleted from the network, its load is distributed to the other nodes of the network. This may cause other nodes to be overloaded and disabled. Another round of load redistribution is then carried out, and cascading failure occurs. The damage degree can be measured by the decrease of the average efficiency of the network relative to its value in normal status.
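The load redistribution process can be sketched in Python as follows. This is a hedged illustration, not the simulation code used in this paper. It assumes a capacity-load rule in which the arcs leaving an overloaded node are degraded by the ratio of the node's allowance to its current load, counts one efficiency-optimal path per ordered node pair (endpoints included) as load, and takes the average efficiency as the mean of the path efficiencies over all ordered node pairs. The wheel-shaped test network and all function names are hypothetical.

```python
import math


def most_efficient_paths(e):
    """Floyd-Warshall on arc costs 1/e_ij; returns (cost matrix, next-hop matrix)."""
    n = len(e)
    cost = [[0.0 if i == j else (1.0 / e[i][j] if e[i][j] > 0 else math.inf)
             for j in range(n)] for i in range(n)]
    nxt = [[j if cost[i][j] < math.inf else None for j in range(n)]
           for i in range(n)]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if cost[i][k] + cost[k][j] < cost[i][j]:
                    cost[i][j] = cost[i][k] + cost[k][j]
                    nxt[i][j] = nxt[i][k]
    return cost, nxt


def loads_and_efficiency(e):
    """Node loads (optimal paths through each node) and average network efficiency."""
    n = len(e)
    cost, nxt = most_efficient_paths(e)
    load = [0] * n
    total = 0.0
    for i in range(n):
        for j in range(n):
            if i == j or cost[i][j] == math.inf:
                continue
            total += 1.0 / cost[i][j]  # efficiency of the optimal path i -> j
            node = i
            while node != j:  # walk the path and count every node on it
                load[node] += 1
                node = nxt[node][j]
            load[j] += 1
    return load, total / (n * (n - 1))


def cascade(e0, removed, alpha, steps=20):
    """Delete one node, iterate the update rule, return (efficiency before, after)."""
    n = len(e0)
    load0, eff0 = loads_and_efficiency(e0)
    capacity = [alpha * l for l in load0]  # allowance = alpha * initial load
    base = [[0.0 if removed in (i, j) else e0[i][j] for j in range(n)]
            for i in range(n)]
    e = [row[:] for row in base]
    for _ in range(steps):
        load, _eff = loads_and_efficiency(e)
        for i in range(n):
            # Overloaded nodes pass signals on less efficiently.
            factor = capacity[i] / load[i] if load[i] > capacity[i] else 1.0
            for j in range(n):
                e[i][j] = base[i][j] * factor
    _load, eff = loads_and_efficiency(e)
    return eff0, eff


# Hypothetical wheel network: ring 0..4 plus hub 5 connected to every ring node.
n = 6
wheel = [[0.0] * n for _ in range(n)]
for i in range(5):
    for a, b in ((i, (i + 1) % 5), (i, 5)):
        wheel[a][b] = wheel[b][a] = 1.0
before, after = cascade(wheel, removed=5, alpha=1.1)
damage = (before - after) / before  # relative efficiency loss (assumed measure)
print(before, after, damage)
```

Because the degraded efficiencies never exceed their initial values, the efficiency after the removal cannot exceed the efficiency before it, so the relative loss lies between 0 and 1. A larger allowance parameter leaves more spare capacity and therefore fewer overloaded nodes.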

The Computing Case of Train Service Network Controllability Model
We can see that the key to analyzing the controllability of the train service network is to determine which nodes should be affected to control the train service network. Because emergencies occur randomly, we simulate their effect on the nodes and determine the immune nodes with a randomized policy. The relation between node importance and the controllability of the train service network is studied. The importance of the nodes is measured with three indices of complex networks: degree, betweenness, and closeness.
(i) Closeness is the index that measures how accessible a node is from the other nodes in the network. Its value is the reciprocal of the total distance from the node to all the other nodes. The larger the closeness is, the smaller the total distance is, which means that the node is closely connected to the rest of the network.
(ii) Betweenness is the quotient of the number of shortest paths passing through the node and the total number of shortest paths in the network. Betweenness describes the influence of a node in the network. The larger the betweenness is, the more important the node is. In a social relation network, for example, betweenness reflects the position of a person, which is valuable for finding and protecting key human resources.
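The two indices can be illustrated with a short Python sketch that follows the definitions above literally. It assumes an unweighted, undirected network and counts a path as passing through a node only when the node is an intermediate node of the path; the three-node path network and the function names are hypothetical.

```python
from collections import deque


def shortest_path_counts(adj, s):
    """Distances from s and the number of shortest paths from s to each node."""
    dist, sigma = {s: 0}, {s: 1}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                sigma[v] = 0
                queue.append(v)
            if dist[v] == dist[u] + 1:
                sigma[v] += sigma[u]
    return dist, sigma


def closeness(adj, v):
    """Reciprocal of the total distance from v to all reachable nodes."""
    dist, _ = shortest_path_counts(adj, v)
    total = sum(dist.values())
    return 1.0 / total if total > 0 else 0.0


def betweenness(adj, v):
    """Shortest paths with v as an intermediate node, over all shortest paths."""
    info = {s: shortest_path_counts(adj, s) for s in adj}
    dist_v, sigma_v = info[v]
    through = total = 0
    for s in adj:
        dist_s, sigma_s = info[s]
        for t in adj:
            if s == t or t not in dist_s:
                continue
            total += sigma_s[t]
            if v in (s, t) or v not in dist_s or t not in dist_v:
                continue
            if dist_s[v] + dist_v[t] == dist_s[t]:
                through += sigma_s[v] * sigma_v[t]
    return through / total if total else 0.0


# Hypothetical path network A - B - C.
adj = {"A": ["B"], "B": ["A", "C"], "C": ["B"]}
print(closeness(adj, "B"), betweenness(adj, "B"))
```

In this path network there are six shortest paths between ordered node pairs, two of which run through B, so the betweenness of B is 1/3, and its closeness is 1/2 because the total distance from B to A and C is 2.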
We take the railway system of China as a computing case. There are 3361 stations that offer passenger transportation service in the China railway system, and 2469 passenger trains operated on the railway network in 2014. We did the experiments with two strategies. One is to simulate the reconstruction of the train service network by deleting nodes from the network intentionally. The other is to delete nodes from the train service network randomly. When deleting nodes from the network intentionally, we select 5% of the nodes as the immune nodes and check the increase of the driver nodes in the network. We did the experiments 10 times and calculated the average value. When deleting nodes from the network, we identify the driver nodes according to the closeness and betweenness values. Figure 9 shows the increase of the driver nodes under the two conditions.
We can see that the driver nodes increase more rapidly when the nodes are deleted intentionally than when they are deleted randomly. The increase ratio of the driver nodes ranges from 18% to 23% when the nodes are deleted intentionally, and from 13% to 16% when the nodes are deleted from the train service network randomly.
According to the improved LB model, the number of driver nodes in the train service network is 235, which is 6.99% of all the nodes; see Figure 10. Among them, 197 stations have degrees lower than the mean degree, a ratio of 83.83%; see Figure 11. In the whole train service network, the ratio of nodes with degrees lower than average is 81.91%; see Figure 12. We can see that the driver nodes tend to be the nodes with lower degrees; see Figure 13.

When the node (station) with the largest degree is deleted from the network, some other nodes will be disabled. The overall performance of the network $E(G)$ will decrease compared to the performance in normal status, $E(G_0) = 0.041$. The smaller $\alpha$ is, the more seriously $E(G)$ will be reduced. If $\alpha$ is not too small, random deletion of nodes will not affect the efficiency of the train service network, which means that the controllability is stable; see Figure 14. When $\alpha$ reaches 1.6, there is almost no change in $E(G)$.

Conclusion
We identify the immune nodes with degree, betweenness, and closeness and observe the effect on the control availability of the train service network. Compared with random immunization, immune nodes identified by the three indices (the hubs ranked highest by the three indices are taken as immune nodes) increase the difficulty of controlling the train service network. The driver nodes tend to be nodes of lower degree. Another conclusion is that nodes with high betweenness and closeness can affect the controllability of the train service network by blocking the transmission of signals, which agrees with findings on other kinds of complex networks. This study of train service network controllability extends the engineering application of complex network control theory. It can provide valuable supporting information for designing the train service plan, especially in emergencies. Future research will focus on control and optimization methods for the train service network.

Figure 1: Distribution of the station degree of China railway. Note: the number on the horizontal axis is the degree of the stations. The number on the vertical axis is the number of stations that have the corresponding degree.

Figure 7: A maximum matching of the bipartite graph for the sample train service network.

Figure 8: Another maximum matching of the bipartite graph for the sample train service network.

Figure 9: Increase of driver nodes under intentional and random deletion.

Figure 10: Ratios of the driver nodes and the nondriver nodes in the train service network.

Figure 11: Ratios of driver nodes with different degrees.

Figure 12: Ratios of nodes with different degrees.

Figure 13: Comparison of ratios of the nodes and driver nodes.

Figure 14: Relation between train flow network efficiency and overload coefficient under the condition of randomly deleting and intentionally deleting the station nodes.
In the bipartite graph, $V_A^{+}$ and $V_A^{-}$ denote the node sets corresponding to the rows and the columns of the network matrix, respectively.