The Bilevel Design Problem for Communication Networks on Trains: Model, Algorithm, and Verification

This paper proposes a novel method for designing train communication networks. First, we give a general description of the design problem. Then, drawing on bilevel programming theory, we build the cost-reliability-delay model (CRD model), which consists of two parts: the physical topology part seeks the network with maximum reliability under a cost constraint, while the logical topology part finds the communication paths with minimum delay on the physical topology delivered by the upper level. We also propose a solution method for the CRD model that combines the genetic algorithm with the Floyd-Warshall algorithm. Finally, we use an example to verify the accuracy and effectiveness of the CRD model and then apply the method to a train with six carriages.


Introduction
With the development of network communication techniques, an increasing number of new networks have been proposed to replace the traditional train communication network (TCN). Whatever functions these systems perform, each of them is built around a communication network, which is their common core. The topology of such a network is decisive: an improper topology lowers reliability and increases delay, which in turn affects the operation of the whole network.
For a network used for train control or safety monitoring, reliability and delay are two decisive indicators. The information transferred on such networks contains control commands (e.g., brake and accelerate) or safety-monitoring information (e.g., speed and axle temperature). If information transmission fails because of either link breaks or delays, the performance of the whole train is seriously affected. Two types of measures can address this problem at the network design stage: one is to add links to increase reliability, and the other is to devise reasonable data transmission paths to reduce network delay.
Most existing algorithms optimize reliability and delay separately. Simultaneous optimization, however, better fits the demands of automated design and improves efficiency. Starting from that point, this paper proposes a new algorithm that designs the physical topology under a cost constraint so as to achieve maximum reliability. Based on the resulting topology, the algorithm then devises reasonable data transmission paths and obtains the minimum delay.
Many algorithms currently address the communication network design problem, but most of them discuss only parts of the issue, such as equipment for a specific purpose [1,2] or other characteristics of networks [3,4], and do not treat the demands of network design systematically as a whole. Some algorithms emphasize topology design with constrained reliability [5,6]. Some explore, under a cost limitation, the physical topology with maximum reliability [7] but do not consider delay requirements. Several algorithms are dedicated to improving the real-time performance of networks with a known physical topology by replacing communication protocols and certain equipment [8]. Some studies applied top-down design to train networks but focused narrowly on the hardware [9].
Although some research did try to optimize the physical and logical topology simultaneously [10], it proposed neither systematic models nor corresponding solutions. Such simultaneous optimization is usually treated as a multiobjective process [11], but that complicates the solution and makes it hard to guarantee the ultimate optimal solution.
The present paper uses bilevel programming theory [12] to optimize both aspects simultaneously, that is, to address reliability and real-time performance together under a cost limitation. The upper level is physical topology design, defined as improving the performance of the network by changing the connections between nodes. The lower level is logical topology design, defined as obtaining the highest efficiency by optimizing the transmission paths. We treat the upper-level problem as a discrete network design problem (DNDP) [13] and the lower-level problem as a minimum delay problem (MDP).
The rest of the paper is structured as follows. In Section 2, we state the bilevel programming problem for train communication networks and propose the CRD model. In Section 3, the solution of the CRD model is described and explained. Specifically, the solution is based on the genetic algorithm and the Floyd-Warshall algorithm, and the key issue is how to pass data effectively between these two algorithms. In Section 4, we use a numerical example to verify the accuracy of the CRD model and then apply it to a practical design problem. In Section 5, conclusions and prospects are provided.

A Programming Model for the Discrete Network Design Problem on Trains
Depending on the design demands of communication networks on trains, the design problem of train communication networks (DPTCN) can have various constraints and objectives. Nodes and link elements depend on the characteristics of the particular problem of interest. For example, to build an economical control network on a train, we have to design, under a cost constraint, a network that has maximum reliability and minimum delay. If, on the other hand, we are going to establish a safety-monitoring network that requires reasonable reliability and is insensitive to cost, we need to devise, under a reliability constraint, a network that costs the least and has the maximum bandwidth. This paper focuses on the first situation and builds the cost-reliability-delay model (CRD model) to achieve the goal.

Basic Ideas of the Bilevel Programming Model for the Discrete Network Design Problem.
For the communication networks in train control systems, reliability and delay are two vital indicators. To ensure smooth transmission of information between carriages, every carriage has a node responsible for communication. In traditional train communication networks such as WTB, all the nodes are arranged in a line; that is, each node connects only with the nodes in the adjacent carriages. This is mainly due to the limited transmission distance of the industrial field bus. In practice, such a topology severely restricts network performance. One defect is that, if one of the links breaks, the whole network shuts down. Another disadvantage is that data traffic at either end of the network becomes extremely crowded. The gradually introduced Ethernet helps to overcome these defects: we can take advantage of its longer transmission distance and go beyond the traditional internode connections. With effective connections, both reliability and delay can be optimized.
In fact, a DPTCN with limited delay is a DNDP in the communication field. It can be described as a leader-follower game. The leaders are the network designers, and the followers are the control signals and the safety-monitoring information, which can freely choose communication links. The leaders design the physical topology and thereby influence the logical topology of the information stream, so that the most appropriate network can be obtained. This interaction can be represented by the following bilevel programming problem:

$$(U)\quad \min_{x} F(x, y) \quad \text{s.t.}\quad H(x, y) \le 0,$$

where $y = y(x)$ is implicitly defined by

$$(L)\quad \min_{y} f(x, y) \quad \text{s.t.}\quad h(x, y) \le 0.$$

The bilevel programming model contains two submodels. $(U)$ is the upper-level problem, that is, the physical topology planning problem. $(L)$ is the lower-level problem, that is, the logical topology planning problem. In $(U)$, $F$ and $x$ are the objective function and the decision vector, respectively, and $H$ is the constraint on the decision vector. In $(L)$, $f$ and $y$ are the objective function and the decision vector, respectively, and $h$ is the constraint on the decision vector.
The equipment of a train communication network can be treated as consisting of two kinds of elements, nodes and links, and the CRD model is established on that basis. In this model, the physical topology part changes the existing links through an optimization program so as to satisfy the redundancy requirement and lower the construction cost. The logical topology part concerns how to distribute the information stream so as to obtain the minimum transmission time. Information from the physical topology design is passed to the logical topology through a transfer function $y = y(x)$. Based on that information, the logical topology part designs reasonable paths for data transmission.

The Physical Topology Optimization.
The present paper focuses on the optimization problem of changing existing links to improve reliability under a cost constraint. Reliability analyses of communication networks generally assume that the reliability of nodes and links is random, independent, known, and static [14]. Because communication on trains is bidirectional, the network can be described by a graph with undirected links. Let $G = (N, L, A)$ be a network without parallel links, that is, a nonredundant network, and without outliers (isolated nodes). The optimization problem is then to maximize the network reliability subject to a cost ceiling and to the reliability-cost relations of the equipment, where $R(G)$ is the reliability of the whole network; $r(l_i)$ is the reliability of link $l_i$; $r(n_j)$ is the reliability of node $n_j$; $\Omega$ is the set of all operational states of the network, here taken for $G_{\mathrm{opt}}$; $C(G)$ is the maximum cost of the whole network; $c(l_i)$ is the unit cost (per distance) of link $l_i$; $d_i$ is the length of link $l_i$; $c(n_j)$ is the cost of node $n_j$; $L$ is the given set of links; $N$ is the given set of nodes; $\varphi_l$ is the functional relationship between link reliability and unit cost; and $\varphi_n$ is the functional relationship between node reliability and cost. At any time instant, only some links of $G$ might be operational. A state of $G$ is a subgraph $(N, L')$, where $L'$ is the set of operational links; $x_i = 1$ if $l_i \in L'$ and $x_i = 0$ otherwise.
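As a sketch, the upper-level problem can be written in the following form, which is consistent with the definitions in the text and with the constraint numbers (4) to (6) cited there. Here $R(G)$ is the network reliability, $r(\cdot)$ the reliability of a link $l_i$ or node $n_j$, $c(\cdot)$ the corresponding cost, $d_i$ the length of link $l_i$, $C(G)$ the cost ceiling, and $\Omega$ the set of operational link sets $L'$. The symbol choices, the names $\varphi_l$ and $\varphi_n$, and the criterion that the network is operational only when every node works and the operational links connect all nodes are assumptions of this sketch, not statements confirmed by the source.

```latex
\begin{align}
\max\; R(G) &= \Big[\prod_{n_j \in N} r(n_j)\Big]
  \sum_{L' \in \Omega}
  \Big[\prod_{l_i \in L'} r(l_i)\Big]
  \Big[\prod_{l_i \in L \setminus L'} \bigl(1 - r(l_i)\bigr)\Big] \tag{3} \\
\text{s.t.}\quad
  & \sum_{l_i \in L} c(l_i)\, d_i + \sum_{n_j \in N} c(n_j) \le C(G), \tag{4} \\
  & r(l_i) = \varphi_l\bigl(c(l_i)\bigr), \tag{5} \\
  & r(n_j) = \varphi_n\bigl(c(n_j)\bigr). \tag{6}
\end{align}
```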
In the upper level, the network designers' objective is to obtain the maximum reliability by changing links. Constraint (4) ensures that the overall construction cost does not exceed the specified maximum cost. Constraints (5) and (6) describe the functional relationship between the reliability and the cost of the equipment.
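For small networks, the reliability objective can be evaluated exactly by enumerating link states. The following Python sketch is a minimal illustration, assuming that the network counts as operational only when every node works and the operational links connect all nodes. The function names and the example values are hypothetical and are not taken from the paper.

```python
from itertools import product


def is_connected(n, links):
    """Return True if the undirected graph on nodes 0..n-1 is connected."""
    adj = {i: set() for i in range(n)}
    for a, b in links:
        adj[a].add(b)
        adj[b].add(a)
    seen, stack = {0}, [0]
    while stack:
        u = stack.pop()
        for v in adj[u] - seen:
            seen.add(v)
            stack.append(v)
    return len(seen) == n


def network_reliability(n, link_rel, node_rel):
    """Exact reliability by enumerating all 2**len(link_rel) link states.

    link_rel: dict mapping a link (a, b) to its reliability.
    node_rel: list with the reliability of each node.
    Assumption: the network works only if all nodes work and the
    operational links keep every node reachable.
    """
    links = list(link_rel)
    total = 0.0
    for state in product((0, 1), repeat=len(links)):
        up = [link for link, s in zip(links, state) if s]
        if not is_connected(n, up):
            continue
        prob = 1.0
        for link, s in zip(links, state):
            prob *= link_rel[link] if s else 1.0 - link_rel[link]
        total += prob
    for r in node_rel:
        total *= r
    return total


# Hypothetical three-node examples with perfectly reliable nodes.
chain = network_reliability(3, {(0, 1): 0.9, (1, 2): 0.9}, [1.0] * 3)
ring = network_reliability(
    3, {(0, 1): 0.9, (1, 2): 0.9, (0, 2): 0.9}, [1.0] * 3
)
```

In this toy case the redundant ring is more reliable than the chain built from the same link type, which matches the paper's point that adding links raises reliability. The enumeration grows exponentially with the number of links, so it is only suitable for small cases such as the ones discussed here.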

The Logical Topology Optimization.
After the physical topology is fixed, designers need to optimize the logical topology according to the design objectives. This paper pursues minimum transmission delay. The optimization of the logical topology of the information stream therefore means obtaining the minimum delay on the specific physical topology designed by the upper level. In this logical optimization problem, $T(S)$ is the overall delay of the logical topology $S$; $t(l_i)$ is the delay of link $l_i$; $t(n_j)$ is the delay of node $n_j$; the transmission path used to send data from one node to any other node is a subset of $G$; $\Phi$ is the set of all data transmission paths, here taken on $G_{\mathrm{opt}}$; $\psi_l$ is the functional relationship between link delay and link unit price; and $\psi_n$ is the functional relationship between node delay and node cost.
In the lower level, designers devise the way in which information is transmitted. Constraint (8) ensures that the logical topology optimization is based on the specific physical topology designed by the upper level. Constraints (9) and (10) describe the functional relationship between the delay and the cost of network equipment.

A Solution Algorithm for the Bilevel Train Communication Network Design Problem
The present paper uses genetic algorithms to solve the problem discussed above. The reasons for this choice are, first, that optimization problems in bilevel programming are hard to solve [15] and, second, that the optimization of a communication network's reliability, delay, and cost is also an NP-complete problem [14]. Genetic algorithms are capable of handling such problems and have been applied to the optimization of various nonlinear problems. A large body of research suggests that genetic algorithms can deal effectively with network optimization [16][17][18]. However, none of the cited studies discussed the physical and logical topology simultaneously.
Here, a solution for the CRD model based on the genetic algorithm is proposed: maximize the reliability under limited cost, pass the resulting network to the lower level, and finally obtain the logical topology with the minimum delay. Generally, the reliability of equipment increases with its cost, while its delay decreases as its cost increases. These relationships make it possible to convert the cost constraint of the lower level into a constraint of the upper level through equipment prices. The solution process can be described as follows.
Step 1. Set the initial parameters, including the number of nodes, the distance between nodes, the maximum cost, the node unit price, the node reliability, the link unit price, and the link reliability.
Step 2. Based on the initial parameters, create the initial genes, design the physical connections between nodes with the genetic algorithm, and generate the optimum solution under limited cost. After this, the physical topology design is complete.
Step 3. Determine whether the physical topology meets the actual requirements. If so, go to Step 4. If not, record the structure in the database of unsuitable solutions and go back to Step 2 to find other optimum solutions.
Step 4. Convert the gene of the best solution into the adjoint matrix that represents the physical topology and deliver it to the logical topology module.
Step 5. Find the communication paths with the minimum delay between arbitrary nodes. After this, the logical topology design is complete.
Step 6. Determine whether the logical topology meets the actual requirements. If so, the overall solution is complete. If not, record the connection in the database of unsuitable solutions and identify the reason. If the failure is due to the logical topology, go back to Step 4. If not, go back to Step 2.
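The control flow of Steps 2 to 6 can be sketched as a small driver loop. This is a simplified illustration, not the authors' implementation: the five callables are hypothetical placeholders for the modules named in the steps, and any rejection in Step 6 returns to Step 2, whereas the text also allows a return to Step 4.

```python
def design_network(propose_physical, physical_ok, to_adjacency,
                   shortest_paths, logical_ok, max_rounds=100):
    """Run the two-level design loop until both checks pass.

    All five callables are hypothetical stand-ins:
      propose_physical(rejected) -> gene       (Step 2, genetic algorithm)
      physical_ok(gene) -> bool                (Step 3)
      to_adjacency(gene) -> matrix             (Step 4)
      shortest_paths(matrix) -> paths          (Step 5)
      logical_ok(matrix, paths) -> bool        (Step 6)
    Returns (gene, matrix, paths), or None if no design is accepted.
    """
    rejected = []  # the "unsuitable solution database" of Steps 3 and 6
    for _ in range(max_rounds):
        gene = propose_physical(rejected)
        if not physical_ok(gene):
            rejected.append(gene)
            continue
        matrix = to_adjacency(gene)
        paths = shortest_paths(matrix)
        if logical_ok(matrix, paths):
            return gene, matrix, paths
        rejected.append(gene)
    return None


# Toy usage with stub modules: the first candidate fails Step 3.
candidates = iter(["bad", "good"])
result = design_network(
    propose_physical=lambda rejected: next(candidates),
    physical_ok=lambda gene: gene == "good",
    to_adjacency=lambda gene: [[0]],
    shortest_paths=lambda matrix: {},
    logical_ok=lambda matrix, paths: True,
)
```

Keeping each step behind its own callable mirrors the modular structure described in the text: a constraint can be changed in one module without touching the others.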
Several points need attention. First, the logical topology designed here is based on the assumption that the network bandwidth is much larger than the amount of data within the network. This is because the maximum bandwidth of the existing TCN is 1.5 Mbps, whereas it will increase to 100 Mbps with the application of Ethernet. Even if the existing control traffic increases tenfold, the network bandwidth is still about an order of magnitude larger than the data traffic. Second, the criterion in Step 3 covers constraints other than cost, for example, that no outlier is permitted and that link lengths are limited by the cables themselves. Third, the criterion in Step 6 is a complementary constraint that guards against failure of the assumption that the network bandwidth is much larger than the amount of data. If the data received by a node exceed its actual throughput, the topology is discarded.
Because the solution process is modular, adjusting a specific constraint is convenient and does not affect other modules. There are two key issues: the design of the genetic algorithm and the method for obtaining the minimum delay. Both are described below.

The Genetic Algorithm for Solving Physical Topology Optimization Problems.
The most outstanding feature of genetic algorithms is that they run multiple iterations, eliminate bad genes, and filter out the optimal solution. The coding structure of the gene and the choice of fitness function, as well as the crossover and mutation rules, have a significant impact on the results and efficiency of genetic algorithms. This paper defines the modules of the genetic algorithm that are applicable to this study. Because a great deal of research has already used genetic algorithms successfully for network reliability analysis, this paper shifts the focus to how to deliver the result of the genetic algorithm to the lower level for further calculation.

Gene-Coding Structure.
In solving the CRD model, one of the key issues is how to deliver information between the two levels, and an effective gene-coding structure is decisive for such delivery. The first step is to determine the length of the gene. For a train communication network that has $n_n$ nodes, the number of possible links $n_l$ is related to the number of nodes by $n_l = 0.5\,n_n(n_n - 1)$. Therefore, to describe the reliability of the whole network, every gene in this paper has $0.5(n_n - 1)n_n + n_n$ bits: the first $0.5(n_n - 1)n_n$ bits stand for the link reliability and the remaining $n_n$ bits for the node reliability.
For different nodes and links, different integers can be used to describe their reliability. For example, 1 means the most reliable equipment, 2 the next most reliable equipment, and so forth. If there are $k$ kinds of equipment, each bit of a gene takes a value in $[0, k]$, where 0 means that there is no link at all. For example, Figure 1 shows a network with 4 nodes and 5 links.
The numbers in Figure 1 stand for the reliability of corresponding equipment.Notice that the train communication network has a feature that differs from other communication networks: nodes in the network are linearly arrayed.This structure is the same as WTB in TCN [19].
Unlike traditional methods that apply the genetic algorithm to reliability problems, this paper not only pursues the optimal solution but also focuses on delivering the network structure from the upper level to the lower level. Because the gene-coding structure must represent the network structure, which is usually described by an adjoint matrix, the gene-coding structure is derived from the adjoint matrix. Figure 2 shows the relationship between the adjoint matrix $A$ and the gene $x$.
The first six numbers of the gene represent the six possible links, and the last four numbers represent the four nodes in Figure 1 from left to right. Because train communication networks are full-duplex networks, their adjoint matrices are symmetric.
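The conversion between gene and matrix can be sketched as follows. Because Figure 2 is not reproduced here, the ordering of the link bits is an assumption: the sketch reads the upper triangle of the symmetric matrix row by row. The function names and the example values are hypothetical and do not reproduce Figure 1.

```python
def adjacency_to_gene(adj, node_types):
    """Flatten a symmetric matrix into link bits followed by node bits.

    adj[i][j] holds the link type between nodes i and j (0 = no link).
    Assumption: link bits follow the upper triangle in row-major order.
    """
    n = len(adj)
    link_bits = [adj[i][j] for i in range(n) for j in range(i + 1, n)]
    return link_bits + list(node_types)


def gene_to_adjacency(gene, n):
    """Rebuild the symmetric matrix and the node types from a gene."""
    n_links = n * (n - 1) // 2
    if len(gene) != n_links + n:
        raise ValueError("gene length must be n(n-1)/2 + n")
    adj = [[0] * n for _ in range(n)]
    k = 0
    for i in range(n):
        for j in range(i + 1, n):
            adj[i][j] = adj[j][i] = gene[k]  # full duplex: symmetric
            k += 1
    return adj, list(gene[n_links:])


# Hypothetical four-node network with five links (node pair 0-3 unlinked).
example_gene = [1, 2, 0, 1, 3, 1, 1, 1, 2, 1]
example_adj, example_nodes = gene_to_adjacency(example_gene, 4)
```

For four nodes the gene has 6 link bits and 4 node bits, 10 in total, which agrees with the layout described in the text.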

The Fitness Function.
The objective of the optimization here is to find, under limited cost, the most reliable physical topology. Therefore, two elements are necessary in the fitness function: cost and reliability. In the operation of genetic algorithms, good solutions on the optimal boundary are often the result of breeding between a feasible solution and an infeasible solution. As a result, when the fitness function is set, simply excluding infeasible solutions should be avoided. The right way is to set an effective penalty function that reduces the proportion of infeasible solutions in the population. Practical experience indicates that a fitness function built from the following quantities has good computational efficiency. In this function, $C$ is the maximum cost. The condition $\mathrm{count}(\mathrm{rea}(x)) \neq 0$ refers to the number of zeros in the reachability matrix $\mathrm{rea}(x)$ of $x$; it flags networks that contain an outlier, so that the final network has none. $R(x)$ is the system reliability. $c(x)$ is the actual cost of the system. $\delta$ is the penalty factor, which is chosen below a bound determined by $R(x)$. According to actual conditions, the recommended value of $\delta$ ranges from 0.05 to 0.15.

Figure 2: The relationship between adjoint matrix and gene.
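The penalty approach of this subsection can be illustrated with a short sketch. The exact fitness expression is not reproduced in this text, so the form below is an assumption: fitness is minimized and equals the negative reliability for feasible designs (the negative optimum values reported in Example 1 suggest this sign convention), over-budget designs are scaled by the penalty factor, and disconnected designs receive the worst value. The function and parameter names are hypothetical.

```python
def penalized_fitness(reliability, cost, max_cost, connected, delta=0.1):
    """Fitness to be minimized by the genetic algorithm (assumed form).

    reliability: system reliability of the candidate, in [0, 1].
    cost, max_cost: actual cost and cost ceiling.
    connected: False if the reachability matrix contains a zero (outlier).
    delta: penalty factor; the text recommends 0.05 to 0.15.
    """
    if not connected:
        return 0.0                      # worst value: outliers are not allowed
    if cost > max_cost:
        return -delta * reliability     # kept in the population, but penalized
    return -reliability                 # feasible: more reliable is better


feasible = penalized_fitness(0.8, cost=100, max_cost=200, connected=True)
over_budget = penalized_fitness(0.9, cost=300, max_cost=200, connected=True)
isolated = penalized_fitness(0.9, cost=100, max_cost=200, connected=False)
```

With this form an infeasible candidate still carries a graded fitness value, so it can take part in breeding instead of being excluded outright.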

Genetic Operators.
The efficiency of genetic algorithms depends on the population size, the selection method, the crossover and mutation operators, and the stopping criteria. After a large number of experiments, the following parameters were found to be the most appropriate for this paper. The population size is 400, and each gene in the population is limited to an integer from 1 to $k$. The selection method is stochastic uniform, and one-point crossover is used with a crossover probability of 0.8. Uniform mutation is applied with a rate of 0.03. The termination criterion is 500 generations.
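The two variation operators can be sketched in a few lines. This is a generic illustration of one-point crossover and uniform mutation with the rates quoted in the text, not the authors' implementation. The function names and the explicit `rng` argument are assumptions of the sketch, and the selection step is omitted.

```python
import random


def one_point_crossover(parent_a, parent_b, rng, p_cross=0.8):
    """Swap the tails of two equal-length genes at one random cut point."""
    if rng.random() >= p_cross or len(parent_a) < 2:
        return parent_a[:], parent_b[:]          # no crossover: copy parents
    cut = rng.randrange(1, len(parent_a))        # cut strictly inside the gene
    child_a = parent_a[:cut] + parent_b[cut:]
    child_b = parent_b[:cut] + parent_a[cut:]
    return child_a, child_b


def uniform_mutation(gene, k, rng, rate=0.03):
    """Replace each bit, with probability `rate`, by a random integer in 1..k."""
    return [rng.randint(1, k) if rng.random() < rate else bit for bit in gene]


rng = random.Random(0)                            # fixed seed for repeatability
a = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
b = [3, 3, 3, 3, 3, 3, 3, 3, 3, 3]
child_a, child_b = one_point_crossover(a, b, rng, p_cross=1.0)
mutant = uniform_mutation(a, k=3, rng=rng)
```

A full run would repeat selection, crossover, and mutation on a population of 400 genes for 500 generations, as specified in the text.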

The Algorithm for Solving Logical Topology Optimization Problems.
Assume that, in a train communication network, the minimum delay depends only on the shortest path between nodes and that this path is independent of link length. In other words, the time needed to deliver a message is affected only by the number of nodes that the information passes through. This assumption is justified as follows. Whatever the architecture of the monitoring network, the overall transmission delay $T$ can be described as

$$T = T_{td} + T_{pd} + T_{sd} + T_{qd},$$

where $T_{td}$ is the transmission delay, $T_{pd}$ is the processing delay, $T_{sd}$ is the sending delay, and $T_{qd}$ is the queuing delay. Among the four parameters, $T_{td}$ is related to links, while the remaining three are related to nodes. The delay of links and nodes can be tested separately. The tested node is a dedicated Ethernet switch for trains that conforms to the EN50155 standard [20]. The tested link is an Unshielded Twisted Pair 6/100Base-TX that conforms to the TIA/EIA-568B.2-1 standard [21]. The test process complies with the RFC-2544 standard [22]. The node delay is shown in Table 1 and the link delay is presented in Table 2.
The delay in Table 1 refers to the period between the moment the first bit of a data frame enters a node and the moment the last bit leaves the node. During testing, the node's load was 50% of its limit. The result reflects the average node delay for various data frames.
The delay in Table 2 refers to the period between the moment the first bit of a data frame enters a link and the moment the last bit leaves the link. The communication rate during testing was 100 Mbps. The result reflects the average link delay for various link lengths.
The test results show that the delay of links is trivial compared with the delay of nodes. Therefore, the transmission delay can be ignored. Additionally, because the sending delay and the queuing delay are products of forwarding, the optimization of delay concerns how many times specific information has been forwarded, that is, how many nodes it has passed.
As the objective of this paper is to find the shortest communication path from one node to another in a train communication system, which is an all-pairs shortest path problem, we apply the Floyd-Warshall algorithm. This algorithm has proven very useful for solving shortest path problems in networks [23][24][25]. The key issue here is

Numerical Examples
In the present section, two examples are presented to verify the accuracy and the effectiveness of the CRD model; Example 2 addresses a practical problem.
Example 1. First, assume that there are three kinds of links and three types of nodes with different reliability and unit cost. The parameters are given in Tables 3 and 4.
There are four nodes in the network, arranged in a line, and the node intervals are 40 meters. According to the calculations, building the network with the lowest reliability, as illustrated in Figure 3(a), requires the minimum cost of 6,960. In contrast, building the network with the highest reliability, as presented in Figure 3(b), requires the maximum cost of 18,000. Note that the broken line in Figure 3(b) does not mean that the actual length increases; it is drawn only for clarity.
Set the cost limit to 6,960 and to 18,000 in turn, and set the penalty factor to 0.1. Run the proposed algorithm and check whether the outcome is accurate. The results are shown in Figure 4.
The figure presents the outcome of the genetic algorithm and the shortest communication path result of logical topology optimization.As can be seen from the figure, when the cost constraint is 6,960, we get the optimal result at about the 100th iteration, and the optimal solution is about −0.2202.
When the cost constraint is 18,000, we get the optimal result at about the 10th iteration and the optimal solution is about −0.8147.The numbers and dashes under Figure 4 stand for the optimal path results.For example, "A-B-C-D" means the optimal path for delivery from node A to D is to pass node B and node C. It can be reasonably concluded that both levels of the proposed bilevel programming algorithm can yield feasible solutions.
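The path notation can be reproduced with a small Floyd-Warshall sketch that counts forwarding hops, in line with the assumption that delay depends only on the number of nodes passed. The four-node chain used below is an assumed stand-in for a linear topology and is not taken from Figure 3 or Figure 4. The function names are hypothetical.

```python
def floyd_warshall_hops(adj):
    """All-pairs minimum hop counts with next-hop table for path recovery.

    adj is a symmetric matrix; any nonzero entry means a link exists.
    Every link has weight 1, so distance equals the number of hops.
    """
    n = len(adj)
    inf = float("inf")
    dist = [[0 if i == j else (1 if adj[i][j] else inf) for j in range(n)]
            for i in range(n)]
    nxt = [[j if adj[i][j] else None for j in range(n)] for i in range(n)]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
                    nxt[i][j] = nxt[i][k]
    return dist, nxt


def recover_path(nxt, src, dst):
    """Return the node sequence from src to dst, or [] if unreachable."""
    if src == dst:
        return [src]
    if nxt[src][dst] is None:
        return []
    route = [src]
    while src != dst:
        src = nxt[src][dst]
        route.append(src)
    return route


# Assumed linear chain A-B-C-D (indices 0..3).
chain = [[0, 1, 0, 0],
         [1, 0, 1, 0],
         [0, 1, 0, 1],
         [0, 0, 1, 0]]
dist, nxt = floyd_warshall_hops(chain)
label = "-".join("ABCD"[i] for i in recover_path(nxt, 0, 3))
```

On the chain, the route from A to D passes B and C. Adding a direct A-D link to the matrix would shorten the same route to a single hop, which is how extra physical links reduce delay in the lower level.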
Example 2. Consider a train with six carriages in which one node in each carriage needs to be connected with the nodes in other carriages. This structure has been widely applied in actual metro trains [26]. Because each carriage is 25 meters long on average, and considering the actual wiring, wiring between adjacent carriages requires a 30-meter-long cable. Because of signal attenuation, the maximum link length in the network cannot exceed 100 meters. The cost and the reliability are given in Tables 3 and 4. Set the cost constraint to 14,500 at minimum and 22,000 at maximum. Under these conditions, find the optimal communication scheme with the highest reliability and the shortest delay.
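The length limit restricts which node pairs can be linked at all. The sketch below enumerates the candidate links for this example under one assumption that is not stated in the text: a link spanning several carriages needs 30 meters of cable per carriage gap. The function name is hypothetical.

```python
def feasible_links(n_carriages=6, cable_per_gap_m=30, max_length_m=100):
    """List node pairs whose cable length stays within the limit.

    Assumption: a link between carriages i and j needs
    (j - i) * cable_per_gap_m meters of cable.
    """
    return [(i, j)
            for i in range(n_carriages)
            for j in range(i + 1, n_carriages)
            if (j - i) * cable_per_gap_m <= max_length_m]


links = feasible_links()
all_pairs = 6 * 5 // 2          # 15 possible links between six nodes
```

Under this assumption a link can span at most three carriage gaps (90 meters), which leaves 12 of the 15 possible links as candidates. The remaining three would fail the length criterion checked in Step 3 of the solution process.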
Run the CRD model and obtain the optimal solutions of nodes' connection, as shown in Table 5 and Figure 5.
Compare the optimization results for these two sets of conditions: when the cost increases, the reliability of the train communication network increases. Meanwhile, as the number of links increases, the delay of the network decreases. The analysis indicates that the right way to obtain a network with reasonably high reliability and low delay is to add links that form redundant connections, rather than to rely on equipment with high reliability.
The example shows that the CRD model can optimize the physical and the logical topology of a network simultaneously, which changes the traditional approach, reduces the repeated input of parameters, and improves working efficiency. The verification also shows that the CRD model can find the most reliable links under a cost constraint and the transmission scheme with minimum delay, so that it satisfies the ultimate optimization objective. Therefore, the CRD model is feasible and effective for solving practical problems. However, we did not consider the variation of reliability when a link passes between carriages, which should be an additional condition for logical topology design. Although this does not affect the cost, reliability, and delay in the proposed algorithm, it does influence the choice of equipment during construction. This will be a direction for future research.

Conclusion
In this paper, a general model is proposed to describe the design problem of train communication networks. The design task is divided into two processes: physical topology optimization and logical topology optimization. The focus is on designing train communication networks under a cost constraint. Drawing on bilevel programming theory, this paper proposes the CRD model, which balances cost, reliability, and delay, to address the design problem. The upper level of the CRD model deals with obtaining the network with maximum reliability under a cost constraint. The lower level deals with obtaining the minimum delay on the physical topology designed by the upper level. The application of the CRD model shows that the cost of the network is proportional to reliability and inversely proportional to delay. In addition, this paper puts forward a solution for the CRD model that combines the genetic algorithm and the Floyd-Warshall algorithm. With a proper configuration, the solution can encode network structures comprehensively as genes and couple the two algorithms effectively. Finally, the accuracy and the effectiveness of the CRD model and its solution are verified by an example, and the CRD model is also applied to a metro train. The results show that the CRD model offers useful guidance and can provide a theoretical foundation for construction projects. Future research will emphasize the weight of nodes in network reliability.

Figure 1: A simple train communication network.

Figure 3: Sample networks with four nodes.

Figure 4: The results of Example 1.

Figure 5: The physical links of Example 2.

2.1. A Basic Description of the Design Problem of Train Communication Networks. The design problem of train communication networks (DPTCN) is to seek the most appropriate network by adjusting the physical and logical topology of nodes and links. As communication networks on trains are bidirectional, we model the physical topology with a graph $G$; that is, $G = [N, L, A]$, where $N$ stands for the nodes in a network, $L$ represents the links therein, and $A$ is the adjoint (adjacency) matrix describing the relationship between the nodes and links of $G$. The optimization of the physical topology of a train communication network means finding the optimal solution $G_{\mathrm{opt}}$ of $G$, with the constraints $G = [G_1, G_2, G_3, \ldots]$ and $G_{\mathrm{opt}} \in G$. Correspondingly, we use another graph $S$ to model the logical topology; that is, $S = [N, L, A, L^*, A^*]$, in which $L^*$ denotes the actual links required by the data stream and $A^*$ is the adjoint matrix of the network data stream. The optimization of the logical topology means finding the optimal solution $S_{\mathrm{opt}}$ of $S$, with the constraints $S = [S_1, S_2, S_3, \ldots]$ and $S_{\mathrm{opt}} \in S$.

Table 3: The reliability and cost of link types.

Table 4: The reliability and cost of node types.

Table 5: The optimal solutions of Example 2.