Optimizing High-Speed Railroad Timetable with Passenger and Station Service Demands : A Case Study in the Wuhan-Guangzhou Corridor

This paper aims to optimize high-speed railroad timetables for a corridor. We propose an integer programming model using a time-space network-based approach to consider passenger service demands, train scheduling, and station service demands simultaneously. A modified branch-and-price algorithm is used for the computation. This algorithm solves the linear relaxation of all nodes in a branch-and-bound tree using a column generation algorithm to derive a lower-bound value (LB) and derive an upper-bound value (UB) using a rapid branching strategy. The optimal solution is derived by iteratively updating the upperand lower-bound values.Three acceleration strategies, namely, initial solution iteration, delayed constraints, and column removal, were designed to accelerate the computation. The effectiveness and efficiency of the proposed model and algorithm were tested using Wuhan-Guangzhou high-speed railroad data. The results show that the proposed model and algorithm can quickly reduce the defined cost function by 38.2% and improve the average travel speed by 10.7 km/h, which indicates that our proposed model and algorithm can effectively improve the quality of a constructed train timetable and the travel efficiency for passengers.


Introduction
A train timetable is a technical document that describes the operation of trains on certain railroad sections and the trains' departure and arrival times or passing times at stations.With the rapid development of high-speed railroads in China and various other countries during the last decade, the train timetabling problem (TTP) for high-speed railroads has become a new challenge for railway operators.The TTP for high-speed railroads differs from the traditional TTP in several ways: (1) The mismatch between transportation demand and capacity causes the full use of the transportation capacity of railroads and the minimization of travel times of trains to be the main goals for the traditional TTP.In general, the TTP for high-speed railroads focuses on the maximal satisfaction of passenger travel demands and the improvement in customer service quality.Due to the high speed and high density of the high-speed railroad, passengers may have more choices of when to travel.Thus, the passenger flow of high-speed railroads shows distinct volatility in different periods.To satisfy the volatility characteristics of high-speed railroad passengers, we set a feasible departure time range for each train at its starting station according to the train service plan, as mentioned in Section 2.3.
(2) Passenger trains and freight trains run on traditional railroads, where the passenger trains have higher priority than freight trains.Thus, passenger trains may overtake freight trains at some stations.Medium-and high-speed passenger trains run on high-speed railroads, where mediumspeed trains and high-speed trains have the same priority.Thus, the number of times overtaking occurs on high-speed railroads may be less than that of traditional railroads.We propose a punishment coefficient to control the number of occurrences on high-speed railroads in Section 3.1.
(3) The maintenance window for high-speed railroads is generally set at night, and maintenance requires a considerable length of time.The skylight for traditional railroads is more flexible and requires a shorter length of time.
The TTP has received considerable attention in recent decades as a fundamental aspect of railroad transportation management.Many researchers have explored train scheduling.Frank [1] was one of the earliest researchers to conduct a mathematical analysis of train timetables.He began to explore train scheduling for two-way railroad operations on a single track in 1966.Szpigel [2] converted the singletrack TTP into a job scheduling problem and proposed a mixed-integer programming model, in which minimizing the total travel time was defined as the goal, which was subject to interval time constraints related to train overtaking or crossing.A branch-and-bound algorithm was employed for the computation of this model; however, it could be applied to only small-scale computations.Serafini and Ukovich [3] proposed the periodic event scheduling problem (PESP) in 1989, after which Schrijver and Steenbeek [4] applied the PESP to solve the TTP and pioneered research on periodic timetable optimization problems [5][6][7][8][9][10][11].Periodic timetables help passengers memorize accurate departure times and effectively reduce the scale of the computation.Currently, periodic timetabling is the preferred choice of many railroad enterprises.Brännlund et al. [12] proposed a 0-1 integer programming model based on the discretization of time, the goal of maximizing the benefits for all trains, and the need to satisfy track capacity constraints.The Lagrange relaxation algorithm was employed for the computation of this model.Caprara et al. [13] explored periodic timetables based on graph theory and constructed an integer programming model for which the Lagrange relaxation algorithm was also used for computation.This model can be effectively applied to large-scale computations.Zhou and Zhong [14] explored the single-track TTP for conditions of limited track resources with the goal of minimizing the total travel time with safety headway constraints.Cacchiani et al. [15] proposed a time-space network-based optimization model for the TTP using a full timetable for a train as a variable.A column generation algorithm was utilized for the computation of this model.He et al. [16] explored the high-speed railroad TTP based on a time-space network model and the branch-andprice algorithm.They implemented a punishment value to reduce the deviations of train start times and the number of occurrences of train overtaking.They also implemented a strategy of increasing the amount of stopping at stations to increase the possibility of train overtaking but they did not consider the strategy of reducing the amount of stopping at stations to shorten passenger waiting times.These studies have yielded useful methods for the optimization of the TTP.However, due to the hierarchical optimization of traditional transportation management processes, passenger demands were not directly considered in these studies.
Another subset of previous studies has focused on passenger demands.Ceder [17] proposed an optimization framework for constructing bus timetables using passenger loading data.This framework can be used to synchronize the departure times of vehicles under dynamic passenger demands.Peeters and Kroon [18] employed the branch-andprice algorithm to solve the rolling stock scheduling problem for a given train timetable and passenger seat demands.Zhou et al. [19] constructed a bilevel programming model to optimize a passenger train operation plan and diagram with the goal of maximizing the profits of the railroad enterprise.Passengers can determine whether to travel by train and can select their transfer schemes based on ticket prices, departure and arrival times, switching times, and congestion charges.The railroad enterprise can improve its train operation plan and diagram based on passenger demands.Kaspi and Raviv [20] proposed a model for comprehensively optimizing a train operation plan and diagram with the goal of minimizing the train operation costs and passenger travel times based on time-varying passenger demands.This model did not have any train capacity constraints; this assumption reduced the computational complexity but yielded results that were inconsistent with the actual situation.Canca et al. [21] proposed a nonlinear integer programming model with the goal of minimizing passenger waiting times and the operational costs for the railroad enterprise.In this model, the departure and arrival times were determined based on dynamic passenger demands.Wang et al. [22] proposed an event-driven model that included departure events, arrival events, and events that correspond to changes in passenger arrival rates.They also constructed a nonlinear programming model with the multiple goals of minimizing the total passenger travel time and optimizing the energy consumption based on passenger transfer behavior in a railroad network.High-speed railroad operations enable highly dense train schedules with the ability to satisfy time-varying passenger demands.Niu et al. [23] proposed a nonlinear mixedinteger programming model based on time-varying origindestination (OD) passenger demand matrices and explored the high-speed railroad TTP with the goal of minimizing passenger waiting times.Yue et al. [24] proposed a mathematical model for optimizing high-speed railroad timetables to simultaneously consider both passenger service demands and train scheduling.Lagrangian relaxation and a column generation algorithm were employed for computation.They verified the effectiveness of their model and algorithm via a case study of the Beijing-Shanghai high-speed railroad.
Stations are important nodes in a high-speed railroad transportation network; they are the locations of events, such as passenger boarding, alighting, and transfer.Therefore, some researchers have studied the simultaneous optimization of passenger and station service demands with the main goal of optimizing the train stop schedule plan.Suh et al. [25] implemented a skip-stop strategy for subway service planning based on OD matrix information, distances between stations, headways, and maximum link speeds.Deng et al. [26] proposed a bilevel programming model for train stop schedule planning based on the passenger transfer demands, the level of stations at which trains stop, the operational capacities of the train stations, and other factors.The objective function of the upper programming model included the generalized minimum travel cost and the number of train stops, whereas the lower programming model was a passenger flow assignment model with multiclass user equilibrium based on the train stop schedule.Zheng et al. [27] constructed a 0-1 integer programming model that considered skip-stop patterns, with the goal of minimizing the total passenger travel time.The tabu search algorithm was employed for computation.al. [28] analyzed problems encountered with conventional train stop schedules and proposed a high-speed railroad stop schedule optimization model, in which the main constraints were related to the node service frequency, the interstation service accessibility, and the number of one-train stops, and the goal of optimization was to minimize the total number of stops at stations.They also quantitatively analyzed the effects of various factors on the construction of train stop schedule plans.Current research is primarily focused on the station stop schedules and does not consider the simultaneous optimization of train timetables.
We propose a new high-speed railroad train timetable optimization method using a time-space network-based approach and a modified branch-and-price algorithm to simultaneously consider passenger service demands, train scheduling, and station service demands.Figure 1 presents an overview of the method.The input data include the timetable parameters, passenger service data, and station service data.
First, the timetable parameters, such as the section travel times and dwell times at stations, are acquired from the existing diagram.Second, the service frequency of each station OD pair is inferred from the average daily passenger flow data provided by the passenger ticketing department.Last, the number of sidetracks at each station is determined based on the station graph.The stations are divided into nodes of different grades in accordance with the size, geographic location, and passenger flow data of each station.The service frequency of each station is determined according to its node grade.We propose a new integer programming model with constraints related to OD service frequency, station service frequency, and various operation time standards, with the goal of minimizing the deviation from the predefined train start times, the total number of train stops, and the dwell time at stations.The details regarding the modified branchand-price algorithm and the three acceleration strategies developed for this algorithm are discussed in Section 4 of this paper.The output is a near-global-optimal train timetable.
The four main contributions of our study can be summarized as follows: (1) We propose a new time-space network method to represent a train stop schedule plan and the comprehensive optimization of the train timetable.We use an associated labeling method to address the corresponding challenge of unconnected arc segments.
(2) We propose a new integer programming model that simultaneously considers passenger service demands, train scheduling, and station service demands.
(3) To calculate the pricing problem (PP), we employ a new method based on the A-star (A * ) algorithm and the Shortest-Path Faster Algorithm (SPFA) to search for the shortest path in the unconnected and dynamic time-space network.
(4) We propose a modified branch-and-price algorithm and design three acceleration strategies for this algorithm: initial solution iteration, delayed constraints, and column removal.These algorithms and strategies can be used to quickly solve a very-large-scale TTP.
The remainder of this paper is organized as follows: In Section 2, we elaborate on the research problem and introduce the construction of the time-space network.In Section 3, we describe the new integer programming model.In Section 4, we discuss the extended branch-and-price algorithm and the three acceleration strategies.Section 5 presents a verification of the effectiveness of the proposed algorithm and model using data from the Wuhan-Guangzhou high-speed railroad.In Section 6, we discuss our conclusions and offer suggestions for future studies.

Problem Statement
This paper considers the high-speed TTP along one train direction of a two-direction railroad line under passenger demands, where the stations are sequentially numbered as 1, 2, ⋅ ⋅ ⋅ ,   .Let  = {1, 2, ⋅ ⋅ ⋅ ,   } be the set of stations.The set  = {1, 2, ⋅ ⋅ ⋅ ,   } of trains travel along the line from the predetermined starting stations to the predetermined ending stations according to the train service plan.Let  ∈  be the index of times, where  = {0, 1, ⋅ ⋅ ⋅ ,   } is the planning time horizon with equal one-minute intervals.

OD Service
Frequency.The OD service frequency, which is denoted by  ,  , is the minimum number of direct trains that serve passengers who travel between the stations of an OD pair.The relationship between the OD service frequency and the passenger flow data can be expressed by the following equation: where  ,  represents the average number of passengers who arrive at station  ∈  to travel to station   ∈  :   ̸ =  daily on the line, according to the passenger ticketing department;   is the capacity of train  ∈ , i.e., the maximum number of passengers accommodated by train ; and  ,  is the train seat occupation rate between station  and station   .
Figure 2 lists the OD service frequencies of a line.Light red shading indicates OD sections with lower service frequencies.These sections have poor travel access, and passengers have few travel options.Yellow shading indicates OD sections with zero service frequency.Passengers on these OD sections must switch stations.As shown in Figure 2, the OD service frequencies related to station K are low; this situation is attributed to the low grade of the station and its low passenger traffic.A low OD service frequency also causes low attractiveness to passengers from sections related to station K, which inhibits an increase in passenger traffic.Therefore, a reasonable OD service frequency is an important condition for guaranteeing accessible services among stations and the satisfaction of passenger travel demands.

Station Service Frequency.
Station service frequency is the number of trains that stop at and start from a station during a certain period [28].In actual calculations, the grade of a station is determined based on various factors, such as political, economic, and cultural factors of the city in which the station is located and its attractiveness to passenger traffic, infrastructure, and operational capability.The service frequency of a station is determined in accordance with its grade.

Predefined Start Times.
In theory, the start time of a train can have any value within a possible start time range.In practice, reasonable start time ranges can ensure that the train timetable exhibits reasonable performance in terms of the evenness of the train distribution and the satisfaction of the time-varying passenger demands.Suitable start time ranges can also tighten the feasible space of the resulting optimization models and reduce the search time.The start time range for a train can be determined based on the experience of the planners or acquired from existing timetables [23].

Description of a Train Timetable in the Form of a Time-Space Network.
A description of a train timetable in the form of a time-space network can accurately reflect various events and their occurrence times during train operations [16].A time-space network is the combination of the physical network and the time horizon.We use Figure 3 to demonstrate the time-space network formulation, in which the physical route of train  consists of four nodes/stations and three links/sections.The horizontal axis corresponds to discrete time, where the time horizon is represented as [,  + 19] with equal one-minute intervals.The vertical axis corresponds to space and represents stations and sections.The time-space vertex represents the arrival or departure state of train  at a physical node at the current time, and the time-space arc represents the movement of train  in stations or sections with the entering and leaving times.
The time-space network starts from the virtual departure node .Different virtual departure arcs are generated based on the feasible start time range.We define the section connection arcs to imply the section travel times of train .When train  passes by a station, we define the station passing arcs to imply it.When train  stops at a station, additional starting and stopping times and dwell times are incurred.
Additional starting and stopping times are calculated in accordance with the stop statuses of train  at the two adjacent stations and added to the section connection arcs.The dwell times at stations are required to satisfy certain maximum and minimum dwell time requirements.The purpose of imposing a maximum dwell time is to tighten the feasible space of the optimization model and produce a more reasonable train timetable.The minimum dwell time is the minimum time required for passenger boarding, alighting, transfer, and other operations.We define the station dwell arcs to imply the minimum dwell times of train  at stations.Any additional dwell time beyond the minimum dwell time is represented by one or more additional waiting arcs.The time duration of each additional waiting arc is fixed to one minute in this paper.The network ends at the virtual arrival node .Thus, the path from  to  in the time-space network corresponds to a feasible timetable for train .Each train  has multiple paths that constitute the path set   ⊂ , where  is the set of all paths.
For example, the path  ∈   , as shown by the black solid lines in Figure 3, starts from station A at time  + 2.
Traveling with the section connection arc, which is subject to the section pure travel time and additional starting and stopping times, train  arrives at station B at time  + 5.After dwelling for one time interval, train  departs from station B at time  + 6. Train  passes by station C at time  + 9 after traveling with the section connection arc between station B and station C, which is subject to the section pure travel time and additional starting times.Likewise, train  reaches the destination at time +11.In this manner, the train scheduling problem can be transformed into a routing problem in the time-space network.
We redesign the train stop schedule plan when optimizing the train timetable, unlike in the traditional TTP, in which the train stop pattern is predetermined.Thus, both station passing arcs and station dwell arcs are generated for the train path at every station.During the optimization process, the train can choose to either stop at or pass by a station, depending on demand.For stations at which trains must stop for technical reasons, only station dwell arcs and additional waiting arcs are generated.The existence of additional starting and stopping times and the variable stop pattern produce unconnected arcs in the time-space network.To address this issue, we add an associated label (   ,    ) to each arc  ∈ , where  is the set of arcs.When  is a section arc,    indicates whether  stops at the ending station of this section, where a value of 1 represents yes and a value of 0 represents no, and    indicates whether  stops at the starting station of that section.When  is a station arc,    indicates whether  stops at this station, and    indicates whether  stops at the preceding adjacent station.
As shown in Figure 4, neither C nor D is a must-stop station.The red path passes by station C. If the red path chooses to pass by station D (as on path ), the section operation arc is calculated based on the pure section travel time.The associated label is (0,0).If the red path chooses to stop at station D (as on path   ), the section operation arc is calculated using the pure section travel time and the additional stopping time at station D. The associated label is (1,0).The blue path stops at station C. If the blue path chooses to pass by station D (as on path ), the section operation arc is calculated using the pure section travel time and the additional starting time at station C. The associated label is (0,1).If the blue path chooses to stop at station D (as on path   ), the section operation arc is calculated using the pure section travel time, the additional starting time at station C, and the additional stopping time at station D. The associated label is (1,1).When selecting a time-space path, we choose the next arc from the subsequent connected arcs for which the    label is equal to the    label of the current arc.

Model Construction
3.1.Objective Function.We transfer the costs of the timespace nodes to the adjacent arcs.Thus, the cost of a time-space path is equal to the total cost of the included arcs.We define the objective function with the goal of minimizing this cost.Let   be the predefined preferred departure time of train  from its starting station.  is the actual departure time from the starting station on path  . is the halfwidth of the permitted departure time range at the starting station.   and    are the earliest departure time and latest departure time of train  from its starting station.As noted in Section 2.3, the preferred start times of the trains are predefined.The permitted variations of the departure time fall in the range [  −,   +].The cost of a virtual departure arc in this range is zero.Paths that lie outside this range are allowed but the punishment coefficient,   1 /min, is added to achieve maximum control of the deviation of the departure time.The cost of the deviation is calculated and added to the cost of the virtual departure arc.In the example shown in Figure 5, the actual departure time of path  ∈   falls outside the allowed fluctuation range, and the cost of the corresponding virtual departure arc is The stopping of a train at a station will cause an increase in the passenger travel time because it generates a dwell time and additional starting and stopping times.Our goal is to reduce the number of stops and the dwell time.The number of stops is associated with the punishment coefficient   2 /.The punishment coefficient for the dwell time   3 /min includes the punishment for both the station dwell arc and any corresponding additional waiting arcs.
A punishment of   4 /min is applied for additional waiting arcs to reduce but not preclude the occurrence of train overtaking.
The costs of section arcs, station passing arcs, and virtual arrival arcs are 0.
The cost   of a virtual departure arc is   is the station dwell arc set.The cost    of a station dwell arc (∀ ∈   ) is where    and    are the start time and end time of arc .  is the station additional waiting arc set.The cost    of an additional waiting arc (∀ ∈   ) is The cost   of a time-space path  is The objective function is represented as where the decision variable   indicates whether a certain path  appears in the optimal solution.It is equal to 1 if and only if path  is selected in the optimal solution and is equal to 0 otherwise.

Constraints. The network flow constraint,
represents the selection of only one path by each train.The arrival safety headway constraint, represents a constraint on the safety headway for consecutive arrivals of trains that travel in the same direction. ∈  is the index of sections, where  is the set of sections.  ∈  sec  is the index of arcs, where  sec  is the section arc set.() and () are the starting station and ending station of section .   is the binary parameter, which is equal to 1 if arc  belongs to path  and is equal to 0 otherwise.  () is the safety headway for trains that arrive at station ().
The departure safety headway constraint, represents a constraint on the safety headway for consecutive departures of trains that travel in the same direction.  () is the safety headway for trains that depart from station ().
The overtaking constraint, represents the constraint that trains are not allowed to overtake other trains that travel in the same direction on a section.
The station capacity constraint is where   is the number of sidetracks at station  and    is a set of virtual station arcs, including station passing arcs, station dwell arcs, and additional waiting arcs.   is used to represent all possible operations of a train with respect to station .The start time of an arc is the time at which the train enters the station from the previous section, and the end time of the arc is the time at which the train leaves the station and enters the next section.
The OD service frequency constraint, represents the constraint that the number of trains that serve a given station OD pair should satisfy the passenger service demand of this pair. ,   is the coupled-stop index, which is equal to 1 if path  stops at both station  and station   and is equal to 0 otherwise.The station service frequency constraint, represents the constraint that the number of trains that stop and start at a given station during a certain period should satisfy the service frequency requirement of the node to which this station belongs.  is the station service frequency of station , which corresponds to the station grade, and    is the stop index, which is equal to 1 if path  stops at station  and is equal to 0 otherwise.
The departure time constraint is The station dwell time constraint is where  max where  sec  is the pure section travel time of section , and    and    are the additional starting time and stopping time of section .
As shown in Figure 6, during the generation of the section arcs, the additional starting time and stopping time depend on the stopping statuses of the train at the two adjacent stations.These times are determined based on the results of the traction calculations.
The station travel time constraint is In the time-space network, the start time and end time of a station passing arc are equivalent.The end time of a station dwell arc is the sum of the start time and the minimum dwell time at this station.Any additional dwell time beyond the minimum dwell time is represented by one or more additional waiting arcs.
The decision variables are Model M1 (a generic model with passenger service demands and station service demands) consists of the objective function (6), which is subject to constraints (7)- (18).Constraints ( 14)-( 17) are used to construct the time-space network.By limiting the departure time range at the starting station [( 14)] and the maximum and minimum station dwell times [( 15)], redundant time-space nodes and arcs can be effectively reduced.By defining the station dwell arcs and additional waiting arcs to imply the station dwell times [( 17)], the number of station arcs can be effectively reduced.Thus, the complexity of the model and the scale of the time-space network are manageable.Model M1 is an integer programming model with a linear objective function and linear constraints.We apply an extended branch-and-price algorithm to solve this model.

Extended Branch-and-Price Algorithm
As the numbers of stations and trains increase, the computational scale of model M1 also increases.The problem becomes an integer linear programming problem on an extremely large scale.The number of generated feasible train paths, i.e., the number of decision variables, exceeds the number of constraints.Therefore, we propose a modified branch-andprice algorithm to solve this problem.
The branch-and-price algorithm employs a column generation algorithm to solve the linear relaxation of the nodes in the entire branch-and-bound tree, derives a reasonable lowerbound value for the original problem, and uses the branchand-bound algorithm to solve the integer programming problem.The underlying idea of the column generation algorithm is to reduce the enumeration of all columns to only one set of feasible paths for the original problem in terms of a large-scale linear problem whose scale is considerably smaller than that of the original problem.A restricted master problem (RMP) is generated.Dual variables are derived by solving the RMP.Better paths are generated based on the PP and are added to the RMP.The algorithm iteratively alternates between the PP and the RMP until the best solution to the linear programming problem has been derived.

Restricted Master Problem.
Model M1 is an integer linear programming model.In particular, ( 14)-( 17) are used to construct the time-space network and are not applied to the RMP.The integer decision variables in (18) are relaxed to continuous variables, and the subset  0 is selected from the path set .

Pricing Problem.
Solving the PP is a key step of the branch-and-price algorithm.The purpose is to seek the feasible solution that can reduce the objective function, by adding the column with the negative reduced cost to the RMP.The reduced cost of any column can be regarded as the price of this column.
,  ,, ,  ,, ,  , ,  , ,  ,  , and   are dual variables that correspond to ( 7), ( 8), ( 9), ( 10), ( 11), (12), and (13), respectively.They represent shadow prices that are associated with time-space nodes, arcs, and other resources.For a certain train path  ∈   , the reduced cost is If   ≥ 0, then all train paths optimally satisfy the original problem, and the optimal solution to the linear programming problem is derived.If   < 0, further optimization is required.Considering the characteristics of the time-space network, the PP can be transformed into a shortest-path problem.The reduced cost of an entire time-space path can be calculated based on the costs and weights of the arcs that belong to this path.The weights of the arcs are updated using the dual variables derived from the RMP.The path with the negative reduced cost constitutes a new column to be added to the RMP [16].As shown in (21), the weights of arcs can be negative; in general, the problem can be solved using the Bellman-Ford algorithm or the SPFA.

Calculation of the Shortest Path.
In general, once the weights of all time-space arcs have been determined, the PP can be solved using a shortest-path algorithm.Due to (12), a conventional shortest-path algorithm is not applicable in this case.Equation ( 12) is an OD service frequency constraint that is related to any pair of different stop stations on a train path.Dual variables cannot be directly employed to update the arc weights; instead, we identify the stopping statuses along the entire path and then update the weights of the path using the dual variables that correspond to each pair of different stop stations.Due to the large scale of the time-space network, a search of all paths would be excessively time consuming and significantly affect the efficiency of the algorithm.Therefore, we propose a new method based on the A * algorithm and the reversed SPFA.
The A * algorithm can be regarded as a heuristic algorithm in which the estimation function  * () = () + ℎ * () is used to conduct a network search.This algorithm has been widely applied in searches for an optimal path.In the time-space network,  * () is the estimated cost of a path that starts from the virtual departure node , passes through the node , and arrives at the virtual arrival node .() is the actual cost of the path from  to . ℎ * () is the estimated cost of the best path from  to . ℎ() is the actual cost of the best path from  to .The key to the A * algorithm is to select an appropriate estimation function.Since () is the actual cost of the path from the virtual departure node to the node  along the selected path, the problem involves the selection of an appropriate heuristic function ℎ * () to achieve a quick and exhaustive search.
If the estimation function satisfies the admissibility condition [i.e., if the number of nodes subsequent to each node in the network is limited and ℎ * () ≤ ℎ() is satisfied for each node ], then the best path for the original problem can be derived via the A * algorithm [29].Therefore, we reverse the original time-space network.Then, ℎ * () can be derived as the cost of the shortest path from the virtual arrival node  to a random node , as calculated via the SPFA.If ( 12) is not considered, then ℎ * () = ℎ(), and the search for the best path will be conducted exactly along the shortest path.All trains are assumed to stop at all stations after node .We construct pairs from each station   after node  and from each stop station  before node  and add the corresponding dual variables  ,  to ℎ * ().The ℎ * () derived in this manner must satisfy ℎ * () ≤ ℎ().As noted in Section 2.4, the number of nodes subsequent to each node in the time-space network is limited, and, thus, the admissibility condition is satisfied.Therefore, the A * algorithm can be used to derive the shortest path for the original problem.
The main procedure for solving the PP is as follows: Step 1.The weights of the time-space arcs are updated using the dual variables derived from the RMP.The set  min is used to store the calculated shortest path for each train.  represents the set of time-space paths for train .A status priority queue {(, (),  * ())} is constructed.() denotes the number of times node  leaves the queue.() denotes the set of stop stations prior to node . min = ⌀ and  = 1.
Step 2.    is derived through a reversal of   ; in other words, all time-space arcs in   are reversed.In addition,   =  and   = .The shortest path from node   in    to any node   =  is calculated using the SPFA.The result can also be regarded as the cost of the shortest path from node  in   to .This cost is recorded as ℎ  ().The calculation proceeds to Step 3.
Step 4. The status priority queue is sorted in increasing order of the estimated cost  * ().The lowest status (, (),  * ()) is removed from the queue.() = ()+1.If the node  is equal to  and () = 1, then the shortest path for train  is the output, and the calculation proceeds to Step 6; otherwise, it proceeds to Step 5.
Step 5.The statuses of all nodes subsequent to node  are calculated using a status transfer function and are added to the status priority queue.V represents an adjacent node subsequent to node  .[][V] represents the weights of the arcs of the adjacent nodes (, V).The status transfer function is (V) is derived from the set () and the stop status of the current arc (, V).If the current arc is a section arc and the associated train stops at the ending station of the section, then we select each station  from () to form an OD pair with V.The corresponding dual variable  ,V is added to   (V), and (V) = ()+V; otherwise,   (V) = 0 and (V) = ().The train is assumed to stop at all stations after node V.Each station  from the set (V) and each station   from the stations after node V form an OD pair, and the dual variable  ,  , which corresponds to this pair, is added to the ℎ  (V) calculated in Step 2. The result [ℎ * (V)] is the estimated cost of the shortest path.
If (V) > 1, then the calculated status is not added to the status priority queue.The calculation proceeds to Step 4.
Step 6.The shortest path for train  is added to  min . =  + 1.If  > , the calculation proceeds to Step 7; otherwise, it proceeds to Step 2.
Step 7. All train paths in  min are ordered according to their costs.The path with the lowest cost is added to the RMP.The calculation ends.

Rapid Branching Strategy.
We use a rapid branching strategy, in which only one branch is taken.The selected path, which is represented by a fractional variable, will directly appear in the next iteration.The corresponding train will choose and is allowed only to choose this path.The beam search algorithm is used for branching.The derived fractional variables are ordered from greatest to least according to their corresponding coefficients in the objective function.At each branching, only a certain number of promising variables are selected; this number is referred to as the beam width.This method has been demonstrated to be effective for a similar integer programming problem [30].
Figure 7 illustrates the workflow of the extended branch-and-price algorithm, where  is the termination parameter.

Acceleration Strategies 4.5.1. Initial Solution Iteration.
In the branch-and-price algorithm, new columns generated by solving the PP are iteratively added to the path set considered in the RMP to improve the objective value and to achieve the optimal result.As the iterations proceed, the scale of the path set considered in the RMP increases.An excessively large problem scale will affect the efficiency of the algorithm.Thus, strategies are necessary to limit the increase in the scale of the path set.In general, a reasonable initial feasible solution facilitates rapid convergence of the algorithm.
For these reasons, we designed an initial solution iteration strategy.As shown by the dotted box in Figure 7, once an integer solution is derived in the calculation process and the numerical difference between the corresponding objective function value and the current UB exceeds a certain threshold, the integer solution will be used as a new initial Remove k from Br(w  ) Solve the linear relaxation of node w  using the Act(w) = Ø Construct time-space networks.Set LB = -∞, UB = +∞.Generate an initial solution ０ 0 as the root node of the search tree and add it to the active node set Act(w).
(UB-LB)/UB ≤  Sort Act(w) according to the depth-first strategy.Sort the set in ascending order by the min value for nodes of the same depth.Update LB if the min value of the set is greater than LB.Select the first node w  of the set and delete it from Act(w).
The active node is not the root node.Br(Ｑ  ) is the potential branching set.
Select a potential branching k from Br(Ｑ  ).
Generate a new additional cut Ｒ Ｅ = 1 as the constraint and add it to the model.
＇ ＧＣＨ = Ø Solve the RMP using CPLEX and obtain the dual variables.If the results ８ 0 are all integers and the objective function value ： 0 is less than UB, update UB. ４ ＦＣＧ is the threshold used for initial solution iteration strategy.
Solve the PP.Update time-space arcs using dual variables.Determine the shortest path p using the ！ * algorithm and the SPFA.If  Ｊ < 0, add the path p to ＇ ＧＣＨ .
Sort ＇ ＧＣＨ according to the determined path costs.
Select the lowest-cost path and add it to the RMP.feasible solution to restart the execution of the entire branchand-price algorithm.The purpose of specifying this threshold is to avoid consuming excessive time when frequently solving the linear relaxation of the root node.This initial solution iteration strategy improves the calculation efficiency and effectively reduces the processing scale.

Delayed Constraints.
The delayed constraint strategy is specifically related to ( 8)- (11).In (8), as shown in Figure 8, (, ) represents the section arc  and its corresponding path .If path  exists in the current path set of the RMP, the time intervals between arriving at station C on path  and all adjacent paths must be greater than or equal to    .The dual variable of path  related to ( 8) is  ,, = 0.However, the weight of another arc (such as   ) at station C in the time range [   ,    +    ) is not affected by path .If path   is the shortest path selected by solving the PP and is added to the RMP, then path  and path   must satisfy the constraint   +    ≤ 1 according to (8), and the dual variable of path  related to (8) may not be zero.In the calculation, the value of path   may be less than 1 or even 0.In this case, the added path does not serve its ideal function in reducing the value of the objective function.Similarly, a time-space arc at station C in the time range (   −    ,    ] that is selected in the next iteration can conflict with path .This constraint is referred to as a delayed constraint; the effect of a constraint in the current iteration can be shifted to the next iteration.To address these delayed constraints, we add a delay weight to each time-space arc.The value of the delay weight is determined based on the average value of the paths over several recent iterations (e.g., seven iterations) in the RMP.When the time-space network is updated using dual variables, the delay weight attributes of the paths are added to all selected arcs at station C in the time range (   −    ,    +    ).Delayed constraints related to (9)- (11) are handled in a manner similar to that applied for (8).
Applying the delayed constraint strategy to ( 8)-( 11) effectively improves the efficiency of the branch-and-price algorithm in solving the linear relaxation of the nodes.

Column Removal.
Due to the rapid branching strategy, in which a selected branching path is treated as a constraint that is added to the model, the train associated with a selected branching path must select only this path.All other paths related to this train in the current path set can be deleted.Any paths in the path set that have operational conflicts with the branching path [( 8)- (11)] can be deleted.This train can be skipped in the calculation of the PP.When the arc weights are updated, the weights of all arcs that conflict with the branching path are set to +∞.Thus, these conflicting arcs are avoided in the calculation of the shortest path.This column removal strategy can effectively control the processing scale and improve the efficiency of the algorithm.

Case Study
The effectiveness of the proposed model and algorithm was demonstrated using train data from the Wuhan-Guangzhou railroad.The algorithm was implemented using a Windows 7 computer with an Intel Xeon E3 3.30-GHz CPU and 8 GB of RAM.The programming tools were Microsoft Visual Studio 2010 with the C# language and ILOG CPLEX 12.7.1.0.
The Wuhan-Guangzhou high-speed railroad is an abbreviated name for the Wuhan-Guangzhou section of the Beijing-Guangzhou-Shenzhen-Hong Kong high-speed railroad.The length of the railroad is 968 km (601 miles), and the route has a total of 18 stations.The Wulongquan East station is an overtaking station that does not provide passenger transportation.The Lechang East station officially opened in May 2017 and was treated as a "passed-by" station in this experiment.All trains that run on this route are classified as Grade G and operate at a speed of 300 km/h.The operation of Grade D trains, with a speed of 250 km/h, was officially suspended on January 10, 2016.Due to the time designated for maintenance and testing, the actual passenger service time ranges from 6:30 AM to 0:00 AM.Additional infrastructure information is available on the website: https://en.wikipedia.org/wiki/Wuhan-Guangzhouhigh-speed railway.
Table 1 lists the station names and the number of tracks at each station on the Wuhan-Guangzhou railroad.Table 2 lists the section lengths, pure section travel times, and additional starting and stopping times.
To explicitly show the train timetabling results, the calculation was performed using trains that travel in the downward direction as an example.
Table 3 lists the predefined earliest and latest departure times for the trains at their starting stations.Table 4 lists the OD service frequencies inferred from the daily average passenger traffic data provided by the ticketing department.The grades of the station nodes and the corresponding service frequencies determined based on the station size, properties, infrastructure, railroad hub position, city population, economic data, and traffic data are shown in for trains at all stations were set to 2 min, 7 min, 2 min, and 3 min, respectively.
Based on these data and parameters, calculations were performed using model M1 and the extended branch-andprice algorithm.To achieve a balance between efficiency and accuracy, we applied a termination rule in the form of (23).When the difference between the UB and the LB was within the allowable range, the calculation was terminated.An approximately globally optimal solution was achieved.We performed 30 sets of experiments, and the average computation time of the extended branch-and-price algorithm was 199 s.
( − )  ≤ .As shown in Figure 9, during the process of solving the linear relaxation of the root node using the column generation algorithm, the value of the objective function initially decreases rapidly and then gradually tends toward stability as the calculation time increases.
As Figure 10 similarly illustrates, as the calculation time increases, both the average number of stops and the average dwell time decrease and gradually tend toward stability.
Figure 11  can ensure that the majority of trains depart within the predefined departure time ranges.
Table 6 presents a comparison between the real train timetable and the optimized train timetable in terms of the average travel speed, the average number of stops at stations, and the average station dwell time.The real case is the real train timetable for the Wuhan-Guangzhou highspeed railroad in February 2016 (refer to Dataset S1 in the Supplementary Materials), which was also employed as the initial solution for the optimal case.The timetable specified the operations of 107 downward-direction trains (from 6:30 AM to 0:00 AM).The minimum station dwell time was 2 min, and the maximum dwell time was 28 min.The average As shown in Table 6, the average number of stops at stations and the average dwell time are reduced by 20.3% and 43.2%, respectively, in the optimized train timetable.The average travel speed is also significantly increased, and the total defined cost function of the trains is reduced by 38.2%.These results indicate that our proposed model and algorithm can effectively improve the quality of a constructed train timetable and the travel efficiency for passengers.Due to the more reasonable design of the train stop schedule plan, the number of occurrences of overtaking is reduced from 40 to 0, and the maximum dwell time is reduced by 78.6%.These improvements effectively reduce the workload for station operations and improve the operational capability of the station infrastructure.The total computation time of the optimal case was 199 s, and the gap value was 1.9%.The results demonstrate the efficiency of the proposed algorithm.We also tested the efficiency of the algorithm without the three acceleration strategies.In this case, the gap value was 50% when the calculation ran for 6 min.This result demonstrates that the three proposed acceleration strategies are effective in accelerating the algorithm.Figure 12 shows the train timetable from 6:30 AM to 0:00 AM based on the optimal case.

Conclusions
In this study, we developed a high-speed railroad train timetable optimization model based on a time-space network representation.The goal of the model is to minimize the deviations of the train start times, the total number of train stops, and the dwell time at stations, which are subject to the constraints related to the OD service frequency, train scheduling, and station service frequency.The train stop schedule plan is redesigned during the process of optimizing the train timetable.An experiment using data from the Wuhan-Guangzhou high-speed railroad demonstrated that the proposed model and algorithm can effectively reduce the number of stops at stations and the station dwell time to improve the quality of the constructed train timetables and the travel efficiency for passengers.First, we discussed the construction of a time-space network to represent a highspeed railroad environment.We used an associated labeling method to handle unconnected arcs.Second, we employed an extended branch-and-price algorithm to calculate the model.We also proposed a new method based on the A * algorithm and the SPFA to search for the shortest path in the unconnected and dynamic time-space network when solving the PP.Last, we designed three acceleration strategies: initial solution iteration, delayed constraints, and column removal.
Our experiment demonstrated that these three strategies are effective in accelerating the algorithm.
Our future study will pursue three research directions.First, only constraints concerning the limited number of sidetracks at a station were considered when optimizing the train timetable; constraints related to routing schedules and track use at stations were not considered.Thus, addition work is needed to achieve the simultaneous optimization of the high-speed railroad train timetable and the station operation plan.Second, we plan to explore the impact of the OD/station service frequency on passenger demands.Last, we will perform sensitivity analyses to assess the effects of the punishment coefficients on the objective function.

Data Availability
The data used to support the findings of this study are included within the article and the Supplementary Materials.

Figure 1 :
Figure 1: Overview of the proposed high-speed railroad train timetable optimization method.

Figure 6 :
Figure 6: Illustration of the calculation of additional starting and stopping times.

𝑖
and  min  are the maximum dwell time and minimum dwell time of trains at station , and    is the dwell time at station  on path .The section travel time constraint is    =    +  sec  +  () :   =1    +  () :   =1    ∀, ∀ ∈  sec  , is used for branching.Sort the derived fractional variables in descending order by the target function coefficients.Select a certain number of Generate a new active node Ｌ  and add it to Act(w).The promising variables and add them to Br(Ｌ  ).

Figure 7 :
Figure 7: Workflow of the extended branch-and-price algorithm.

Figure 8 :
Figure 8: Illustration of the delayed constraint strategy.

Figure 9 :
Figure 9: Variation in the objective function value of the root node over time.

Figure 10 :
Figure 10: Variation in the parameters of the root node over time.

Figure 11 :
Figure 11: Preferred and actual numbers of train departures for each hour.

Figure 12 :
Figure 12: Optimized train timetable for the optimal case.
Li et

Table 5 .
All data are provided by the State Key Laboratory of Rail Traffic Control

Table 1 :
Station names and number of tracks.

Table 3 :
Earliest and latest departure times at starting stations.

Table 4 :
OD service frequencies of sections that operate in the downward direction.

Table 5 :
Classification of node grades and corresponding station service frequencies.

Table 6 :
Comparison between the real train timetable and the optimized train timetable.speed was 223.4 km/h.The optimal case corresponds to the optimized train timetable, in which the number of trains, train types, and paths are the same as those in the real case.The gap value was calculated as follows: travel