
As a critical foundation for train traffic management, a train stop plan is associated with several other plans in high-speed railway train operation strategies. The current approach to train stop planning in China is based primarily on passenger demand volume information and the preset high-speed railway station level. To optimise the stop plan efficiently, this study proposes a novel method that uses machine learning techniques and requires neither a predetermined hypothesis nor a complex solution algorithm. Clustering techniques are applied to assess the features of the service nodes (e.g., the station level). A modified Markov decision process (MDP) is used to express the entire stop plan optimisation process under several constraints (service frequency at stations and number of train stops). A restrained MDP-based stop plan model is formulated, and a numerical experiment with real-world train operation data collected from the Beijing-Shanghai high-speed railway demonstrates the performance of the proposed approach.

In most countries, high-speed railway (HSR) is significant in daily life owing to its reliability, safety, low emissions, and energy savings. Due to increasing passenger demand and the growing high-speed railway network scale, the transport organisation becomes increasingly complex. However, train operation management must be maintained at an acceptable efficiency level. The train stop plan is a key element in the operation plan for satisfying the increasing passenger demand and reducing the operational costs of the railway company. The stop plan can impact the frequency of train service and the load of trains at each station, which directly influences transportation resource utilisation. Assad [

With a limited train fleet size and station capacity, it is critical for the stop plan to consider passenger demand and train stop patterns simultaneously, with the goal of achieving service-demand equilibrium with the given transportation resources. A contradiction always exists in the train stop planning (TSP) problem: a sufficient number of train stops is required to serve passengers along the railway line, but too many train stops reduce operating efficiency and resource utilisation. Railway companies must find an effective means of balancing transportation costs and varying passenger demand.

The TSP provides a critical foundation for a complete train timetable for the entire railway network, especially for railway systems that cannot compute the timetable directly due to the scale of operations. In China, the TSP is practically determined based on passenger volume prediction and the preset station level, an indicator of the significance of each station. This is a convenient way to quickly find a solution; the subjective station levels may implicitly include abundant information. However, with changing society and economics along the railway, this parameter is not updated fast enough and can lead to an unreasonable train stop plan solution. As a basic decision support plan, the train stop plan is typically underestimated in the literature (see Section

This study makes contributions in the following areas: (i) railway properties and city features are listed as input parameters to better reflect environmental influence and improve train stop plan quality, (ii) a data-mining technique is applied to explore the station level through quantitative analysis of effective features using a dataset from the Beijing-Shanghai high-speed railway, and (iii) a restrained Markov decision process (RMDP) is proposed to find the optimal policy to achieve a better train stop plan.

The remainder of this paper is organised as follows: Section

The hierarchical structure of the train operation plan comprises the following sequential subproblems: train operation zone, stop schedule (including train stop planning and train time tabling), rolling stock, and crew scheduling [

However, the stop plan is always simplified or taken as one known input condition. Such treatment, albeit idealistic, may ignore the influence of the stop plan on actual operation. The stop plan is also an important subproblem related to passenger service quality and is influenced by a series of factors (e.g., stop stations and station service frequency). Thus, the TSP problem must be considered.

A few studies have focused on the pure TSP problem. Li et al. [

The solution algorithm design is another critical component of the optimisation problem, given its complex factors, number of variables, and constraints. In optimisation model studies, this type of problem is always NP-hard, even under idealised parameter assumptions; such assumptions leave variables unstudied empirically and still require time-consuming searches for appropriate solutions to the TSP problem [

The reinforcement learning approach is an alternative to the optimisation method for solving TSP and is widely used in decision problems [

Our study takes station location, station capacity, passenger demand, train operational zones, train travel distance, fleet size, number of stops, skip-stop strategy, and train type as railway operation inputs, and the administrative level, population, and GDP of cities as environmental inputs. It aims to generate a capacity-equilibrium train stop plan with the best trade-off between the quality of passenger transportation service and rail operation cost. A stop plan regulates the stop pattern of each train on a railway line, that is, whether the train stops at each station for passenger boarding/alighting.

Train stop planning must always select some stations for each train based on passenger demand. Although additional train stops provide better passenger service, they also reduce transportation efficiency (longer travel time, lower train throughput, and mismatched service supply). We introduce a clustering analysis and a modified Markov decision process-based framework that considers the railway and its environmental factors to determine a coordinated train stop plan solution that uses reasonable transportation resources and adequately satisfies the passenger demand.

To solve the TSP problem, a machine learning-based two-stage solution framework is developed to gain insights into the impacts of the station characteristics and formulate a new model to achieve the optimal stop plan. In the first stage, unsupervised clustering analysis is applied to explore the railway properties and service-node features along the railway line to classify the stations; this is the primary input of the TSP problem. In the second stage, a restrained Markov decision process (RMDP) is used to optimise the high-speed railway stop plan. The framework is shown in Figure

Framework for the stop plan.

The purpose of clustering analysis in solving the TSP problem is to provide railway station-level classification in a multifactorial manner that takes more environmental influences (features) into account, rather than considering the station level based only on its scale and location in the railway network. Fuzzy c-means clustering (FCM, see [

FCM is used to divide the station samples into C clusters. Each cluster is characterised by its sample mean, known as the centroid. The model objective is to minimise the summation of the weighted distance between each sample and the centroid of each cluster, as in equation (_{ij} is the degree of membership of

Fuzzy partitioning is conducted through an iterative optimisation of the objective function shown in equation (

The cluster centroid

The iterative algorithm terminates when _{FCM}. The aforementioned procedure does not specify the number of clusters. The optimal number of clusters in our study was determined based on the Xie–Beni coefficient [_{j} of the cluster

The station level is ranked by the average scores: the greater the score, the higher the station level.
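As an illustration of the clustering stage, the following is a minimal sketch of standard fuzzy c-means together with the Xie-Beni index, written with NumPy. It is not the study's implementation: the function names, the fuzzifier `m = 2`, the tolerance, and the choice of two clusters are assumptions made for the example, and only six stations with two features (population and GDP, taken from the service-node feature table) are used.

```python
import numpy as np

def fuzzy_c_means(X, n_clusters, m=2.0, tol=1e-5, max_iter=300, seed=0):
    """Standard fuzzy c-means; returns (centroids, membership matrix U)."""
    rng = np.random.default_rng(seed)
    # Random initial memberships, each row normalised to sum to 1.
    U = rng.random((X.shape[0], n_clusters))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(max_iter):
        Um = U ** m
        # Each centroid is the membership-weighted mean of the samples.
        centroids = (Um.T @ X) / Um.sum(axis=0)[:, None]
        # Euclidean distance from every sample to every centroid.
        dist = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        dist = np.fmax(dist, 1e-12)  # guard against division by zero
        # Membership update: u_ij proportional to d_ij^(-2/(m-1)).
        inv = dist ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return centroids, U

def xie_beni(X, centroids, U, m=2.0):
    """Xie-Beni index: compactness over separation (smaller is better)."""
    dist2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    compactness = ((U ** m) * dist2).sum()
    sep = ((centroids[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    np.fill_diagonal(sep, np.inf)  # ignore distance of a centroid to itself
    return compactness / (X.shape[0] * sep.min())

# Population (thousand) and GDP (billion) for BJS, SHHQ, DY, DYN, KSS, TZE.
raw = np.array([
    [21520.0, 2133.1], [24260.0, 2356.8], [938.0, 12.98],
    [977.0, 102.0], [1650.3, 300.1], [1700.0, 98.175],
])
X = (raw - raw.mean(axis=0)) / raw.std(axis=0)  # standardise each feature
centroids, U = fuzzy_c_means(X, n_clusters=2)
labels = U.argmax(axis=1)  # hard assignment by largest membership
```

In practice one would run `fuzzy_c_means` for several candidate cluster counts and keep the count with the smallest `xie_beni` value, which mirrors the selection rule described above.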

The Markov decision process is a significant machine learning concept in artificial intelligence. It has been widely used to formulate many decision-making problems with essential elements: state, action, policy, and reward [

The decision to be made in the TSP problem is the selection of a series of feasible service-node (station) sequences for each train individually. Each sequence is taken as a stop scheme for a train and as an action. The action sequences form the MDP decision chain. Each state transition decision is made depending only on the current state and is not related to earlier states in the chain. With regard to the Markov property, the stop plan decision-making process can be formulated based on the standard MDP [

In this study, there is a five-tuple

As shown in Figure

The utility function for discounted Markov decision problems is defined as

The parameter

The system is defined to find a sequence of actions that produce an optimal policy

Generally, actions are the triggers related to changing states. In each epoch, any action will induce the corresponding state. However, in this study, we defined actions as the number of train stops. Thus, action set

For the dataset we collected, the Beijing-Shanghai high-speed railway has an uncommon station Nanjing South. All trains must stop at this station for necessary technical operational work. Hence, the minimum number of stops for a train in this study is shortened from 3 to

The probability density of the stop-times distribution can be obtained using statistical analysis methods. After analysing the trains collected from real-world operation data, the results showed that the actions set followed the Gaussian distribution with

In the TSP problem, each action corresponds to a set of stop schemes, because a given number of train stops (the action parameter) can be realised by many different selections of stations.
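The action probabilities can be illustrated by discretising a Gaussian over the admissible numbers of stops. The sketch below is only an example of that idea: the function name and the numerical values (range of 2 to 12 stops, mean 7, standard deviation 2) are placeholders chosen for illustration, not the parameters estimated in this study.

```python
import math

def action_probabilities(min_stops, max_stops, mean, std):
    """Discretise a Gaussian over integer stop counts and renormalise."""
    actions = list(range(min_stops, max_stops + 1))
    # Unnormalised Gaussian weight for each admissible number of stops.
    weights = [math.exp(-0.5 * ((a - mean) / std) ** 2) for a in actions]
    total = sum(weights)
    # Renormalise so the probabilities over the truncated range sum to 1.
    return {a: w / total for a, w in zip(actions, weights)}

# Placeholder parameters (assumptions, not values from the paper).
probs = action_probabilities(min_stops=2, max_stops=12, mean=7.0, std=2.0)
most_likely = max(probs, key=probs.get)
```

Renormalising over the truncated range keeps the result a valid probability distribution even though stop counts outside the admissible range are excluded.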

The set of states is a key constituent in the decision process. In this section, we combine the particularity of the TSP problem with the characteristics of the state space of the MDP to achieve the adaptive state set for each epoch.

Let

During the decision process, a set of states can be triggered by different actions in each epoch. In the TSP problem, the state set consists of many train stop schemes, corresponding to the possible combinations of stations. However, not every possible stop scheme is compatible with the current state. To narrow the state set, we apply constraints based on the current state and the action probability distribution. Because of this filtering, the admissible state set varies from epoch to epoch.

Thus, several critical constraints must be built. For the number-of-stops and service-frequency restrictions, the constraints fall into two groups:

The vertical constraints comprise two parts: one is the maximum number of stops at each station (denoted as

The transverse constraint is the maximum number of stops for each train. The number of stops for each train (denoted as

State filter process with constraints.

Generally, a greater number of stops at stations and for trains increases passenger convenience. However, it always leads to a higher capacity occupation, increased operation costs, and increased total travel time. Thus, both constraints are important for balancing the operation cost and service quality.

For the quality of passenger service, passenger demand is an indispensable factor influencing the maximum number of stops at each station. Hence, we must consider another parameter: the train stop rate of station

We can obtain the maximum stop time for each station

For the capacity-equilibrium utilisation, the maximum number of stops of each train plays a significant role in the operation. It is always set pragmatically, considering the train type and the total train quantity.
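The filtering of candidate stop schemes can be sketched as follows. This is an illustrative reading of the constraints, not the study's procedure: the function name, the representation of a scheme as a tuple of station indices, and all numbers are hypothetical. The sketch enforces only the per-station upper limit, the per-train upper limit, and mandatory stations (such as a station where every train must stop).

```python
from itertools import combinations

def feasible_schemes(n_stations, n_stops, station_stop_count,
                     station_max_stops, train_max_stops, mandatory=()):
    """Enumerate stop schemes with exactly n_stops stops that respect the limits."""
    # Transverse constraint: a train may not exceed its maximum number of stops.
    if n_stops > train_max_stops:
        return []
    # Vertical constraint: a station is still open if its stop budget is not used up.
    open_stations = [s for s in range(n_stations)
                     if station_stop_count[s] < station_max_stops[s]]
    # Every mandatory station must still be available.
    if any(s not in open_stations for s in mandatory):
        return []
    optional = [s for s in open_stations if s not in mandatory]
    k = n_stops - len(mandatory)
    if k < 0:
        return []
    # Combine the mandatory stations with every choice of k optional stations.
    return [tuple(sorted(tuple(mandatory) + combo))
            for combo in combinations(optional, k)]

# Hypothetical example: 5 intermediate stations, station 0 has used up its budget,
# station 3 is mandatory, and the current action is "2 stops".
schemes = feasible_schemes(
    n_stations=5, n_stops=2,
    station_stop_count=[3, 1, 0, 2, 3],
    station_max_stops=[3, 3, 3, 3, 4],
    train_max_stops=3, mandatory=(3,),
)
```

Here `schemes` contains the three schemes that pair station 3 with one of the still-open stations 1, 2, or 4. Full enumeration grows combinatorially, so it is only practical once the constraints have already narrowed the candidate set.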

In each epoch, the state parameters remember the temporal number of stops for each station and the temporal combined number of stop schemes. All state parameters are updated with the state transition until

Each action related to a substate set including several stop schemes satisfies the constraints. However, only one state is related to an action during a common MDP. Thus, we selected one stop scheme from the substate set for each action. To maintain the system performance, it is effective to calculate the instant rewards for an action to choose the best stop scheme from all the alternatives in the substate set. Denote

Generation process for adaptive state set in each epoch.

The skip-stop equilibrium of the stop scheme for each train is used to describe the reward. There are two kinds of rewards: assigned rewards and instant rewards. The assigned reward can be calculated according to the different stop schemes from the adaptive state set. The instant reward is calculated based on the substate set. Let

To improve the skip-stop equilibrium distribution, each train should skip stops along the line and avoid stopping at consecutive stations. The reward (assigned reward or instant reward) of the train is expressed as

Different actions trigger different new states. It is necessary to design the decision process to calculate the state transition probabilities. As MDP theory states, the new state reached after taking an action depends only on the previous state and the chosen action; it is not affected by earlier states [

With regard to the probability distribution of actions and the related stop schemes, the state transition process can be expressed as the transition probability:

State transition process.

We complete the structure of the model based on the previous steps to find the decision chain with optimal policy

The future reward can be calculated iteratively using the equilibrium index of each epoch from the actions. Thus, the system future reward can be calculated by the assigned reward of different actions in each epoch. The optimal policy is attributed to the decision chain with the maximum future reward. Denoting the last state as
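To make the search for a reward-maximising policy concrete, the sketch below runs plain discounted value iteration on a toy two-state MDP. It is a generic textbook procedure and not necessarily the variant used in this study; the array layout (`P[a, s, s']`, `R[a, s]`), the discount factor of 0.9, and the toy transition and reward values are all assumptions made for illustration.

```python
import numpy as np

def value_iteration(P, R, gamma=0.9, tol=1e-8, max_iter=10_000):
    """Discounted value iteration.

    P[a, s, t] is the probability of moving from state s to t under action a;
    R[a, s] is the expected immediate reward of taking action a in state s.
    """
    V = np.zeros(P.shape[1])
    for _ in range(max_iter):
        # Action values: immediate reward plus discounted expected future value.
        Q = R + gamma * (P @ V)          # shape (n_actions, n_states)
        V_new = Q.max(axis=0)
        done = np.abs(V_new - V).max() < tol
        V = V_new
        if done:
            break
    policy = Q.argmax(axis=0)            # greedy action in each state
    return V, policy

# Toy MDP: action 0 keeps the current state, action 1 switches state.
P = np.array([
    [[1.0, 0.0], [0.0, 1.0]],   # action 0: stay
    [[0.0, 1.0], [1.0, 0.0]],   # action 1: switch
])
# Only staying in state 1 is rewarded.
R = np.array([
    [0.0, 1.0],                 # action 0
    [0.0, 0.0],                 # action 1
])
V, policy = value_iteration(P, R, gamma=0.9)
```

In this toy case the greedy policy switches out of state 0 and stays in state 1, giving values of 9 and 10 respectively, which follows from the geometric series 1/(1 - 0.9) = 10.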

Our train stop planning experiments are based on practical operation data of the Beijing-Shanghai high-speed railway, taken from the Railway Passenger Transport Management Information System for October to December 2017. The rail line has a total length of 1318 km and serves 24 high-speed railway stations. Fifty-six representative features were extracted from the collected dataset. Descriptions of the features are listed in Table

Extracted features of operation data.

No. | Feature | Notes |
---|---|---|
1 | Date | October 1st ∼ December 31st |
2 | Order of trains | Index that distinguishes each train from the others |
3 | Train operational zones | Trains grouped by origin and destination stations |
4 | Fleet size | Number of trains in each operational zone |
5 | Train type | Normal trains and fast trains |
6 | Number of stops | Number of stops of a train |
7 | Skip-stop strategy | Sequence of service nodes served by a train |
8 | Train travel distance | Train running distance |
9 ∼ 32 | Passenger demand (boarding) | Boarding passenger flow at a station |
33 ∼ 56 | Passenger demand (leaving) | Leaving passenger flow at a station |

To determine the station level in the first stage of the solution framework, service-node features including administrative levels, population, GDP, and distance are considered in the station-level presetting analysis. The values of these features are shown in Table

Extracted features of service nodes.

City | Station | Ab. | Current level | Population (thousand) | GDP (billion) | Distance from next station (km) |
---|---|---|---|---|---|---|
Beijing | Beijing South | BJS | 4 | 21520 | 2133.1 | 60 |
Langfang | Langfang | LF | 2 | 4522 | 205.6 | 62 |
Tianjin | Tianjin West | TJW | 4 | 15170 | 1572.7 | 0 |
Tianjin | Tianjin South | TJS | 4 | 15170 | 1572.7 | 88 |
Cangzhou | Cangzhou West | CZW | 2 | 7375 | 313.34 | 104 |
Dezhou | Dezhou East | DZE | 2 | 5705 | 259.61 | 92 |
Jinan | Jinan West | JNW | 3 | 7067 | 577.06 | 59 |
Taian | Taian | TA | 2 | 5581 | 300.22 | 70 |
Qufu | Qufu East | QFE | 1 | 6392 | 21.585 | 56 |
Tengzhou | Tengzhou East | TZE | 1 | 1700 | 98.175 | 36 |
Zaozhuang | Zaozhuang | ZZ | 1 | 3831 | 198.013 | 65 |
Xuzhou | Xuzhou East | XZE | 2 | 8580.5 | 496.391 | 68 |
Suzhou | Suzhou East | SZE | 2 | 5486 | 112.61 | 88 |
Bengbu | Bengbu South | BBS | 2 | 3258 | 110.84 | 54 |
Dingyuan | Dingyuan | DY | 1 | 938 | 12.98 | 62 |
Chuzhou | Chuzhou | CZ | 1 | 3938 | 118.48 | 59 |
Nanjing | Nanjing South | NJS | 3 | 8187.8 | 882.075 | 65 |
Zhenjiang | Zhenjiang South | ZJS | 2 | 3171 | 325.24 | 32 |
Danyang | Danyang North | DYN | 1 | 977 | 102.0 | 33 |
Changzhou | Changzhou North | CZN | 1 | 4696.4 | 475.0 | 57 |
Wuxi | Wuxi East | WXE | 2 | 6500 | 820.53 | 27 |
Suzhou | Suzhou North | SZN | 2 | 10466 | 1376.09 | 31 |
Kunshan | Kunshan South | KSS | 1 | 1650.3 | 300.1 | 50 |
Shanghai | Shanghai Hongqiao | SHHQ | 4 | 24260 | 2356.8 | 0 |

The properties of 115 trains are collected, including the passenger demand, train OD, number of trains, number of stop schemes, total number of stops, and average number of skip stops for trains, which is the average equilibrium index of the trains, as shown in Table

Operation information for the Beijing-Shanghai high-speed railway.

No. | OD | Number of trains | Number of stop schemes | Total number of stops | Average number of skip stops |
---|---|---|---|---|---|
1 | BJS-SHHQ | 64 | 56 | 576 | 4.3 |
2 | BJS-XZE | 2 | 2 | 21 | 2.7 |
3 | BJS-JNW | 4 | 3 | 21 | 3.9 |
4 | BJS-NJS | 6 | 5 | 42 | 2.8 |
5 | BJS-ZZ | 1 | 1 | 11 | 7.1 |
6 | TJW-SHHQ | 7 | 7 | 66 | 2.8 |
7 | JNW-SHHQ | 4 | 4 | 48 | 2.4 |
Total | | 115 | 78 | 785 | - |

To obtain a better station-level input, the fuzzy

Three optimal clusters are found, which are plotted in Figure

Clustering results for the stations.

Cluster C contains several stations without additional details in this step. To further analyse these samples, we rerun the clustering models for cluster C samples only to further explore the station characteristics and station-level structure. The results are shown in Figure

Clustering results for cluster C.

As shown in Figure

Updated station levels.

Station level | Stations |
---|---|
1 | Beijing South, Shanghai Hongqiao |
2 | Tianjin West (south), Jinan West, Nanjing South, Suzhou North |
3 | Xuzhou East |
4 | Langfang, Cangzhou West, Dezhou East, Taian, Suzhou East, Bengbu South, Zhenjiang South, Changzhou North, Wuxi East |
5 | Qufu East, Tengzhou East, Zaozhuang, Dingyuan, Chuzhou, Danyang North, Kunshan South |

With the input railway data and station level, the RMDP-based method is applied to solve the stop plan problem. The optimal train stop plan is displayed in Figure

Optimal stop plan results.

Comparing the solution with the original stop plan shown in Table

Stop plan comparison.

No. | OD | Stop schemes (original) | Stop schemes (optimal) | Total stops (original) | Total stops (optimal) |
---|---|---|---|---|---|
1 | BJS-SHHQ | 56 | 52 | 576 | 567 |
2 | BJS-XZE | 2 | 2 | 21 | 20 |
3 | BJS-JNW | 3 | 4 | 21 | 20 |
4 | BJS-NJS | 5 | 5 | 42 | 48 |
5 | BJS-ZZ | 1 | 1 | 11 | 9 |
6 | TJW-SHHQ | 7 | 7 | 66 | 66 |
7 | JNW-SHHQ | 4 | 4 | 48 | 41 |
Total | | 78 | 75 | 785 | 771 |
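The relative change implied by the totals in the comparison table can be checked with a few lines of arithmetic. The snippet below uses only the four totals reported in the table; the helper name `relative_reduction` is introduced here purely for illustration.

```python
def relative_reduction(original, optimal):
    """Percentage reduction of the optimal value relative to the original."""
    return 100.0 * (original - optimal) / original

# Totals taken from the stop plan comparison table.
stops_original, stops_optimal = 785, 771
schemes_original, schemes_optimal = 78, 75

stops_saved = stops_original - stops_optimal
stops_reduction_pct = relative_reduction(stops_original, stops_optimal)
schemes_reduction_pct = relative_reduction(schemes_original, schemes_optimal)
```

With these totals, the optimised plan uses 14 fewer stops (about 1.8%) and 3 fewer distinct stop schemes (about 3.8%) than the original plan.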

The equilibrium index distributions for all trains in the two stop plans are shown in Figure

Average equilibrium of trains for original and optimal stop plans.

With the given passenger flow demand, the equilibrium index of passenger flow can be obtained for both stop plans shown in Figure

Occupancy rates of each section for original and new approaches.

To further analyse the relationship between the number of stops (ST) under the proposed approach and its related features (including population, passenger demand, and GDP), the standard distributions of the features in each station are plotted in Figure

Feature distributions for the optimal results.

This study applied clustering analysis and MDP machine learning techniques to analyse the significant features related to the stop plan and proposed a data-driven optimisation framework for a train stop plan based on real-world train operational data. Service-node features are adopted as important characteristic station elements. To make the qualitative features more effective, a clustering analysis technique was used to develop a quantitative analysis that can be applied directly to the optimisation model. Different average feature values of clusters correspond to different station levels. Accordingly, the stop plan was optimised over successive epochs described by an RMDP model, which considered constraints related to stop planning given the known passenger demand and the pre-obtained total number of trains for each OD. A restrained MDP-based stop plan model was proposed to improve the stop plan using the relative value iteration algorithm. A case study was performed on the Beijing-Shanghai high-speed railway line. The computational results revealed that the optimal train stop plan solution is better than the original plan in terms of operation cost control, service quality improvement, and passenger demand adaptiveness. Furthermore, the proposed approach can efficiently solve the stop plan problem with a simpler solution algorithm.

In future research, we will use this approach as the foundation to adjust the stop plan and combine it with the stop schedule to explore their interaction and the train timetabling problem. We also intend to investigate different machine learning methods to achieve solution improvements and faster computation.

The data used to support the findings of this study are available from the corresponding author or the first author upon request.

The authors declare that they have no conflicts of interest regarding the publication of this paper.

This work was supported by the National Key Research and Development Plan (grant no. 2017YFB1200701) and the National Natural Science Foundation of China (grant no. U1834209).