An Affinity Propagation-Based Clustering Method for the Temporal Dynamics Management of High-Speed Railway Passenger Demand

School of Rail Transportation, Wuyi University, Jiangmen 529020, China
China Railway Eryuan Engineering Group Co. Ltd, Chengdu 610031, Sichuan, China
School of Transportation and Logistics, Southwest Jiaotong University, Chengdu 610031, Sichuan, China
National United Engineering Laboratory of Integrated and Intelligent Transportation, Chengdu 610031, Sichuan, China
School of Traffic and Transportation, Beijing Jiaotong University, Beijing 100044, China


Introduction
With the extension and integration of China's high-speed railway network, high-speed rail is becoming the preferred mode of travel. One of the most important tasks for high-speed railway passenger transport management is to adjust the line plan according to the characteristics of the fluctuations of the passenger flow over a year, so that the line plan is adapted to the passenger demand. Therefore, the annual passenger flow interval classification serves as the foundation for the adjustment of the line plan [1,2]. The usual method used by the railway bureau is to classify annual passenger flow intervals according to the subjective experience of the engineering and technical personnel. However, the quality of this classification depends largely on the experience of the personnel, and unreasonable classification results are likely to be generated since the characteristics of the seasonal fluctuation of passenger demand are usually not fully taken into consideration [3].
There have not been any directly relevant studies on the high-speed railway passenger flow interval classification problem. However, it is very similar to the time-of-day (TOD) interval identification problem that arises when developing traffic signal timing plans. In the TOD interval identification problem, a full day is treated as a cycle that contains several time points, and time points with similar traffic attributes are classified into the same group, as shown in Figure 1. By using TOD interval identification, a day can be divided into several intervals and the traffic signal timing plans can be made on the basis of the dynamics of the traffic flow. The difference between this strategy and the multiperiod control strategy of road intersections is that railway passenger transportation takes an annual cycle: the year is divided into several periods with similar passenger flow according to the change of daily passenger flow, and the train operation plans are then prepared according to the average passenger flow of each period. In the actual operation of the railway, a manual decision-making method is adopted to solve the basic problem of passenger flow time division, and research results on railway passenger flow time division are relatively rare. The main contributions of this article are as follows:
(1) The paper takes a step forward in the production of an annual train operation plan. The method accounts for the dynamics and regularity of the annual passenger flow.
(2) A new affinity propagation (AP)-based classification method is proposed, which is applied to the time division of high-speed railway passenger flow.
The rest of this paper is organized as follows. In Section 3, we explain why passenger flow intervals need to be classified by analyzing the temporal dynamics of the passenger demand of a high-speed railway. Section 5 presents a comparative analysis of the adaptability evaluation indexes of the line plan based on the passenger flow interval classification results obtained by applying our proposed AP-based clustering method with the actual results obtained by the high-speed railway passenger transport management department.

Literature Review
Chinese railway transportation organization follows the basic principle of operating trains according to passenger flow. Therefore, studying the characteristics and laws of railway passenger flow is of great significance for improving how well the transportation organization adapts to passenger flow and for improving the economic benefits of the railway. At present, there are many studies on the laws of passenger flow in railway passenger demand management. From the perspective of research objects, these studies can be roughly divided into the following four types:
(1) Road Network Passenger Flow. This type of research mainly focuses on the trend of high-speed rail passenger flow and is used to solve the social and economic problems of line network laying, price positioning, and high-speed rail development direction.
(2) The Overall Passenger Flow of the Line. This type of research mainly focuses on the spatial characteristics of the passenger flow and is used to analyze the number of train pairs and line utilization on a certain line.
(3) Nodal Passenger Flow. This type of research is mainly aimed at one or more nodes on a certain railway line. It is used to analyze the importance of nodes, the railway stop schedule plan, node selection when trains change lines, and station node reconstruction and expansion.
(4) OD Passenger Flow. This type of research includes OD passenger flow spatial characteristics and OD passenger flow time period characteristics. The former mainly focuses on the passenger flow conditions at the starting and ending points, the travel distance of the passenger flow, and the direction of passenger flow. The latter focuses on the preferences and laws of passenger travel time. Analyzing the time period characteristics makes it possible to describe the passenger flow more accurately and provides an important data basis for the preparation of a train operation plan.
The usual method to determine the TOD intervals is to draw the cumulative traffic flow count curve of a typical intersection for a day, and then those time points with significant fluctuation are chosen manually as the split points between two adjacent intervals; thus, the TOD is classified into intervals. Considering the nature of the traffic flow fluctuation, some scholars proposed theoretical methods to determine the TOD intervals using practical examples to demonstrate their methods. The proposed methods include heuristic search methods and clustering-based analysis methods [4,5]. The heuristic search method defines the TOD interval identification problem as a mathematical optimization problem. Park et al. [6] introduced heuristic algorithms to determine the optimal TOD break points for traffic signal timing plans; this overcomes the disadvantages of clustering algorithms, which often generate infeasible TOD intervals. Abbas and Sharma [7] proposed a multiobjective evolutionary algorithm to solve the TOD interval identification problem. However, the optimal number of break points cannot be determined due to the randomness of evolutionary algorithms. Park and Lee [8] designed a greedy search algorithm to obtain the optimal TOD break points that needs fewer evaluations than the genetic algorithm and is robust to various demand fluctuations. Lee et al. [9] introduced the transition cost between break points into the genetic algorithm, and the obtained results were better than those from the greedy search algorithm reported by Park and Lee [8].
Another method for determining the TOD intervals is the clustering-based analysis method. Hauser and Scherer [10] proposed the cluster analysis approach based on the concept of a high-resolution system state to deal with the TOD interval automatic identification problem. However, nonadjacent time points will be clustered into the same interval. Wang et al. [11] designed a nonhierarchical clustering algorithm, namely, the K-means method, to determine the TOD break points. However, the number of clusters needs to be specified manually in advance. Ratrout and Nedal [12] first combined the microsimulation method and the K-means algorithm to obtain the optimal TOD break points, and then an improved subtractive clustering-based K-means technique was proposed. Finally, it was proven that those two approaches can generate similar results. Liu et al. [13] considered the difference of traffic flow, averaged the traffic volume at the intersection by hour, and used the Webster method to calculate the signal duration to determine the discrimination threshold of the control period division. Shen et al. [14] proposed an improved K-means clustering algorithm which updated the cluster center initialization and rules and applied it to the division of bus operation time. Zhao et al. [15] used the classical spectral clustering algorithm (Ng-Jordan-Weiss, NJW) to divide the time period and used the Synchro and SimTraffic software to establish the optimal signal timing plan and to evaluate the time period division results by simulation. Yao et al. [16] used the Ward minimum variance method to cluster the historical traffic flow data and the improved criterion of cube group as the cluster termination condition to determine the optimal number of plans and the best break points for multiplan control of traffic signals in TOD mode. Song et al. [17] introduced the dissimilarity matrix to determine the number of clusters, and the break points were determined based on nonintrusive data collection techniques. In addition, better signal timing plans can be made by that method. Moreira et al. [18] drew the annual vehicle travel time curve of the bus line, used dynamic time warping to evaluate the similarity of the travel time during different days, and clustered the dates according to the travel time change curve to make the annual time period division. Li et al. [19] introduced an improved strategy of dynamic recursion on the basis of the ordered sample clustering algorithm to realize the division of road intersection signal time periods. Song et al. [17] considered the coordination mechanism of passenger demand and travel time for the time division plan and proposed a bus operation time division model with the goal of minimizing the operating time cost of the bus fleet throughout the day. Bie et al. [20] gave a method for determining the number of transition periods under different timing schemes, established a calculation model for the total delay time of vehicles in adjacent intervals, and used the difference in total vehicle delay time as a threshold of the discrimination index to determine whether adjacent intervals are merged.
Previous studies related to TOD intervals have focused on road intersections, where the time-of-day is divided into N time points. Then, adjacent time points with similar traffic volumes are clustered into the same traffic interval.
However, there are great differences between the classification of operation intervals for high-speed railway passenger flow and for road intersections, which are reflected in the following aspects: (1) the departure times of railway passengers mainly depend on the train schedule, so it is difficult to reflect the regularity of passenger flow throughout the day, and thus, the passenger flow intervals can only be classified with a study period of one year; (2) a sharp increase or decrease in passenger flow can be observed during the year because of festivals such as the Spring Festival and Tomb-Sweeping Day. As a result, the special festivals will be listed as independent time periods by traditional clustering algorithms, which is inconsistent with practical railway operation. Therefore, the classification approaches for road intersection intervals are not applicable to high-speed railway passenger demand classification.
By considering the characteristics of the fluctuation of the high-speed railway passenger flow, this paper proposes an AP-based clustering method to classify the high-speed railway passenger flow intervals accurately and automatically. First, time points in one year are taken as samples, and the numbers of passengers dispatched at each station along the high-speed railway line serve as variables that describe the samples. Second, the time points are clustered into groups to form the passenger flow intervals by using the AP-based clustering method. Finally, measures of effectiveness are employed to evaluate the clustering results so that the optimal number of clusters can be determined.

Temporal Dynamics of the Passenger Demand on a High-Speed Railway
The managers of a high-speed passenger railway need to change the line plan in accordance with the seasonal fluctuations in the passenger flow. However, the adjustment of a line plan is a systematic, complex, and massive task because the high-speed railway lines are very busy, so the line plan is only adjusted a few times each year. In this situation, passenger flow intervals need to be classified reasonably so that the number of adjustments of the line plan can be determined accordingly, and the line plan can always be adapted to the temporal dynamics of the passenger demand on the high-speed railway. The set of time points in a year is denoted by $T = \{t_1, t_2, \ldots, t_n\}$, and the number of passengers dispatched at each station of the high-speed railway line at time point $t$ is represented by $X(t)$, which is the descriptive attribute of that time point. If the adjacent time points in the set $T$ with similar attributes $X(t)$ are clustered into $p$ disjoint nonempty subsets $T_1, T_2, \ldots, T_p$, then the time in a year is transformed into a finite number of continuous intervals; thus, the passenger flow intervals are classified. In this way, the passenger flow interval classification problem is a clustering analysis problem, as shown in Figure 2.
According to the pattern of the fluctuation in the passenger flow, the following two key problems need to be dealt with while solving the high-speed railway passenger flow interval classification problem.
(1) The number of passenger flow intervals cannot be too small or too large. In the former situation, the temporal dynamics of passenger flow cannot be reflected properly. In the latter situation, it becomes too difficult to implement and adjust the line plan.
(2) The time span of passenger flow intervals cannot be too short or too long. The former will make it difficult to implement or adjust the line plan, and in the latter situation, the temporal dynamics of the line plan cannot be reflected properly.

Affinity Propagation Method.
AP is a state-of-the-art clustering algorithm developed by Brendan J. Frey and Delbert Dueck. It is based on the concept of "message passing" between data points. The AP algorithm has been chosen to partition the passenger flow periods of this high-speed railway because it not only converges quickly when dealing with large-scale and complex data but also avoids the clustering result being limited by the choice of the initial class representative points. The AP algorithm performs clustering on a similarity matrix composed of sample data points. Like other clustering algorithms, its goal is to minimize the distance between each data point and its class representative point in the partition category to achieve a partition [6,21]. The basic principle of this algorithm is introduced as follows:
(1) All the $N$ samples in the dataset are regarded as candidate class representatives. The similarity between any two samples $x_i$ and $x_k$ is established, namely, the attraction degree of each sample with respect to the other samples, and stored in the $N \times N$ similarity matrix.
(2) Definition: $s(i,k)$ indicates the similarity between samples $x_k$ and $x_i$, that is, how well suited sample $x_k$ is to be the class representative of $x_i$. When the algorithm is initialized, all the samples are assumed to have the same probability of being selected as a class representative, namely, all $s(k,k)$ are assumed to have the same value $p$. The similarity between two points is calculated from the Euclidean distance between them.
(3) Two important information parameters, the credibility (responsibility) matrix $r$ and the availability matrix $a$, are employed by the algorithm: $r(i,k)$ is a credibility message that points from $x_i$ to $x_k$, which indicates the degree of representativeness of sample $x_k$, that is, the degree to which it is suitable as a class representative of sample $x_i$; $a(i,k)$ is an availability message that points from $x_k$ to $x_i$, which indicates the appropriateness of $x_i$ choosing $x_k$ as its class representative. For any sample $x_i$, the sum of its credibility and availability with respect to every candidate is calculated, and the sample $x_k$ with the largest sum is selected as the class representative. The alternating update of the above two information quantities is the iterative process of the AP algorithm. The update formulas for the credibility matrix $r(i,k)$ and the availability matrix $a(i,k)$ are referred to as formulas (2) and (3), respectively.
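For reference, the message-passing rules of affinity propagation in the standard formulation of Frey and Dueck can be written as below. This is a sketch that assumes the paper's $s$, $r$, and $a$ follow that standard formulation, with the negative squared Euclidean distance as the similarity; the exact form used by the authors may differ in detail.

```latex
\begin{align}
s(i,k) &= -\lVert x_i - x_k \rVert^{2}, \qquad i \neq k, \\
r(i,k) &\leftarrow s(i,k) - \max_{k' \neq k} \bigl\{ a(i,k') + s(i,k') \bigr\}, \\
a(i,k) &\leftarrow \min \Bigl\{ 0,\; r(k,k) + \sum_{i' \notin \{i,k\}} \max \{ 0, r(i',k) \} \Bigr\}, \qquad i \neq k, \\
a(k,k) &\leftarrow \sum_{i' \neq k} \max \{ 0, r(i',k) \}.
\end{align}
```

After the messages settle, sample $x_i$ is assigned to the candidate $k$ that maximizes $a(i,k) + r(i,k)$, and the diagonal entries $s(k,k) = p$ control how many class representatives emerge: a larger $p$ tends to produce more clusters.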

Evaluation of the Validity of the Clustering.
The validity of the clustering is used to quantify and evaluate the quality of the clustering results and determine the optimal partition of the dataset [22]. Clustering validity indexes are adopted to evaluate which result generated by the clustering algorithm is optimal, and the number of clusters corresponding to the optimal result is taken as the optimal clustering number. The output of the AP algorithm is a series of clustering results that contain different numbers of clusters; hence, the effectiveness of these clustering results needs to be evaluated. In this paper, the effectiveness indexes for evaluating the optimal clustering number include Calinski-Harabasz, Hartigan, and In-Group Proportion.

The Calinski-Harabasz Index.
The Calinski-Harabasz (CH) index is based on the intraclass dispersion matrix and the interclass dispersion matrix of all samples, and the number of classes corresponding to the maximum value is taken as the optimal clustering number:
$$\mathrm{CH}(k) = \frac{\operatorname{tr} B(k) / (k - 1)}{\operatorname{tr} W(k) / (n - k)}.$$
Here, $k$ is the number of clusters, $n$ is the number of samples, $\operatorname{tr} B(k)$ is the trace of the interclass dispersion matrix, and $\operatorname{tr} W(k)$ is the trace of the intraclass dispersion matrix.
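As an illustration, the CH index can be computed directly from the samples and one candidate labelling. The sketch below assumes the standard definition $\mathrm{CH}(k) = [\operatorname{tr} B(k)/(k-1)] / [\operatorname{tr} W(k)/(n-k)]$ with $n$ samples; the function name `calinski_harabasz` is hypothetical and not taken from the paper.

```python
import numpy as np


def calinski_harabasz(X, labels):
    """CH index of one labelling (assumed standard definition)."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    n = X.shape[0]
    clusters = np.unique(labels)
    k = clusters.size
    if k < 2 or k >= n:
        raise ValueError("CH is defined only for 2 <= k < n")

    overall_mean = X.mean(axis=0)
    tr_b = 0.0  # trace of the interclass dispersion matrix
    tr_w = 0.0  # trace of the intraclass dispersion matrix
    for c in clusters:
        members = X[labels == c]
        centre = members.mean(axis=0)
        tr_b += members.shape[0] * np.sum((centre - overall_mean) ** 2)
        tr_w += np.sum((members - centre) ** 2)
    return (tr_b / (k - 1)) / (tr_w / (n - k))
```

Evaluating this function for every clustering result produced by the AP runs and keeping the labelling with the largest value corresponds to the selection rule described above.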

The Hartigan Index.
The Hartigan index [12] can be used in cases where the number of clusters is 1. The minimum number of classes with Hart ≤ 10 is the optimal clustering number.
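For orientation, the Hartigan index is commonly defined from the within-cluster dispersion of two consecutive cluster counts. The expression below is that common definition, given as an assumption about what the paper intends; $n$ denotes the number of samples and $\operatorname{tr} W(k)$ the trace of the intraclass dispersion matrix for $k$ clusters.

```latex
\mathrm{Hart}(k) = \left( \frac{\operatorname{tr} W(k)}{\operatorname{tr} W(k+1)} - 1 \right) (n - k - 1)
```

Because the index only compares $k$ clusters with $k + 1$ clusters, it remains defined at $k = 1$, which is why it can be used when the data may form a single cluster.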

In-Group Proportion Index.
The In-Group Proportion (IGP) index [13] is used to measure whether the samples nearest to each sample in a class are in the same category. A cluster with a larger average IGP index has a better clustering quality, and the number of clusters corresponding to the maximum value is the optimal clustering number:
$$\mathrm{IGP}(u) = \frac{\#\{\, j \mid \mathrm{Class}(j) = \mathrm{Class}(j_N) = u \,\}}{\#\{\, j \mid \mathrm{Class}(j) = u \,\}}.$$
Here, $u$ is the cluster label, $\mathrm{Class}(j)$ is the label of sample $j$, $j_N$ is the sample nearest to sample $j$, and $\#$ denotes the cardinality of a set.
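The IGP index can be sketched in a few lines. The example assumes that "nearest" means nearest in Euclidean distance and that a sample is never its own nearest neighbour; the function name `in_group_proportion` is hypothetical.

```python
import numpy as np


def in_group_proportion(X, labels):
    """Return {cluster label: IGP}, the share of samples in each cluster
    whose nearest neighbour carries the same label."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)

    # Pairwise squared Euclidean distances; a sample is not its own neighbour.
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sum(diff ** 2, axis=2)
    np.fill_diagonal(dist, np.inf)

    nearest = dist.argmin(axis=1)
    same_class = labels[nearest] == labels
    return {c.item(): float(same_class[labels == c].mean())
            for c in np.unique(labels)}
```

Averaging the returned values over all clusters gives the average IGP that is compared across candidate cluster counts.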

Assessment of the Adaptation to Passenger Demand.
First, after the passenger flow periods have been partitioned, we establish a train operation plan and simulate a passenger flow distribution according to the average passenger demand in each period. The adaptability between the passenger demand and the train operation plan during each period is quantitatively evaluated and summarized using three indexes, namely, the passenger demand satisfaction rate, the average seat occupancy rate of trains, and the direct rate of passenger flow. The indexes are calculated as follows.

Passenger Demand Satisfaction Rate.
The passenger demand satisfaction rate reflects the passenger transport capacity provided by the train operation plan, and the degree to which passenger demand is satisfied, between each OD pair of the high-speed railway and the related railway network. It can be expressed as the ratio of the passenger volume that effectively obtains transport service to the total passenger demand, where the former is constrained by the capacity of the available resources, especially train seating capacity, under the conditions of the established train operation plan. In the corresponding formula, $q_w$ indicates the total passenger flow between OD pair $w$, and $q'_w$ indicates the total passenger volume transported by high-speed railway between OD pair $w$.

Seat Occupancy Rate.
Seat occupancy rate refers to the weighted ratio of the passenger flow a train carries in its operating section to the total number of seats provided by the train, and it reflects how passengers in different OD pairs choose among the various types of high-speed trains. The average seat occupancy rate is the mean of the seat occupancy rates of all trains in the evaluation range. In the corresponding formula, $q^{h}_{ij}$ is the passenger flow carried by train $h$ in the segment $(i, j)$, $A_h$ is the rated seating capacity of train $h$, and $E_h$ is the number of segments operated by train $h$.

The Direct Rate of Passenger Flow.
The demand structure of passenger flow is made up of different demand directions, and each demand direction has a direct or transfer plan to reach the destination. The direct rate of passenger flow is the ratio of the passenger flow that reaches its destination directly, without transfer, to the total passenger flow in that direction, taken over each OD pair of passenger demand under the conditions of the established train operation plan and the structure of the passenger demand. In the corresponding formula, $q^{d}_{w}$ is the number of passengers who can reach their destination directly without a transfer between OD pair $w$, and $q^{e}_{w}$ is the number of passengers who reach the destination with $e$ transfers between OD pair $w$.
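The three adaptability indexes can be illustrated with a small sketch. It rests on assumptions that are not confirmed by the text: demand and carried volumes are summed over all OD pairs before the ratio is taken, each train's occupancy is averaged over its $E_h$ segments, and trains are weighted equally. All function names and the sample figures are hypothetical.

```python
def demand_satisfaction_rate(demand, carried):
    """demand[w] = q_w, carried[w] = q'_w (assumed aggregation over OD pairs)."""
    return sum(carried[w] for w in demand) / sum(demand.values())


def average_seat_occupancy(trains):
    """trains: list of (segment_loads, seats) with segment_loads = [q^h_ij, ...]
    and seats = A_h; len(segment_loads) plays the role of E_h."""
    rates = [sum(loads) / (seats * len(loads)) for loads, seats in trains]
    return sum(rates) / len(rates)


def direct_rate(direct, transfer):
    """direct[w] = q^d_w; transfer[w] = [q^1_w, q^2_w, ...] by transfer count e."""
    direct_total = sum(direct.values())
    transfer_total = sum(sum(v) for v in transfer.values())
    return direct_total / (direct_total + transfer_total)


# Illustrative numbers only, not data from the paper.
demand = {"A-B": 100, "A-C": 50}
direct = {"A-B": 80, "A-C": 30}
transfer = {"A-B": [10], "A-C": [10, 5]}
carried = {w: direct[w] + sum(transfer[w]) for w in demand}

print(demand_satisfaction_rate(demand, carried))
print(average_seat_occupancy([([400, 500], 500), ([300], 600)]))
print(direct_rate(direct, transfer))
```

If the indexes are instead defined per OD pair or weighted by train-kilometres, only the aggregation inside each function would change.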

Partitioning the High-Speed Railway Passenger Flow Periods.
The partitioning of the high-speed railway passenger flow needs to reflect the variation of passenger demand with the seasons and holidays. For each direction of high-speed railway passenger flow, one railway station is the point of departure and another is the point of arrival, between which the passenger flow realizes a displacement. Therefore, representative large passenger stations on the high-speed railway lines are taken as the subject of research. According to the statistics on passenger volume during the year, time points with similar passenger flow statistics are grouped into the same category, and adjacent time points in the same category are seen as the same passenger flow period. Based on the basic principles of the above algorithm, the basic steps of partitioning the high-speed railway passenger flow into periods are as follows:
Step 1. Collect the information about the high-speed railway lines, stations, and volume of passengers sent at each time point from the stations along the railway lines. Set the maximum number of iterations of the algorithm to $N_{\max}$. To eliminate the differences of scale between variables, each variable is standardized as follows:
$$Z\text{-score} = \frac{x - \bar{x}}{\sigma},$$
where $Z$-score is the standardized value, $x$ is the daily volume of passengers sent between stations, $\bar{x}$ is the average number of passengers sent between all stations during the year, and $\sigma$ is the standard deviation of the total number of passengers sent between all stations during the year. The sample data vector in the cluster is
$$X(t) = \left[ x_{i,j}(t) \right],$$
where $X(t)$ is the passenger flow state vector of the high-speed railway at time point $t$ (day $t$) and $x_{i,j}(t)$ is the number of passengers sent from the $i$-th node to the $j$-th node of the high-speed railway at time point $t$.
Step 2. Initialization: First, set the initial values of the credibility matrix $r(i,k)$ and the availability matrix $a(i,k)$ to 0. Then, calculate the sample similarity matrix $s(i,k)$, using the Euclidean distance as the measure, by formula (12). Set all diagonal elements $s(k,k)$ to the same median attractiveness value.
Here, $N$ is the number of samples.
Step 3. Iteration: (1) Update the availability and credibility using formulas (2) and (3). (2) Apply a damping factor to eliminate numerical oscillations in the iteration:
$$r_{\text{new}}(i,k) = (1 - \lambda)\, r_{\text{new}}(i,k) + \lambda\, r_{\text{old}}(i,k), \qquad a_{\text{new}}(i,k) = (1 - \lambda)\, a_{\text{new}}(i,k) + \lambda\, a_{\text{old}}(i,k).$$
Here, $r_{\text{new}}(i,k)$ and $r_{\text{old}}(i,k)$ are the credibility matrices obtained from this update and the previous update, respectively, $a_{\text{new}}(i,k)$ and $a_{\text{old}}(i,k)$ are the availability matrices obtained from this update and the previous update, respectively, and $\lambda \in (0, 1)$ is the damping factor, whose value is set to 0.9.
Step 4. Output the result: Judge whether the iterative process has reached the maximum number of iterations that has been set, that is, whether $n \geq N_{\max}$. If it has, terminate the algorithm and output the partition results of all the categories of the time points; otherwise, return to Step 3.
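The steps above can be sketched with NumPy as follows. This is a minimal illustration rather than the authors' implementation: it assumes the negative squared Euclidean distance as the similarity, the median off-diagonal similarity as the shared diagonal value, the standard Frey and Dueck message updates, and a fixed iteration count as the only stopping rule. The function name `affinity_propagation` and its parameters are hypothetical.

```python
import numpy as np


def affinity_propagation(X, preference=None, damping=0.9, max_iter=200):
    """Return (exemplar indices, cluster label per sample); needs >= 2 samples."""
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    diag = np.diag_indices(n)
    rows = np.arange(n)

    # Step 2: similarity matrix and shared preference on the diagonal.
    S = -np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=2)
    if preference is None:
        preference = np.median(S[~np.eye(n, dtype=bool)])
    S[diag] = preference
    R = np.zeros((n, n))  # credibility (responsibility) messages
    A = np.zeros((n, n))  # availability messages

    # Step 3: alternate the two damped updates for max_iter rounds.
    for _ in range(max_iter):
        AS = A + S
        best = AS.argmax(axis=1)
        first = AS[rows, best]
        AS[rows, best] = -np.inf
        second = AS.max(axis=1)
        R_new = S - first[:, None]
        R_new[rows, best] = S[rows, best] - second
        R = damping * R + (1 - damping) * R_new

        Rp = np.maximum(R, 0)
        Rp[diag] = R[diag]               # keep r(k, k) unclipped
        A_new = Rp.sum(axis=0)[None, :] - Rp
        self_avail = A_new[diag].copy()  # a(k, k) is not clipped at 0
        A_new = np.minimum(A_new, 0)
        A_new[diag] = self_avail
        A = damping * A + (1 - damping) * A_new

    # Step 4: exemplars are samples with positive self-evidence.
    exemplars = np.flatnonzero((A + R)[diag] > 0)
    if exemplars.size == 0:
        return exemplars, np.full(n, -1)
    labels = S[:, exemplars].argmax(axis=1)
    labels[exemplars] = np.arange(exemplars.size)
    return exemplars, labels
```

Running the function with several `preference` values yields clustering results with different numbers of clusters, which can then be compared with the validity indexes. With a damping factor of 0.9 the messages change slowly, so the iteration limit should be generous.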

Numerical Experiments
To test the effectiveness of our proposed AP-based clustering method, the Zhengzhou-Xi'an high-speed railway was taken as an example. It contains 10 high-speed railway stations and is approximately 505 kilometers long. There are 9 stations along the Zhengzhou-Xi'an high-speed railway. The collected data are further processed according to equations (10) and (11) so that the passenger flow intervals of the Zhengzhou-Xi'an high-speed railway can be classified.

Passenger Flow Interval Classification Results.
The AP algorithm is first used to cluster the samples, and three validity assessment indexes, namely, Calinski-Harabasz, Hartigan, and In-Group Proportion, are used to evaluate the clustering results, as shown in Figure 4. Figure 5 demonstrates the optimal number of clusters for the Zhengzhou-Xi'an high-speed railway line, which is based on the data collected from 2014 to 2015. In Figure 5, the horizontal axis represents the days, that is, the samples in the annual passenger flow period division problem, of which there are 365 in total. The vertical axis represents the categories formed by the passenger flow period division, of which there are 5 in total. As shown in Figure 5(a), the ordinate value corresponding to the abscissa 1-24 takes the value 1; that is, the 1st to 24th days of 2014 are classified as category 1. The ordinate value corresponding to the abscissa 25-29 takes the value 2; that is, the 25th to 29th days of 2014 are classified as category 2, and the rest can be deduced by analogy.
By iterating over all 5 clusters and comparing every two samples in the same cluster, the discontinuous intervals in the same cluster are split. The resulting classification of passenger flow intervals for the Zhengzhou-Xi'an high-speed railway line from 2014 to 2015 is presented in Table 1. Table 1 shows that the 365 days of 2014 or 2015 are divided into 13 intervals for the Zhengzhou-Xi'an high-speed railway line, and the lengths of the time spans for interval 3, interval 6, interval 7, interval 8, and interval 12 are exactly the same in 2014 and 2015. However, the lengths of the time spans for the other intervals are different because of the Chinese Spring Festival.
The Spring Festival began on January 31st in 2014, which is the 31st day of the year, but in 2015 it started on February 19th, which is the 50th day of the year. In addition, the beginning of the time span for interval 2 is always 7 days before the Spring Festival, and the seasonal characteristics of the other intervals are obvious. The seasonal characteristics of all 13 intervals are summarized below. Interval 1 corresponds to the steady period of passenger flow between New Year's Day and the Spring Festival. The peak period of passenger flow during the Spring Festival is covered by interval 2, interval 3, and interval 4. Interval 5 is the steady period of passenger flow between the Spring Festival and the Ching Ming Festival. The peak period of passenger flow during the Ching Ming Festival is covered by interval 6. Interval 7 corresponds to the steady period of passenger flow between the Ching Ming Festival and Labor Day. Interval 8 covers the peak period of passenger flow during Labor Day. The steady period of passenger flow between Labor Day and the summer vacation is interval 9. Interval 10 is the peak period of passenger flow during the summer vacation. The steady period of passenger flow between the summer vacation and the National Day is covered by interval 11. Interval 12 covers the peak period of passenger flow during the National Day. Interval 13 is the steady period of passenger flow between the National Day and New Year's Day.
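The splitting of each cluster into continuous intervals, which turns 5 categories into 13 intervals, can be sketched as a single pass over the daily labels. The function name `split_into_intervals` is hypothetical, and the label sequence below is illustrative: only the first two runs (days 1-24 and 25-29) follow the description of Figure 5(a).

```python
def split_into_intervals(labels):
    """Turn per-day cluster labels into (start_day, end_day, label) runs.

    Days are 1-indexed; a new interval starts whenever the label changes,
    so non-adjacent days of the same cluster end up in separate intervals.
    """
    intervals = []
    start = 0
    for i in range(1, len(labels) + 1):
        if i == len(labels) or labels[i] != labels[start]:
            intervals.append((start + 1, i, labels[start]))
            start = i
    return intervals


daily_labels = [1] * 24 + [2] * 5 + [1] * 3
print(split_into_intervals(daily_labels))
```

Because category 1 appears in two separate runs in this toy sequence, it yields two distinct intervals, which is how a small number of clusters can produce a larger number of passenger flow intervals.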

Correction of the Results of the Classification of the Passenger Flow Intervals.
The time spans of interval 3, interval 6, and interval 8 are only one day long each, which makes it inconvenient and difficult for the high-speed railway passenger transport management department to adjust the line plan. Therefore, the three intervals whose lengths are less than 7 days are merged with their adjacent intervals according to field work experience. The corrected classification of the passenger flow intervals for the Zhengzhou-Xi'an high-speed railway is listed in Table 2.
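A possible mechanical version of this correction is sketched below. The paper merges short intervals according to field work experience, so the rule used here is only an assumption: an interval shorter than the threshold is absorbed by the preceding interval, and neighbouring intervals that then share a label are joined. The function name `merge_short_intervals` and the sample intervals are hypothetical.

```python
def merge_short_intervals(intervals, min_days=7):
    """intervals: ordered, contiguous (start_day, end_day, label) tuples."""
    merged = []
    for start, end, label in intervals:
        too_short = end - start + 1 < min_days
        if merged and (too_short or merged[-1][2] == label):
            # Extend the previous interval; it keeps its own label.
            prev_start, _, prev_label = merged[-1]
            merged[-1] = (prev_start, end, prev_label)
        else:
            merged.append((start, end, label))
    return merged


# Illustrative intervals only, not the contents of Table 1.
raw = [(1, 90, "flat"), (91, 91, "holiday"), (92, 120, "flat"),
       (121, 121, "holiday"), (122, 180, "flat"), (181, 240, "summer")]
print(merge_short_intervals(raw))
```

A one-day holiday interval between two steady periods therefore collapses all three into one interval, which is the kind of reduction that takes the classification from 13 to 7 intervals. A short interval at the very start of the year has no predecessor and is left unchanged by this sketch.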
The results of classifying the passenger flow intervals in Table 2 can be used as the foundation for the evaluation and adjustment of the line plan, and the adaptability of the line plan can be evaluated according to the predicted passenger demand in each interval. In addition, the line plan needs to be adjusted if the evaluation results are not ideal. Then, the passenger demand matching rate (including the passenger demand satisfaction rate, the average train seat occupancy rate, and the direct passenger flow rate) of each time period is calculated based on formulas (7) to (9), as shown in Figure 6. On this basis, the matching rates of the passenger flow periods are weighted and summarized to get the full-year overall passenger demand matching rate. The calculated matching rate is compared with the actual statistics of the Zhengzhou-Xi'an high-speed railway in 2014 and 2015, as shown in Table 3. Table 3 demonstrates that the passenger flow interval classification results obtained by applying our proposed AP-based clustering method can achieve better adaptability.

Discussion
In actual railway passenger transportation, the passenger transportation management department adjusts the train operation plan according to the characteristics of the annual passenger flow, which is specifically reflected in the "daily operation plan," "Spring Festival transportation operation plan," and "summer operation plan." In essence, these operation plans are adjustments of the train operation plan based on the dynamic changes of the annual passenger flow, and the basis of the adjustment lies in the reasonable division of the annual passenger flow periods. Therefore, the study of the annual passenger flow period division based on AP has a strong social background and practical significance.
Based on the AP clustering algorithm proposed in this paper, the entire year of the high-speed railway is divided into 7 time periods, which can be described as "flat peak period," "Spring Festival period," "flat peak period," "summer transport period," "flat peak period," "National Day period," and "flat peak period" according to their time characteristics. On this basis, the train operation plan is prepared according to the average passenger flow of each time period, so that it adapts to the passenger demand to the greatest extent.

Conclusions and Future Work
Our paper focuses on the problem of the classification of the passenger flow intervals for a high-speed railway so that the line plan of the high-speed railway can be adapted to the passenger demand. A novel AP-based clustering method is introduced to tackle the passenger flow interval classification problem using the data collected, concerning the number of passengers dispatched at each station along the high-speed railway line for about two or three years. In addition, three validity indexes, Calinski-Harabasz, Hartigan, and In-Group Proportion, are used to decide on the best number of clusters. Finally, field work experience is used to adjust the clustering results of our proposed method slightly to be more reasonable.
The Zhengzhou-Xi'an high-speed railway has been taken as an example to illustrate the effectiveness of the proposed method. We first collect the number of passengers dispatched at four high-speed railway stations from 2014 to 2015, and then our proposed method classifies the 365 days in a whole year into 13 continuous intervals. Finally, the 13 intervals are further integrated into 7 intervals according to field work experience. Moreover, we compared our classification results with those used by the railway bureau, and the comparison shows that our method can improve the passenger demand satisfaction rate, the average train seat occupancy rate, and the direct passenger flow rate significantly without changing the number of adjustments to the line plan. In addition, our method can be performed in a fast computer-aided way, which is more objective and accurate.
This paper has proposed an effective AP-based clustering method to classify high-speed railway passenger flow intervals. The limitation of the research in this paper is that the proposed method is limited to the division of passenger flow time periods for a single railway line. However, China's high-speed railway network is complicated, and different lines connect different regions with different economic development statuses and passenger flow distribution characteristics. As a result, the time span between the peak period and the flat peak period of each line in the same year is different. Therefore, how to divide the time periods of high-speed railway passenger flow under networked conditions is a future research direction in this field.
Data Availability
The original passenger flow data used to support the findings of this study are currently under embargo. Requests for data, 12 months after publication of this article, will be considered by the corresponding author.

Conflicts of Interest
The authors declare that they have no conflicts of interest.