Estimation of Time-Varying Passenger Demand for High Speed Rail System

Passenger demand plays an important role in railway operation and organization. This paper estimates time-varying passenger demand by simulating the ticket-booking process of a High Speed Rail (HSR) system. The ticket-booking process of each OD pair can be partitioned into discrete booking phases by the times at which the tickets of any itinerary sell out. Using the rooftop model, the ticket booking volume of each itinerary is reversely assigned to its corresponding expected departure intervals to obtain the time-varying demand in each booking phase, and the total time-varying demand is estimated by summing the time-varying demand distributions of all booking phases. Because only itinerary flow data are available, a precedence relationship is introduced to constrain the ticket sold-out order of all itineraries for each OD pair. Based on the precedence relationships of itineraries, two typical situations are proposed, for which the Single Booking Phase Reverse Assignment (SBPRA) algorithm and the Multiple Booking Phases Reverse Assignment (MBPRA) algorithm are developed, respectively, to estimate the time-varying demand. A case analysis of the OD pair Beijing-Shanghai is presented, and the validity analysis shows that the error rates of the SBPRA and MBPRA algorithms are 8.64% and 6.37%, respectively.


Introduction
Passenger demand plays an important role in railway operation and organization. Conventional methods of line planning and scheduling generally aim to meet the total passenger demand volume of each OD pair [1][2][3][4]. However, High Speed Rail (HSR) is characterized by train operations with high speed and high frequency, and the transportation capacity of HSR is much larger than that of ordinary-speed railway. For example, the distance from Beijing to Shanghai is about 1300 kilometers, and the service frequency between them on ordinary-speed railway (speed ≤ 160 km/h) was 10 times a day in 2006. With the construction and operation of HSR (250 km/h ≤ speed ≤ 350 km/h), the service frequency from Beijing to Shanghai was 42 times a day in 2018 (https://www.12306.cn). The HSR system can not only meet total passenger demand volumes with its large transportation capacity, but also meet the expected departure/arrival times of passengers with its high frequency of train operations. For a given OD pair, the demand rates at different expected departure/arrival times may differ within a day; this is defined as the time-varying demand. With the improvement of the HSR system, more and more studies focus on meeting the time-varying demand in line planning and scheduling [5][6][7][8]. These studies adopt the time-varying demand as input data, so how to obtain that data has become an important common issue. However, little research has focused on this problem. This paper aims to fill this research gap in the time-varying demand estimation of HSR. This paper focuses on the time-varying demand over the expected departure time. Alternatively, one may estimate the demand against the expected arrival time. If train travel time is constant (without delays and uncertainties), departure-based demand can be converted into arrival-based demand. However, if delays and uncertainties are considered, the two might not be simply converted to each other. We leave this for future research.
At present, different countries adopt different organization modes for HSR, which lead to distinct estimation problems for the time-varying demand. In some countries, such as China, passengers must book in advance and sit according to their ticket number. The passenger flow on a train is therefore equal to the ticketing volume of this train. We are able to get all passenger flows of an OD pair from the Railway Ticketing System (RTS), and the transport volume of this OD pair is the sum of all passenger flows. In China, the ticket fare is fixed throughout the pre-sale period and is not discounted for multiple purchases by a single passenger or for group purchases. In other countries, such as Japan, there are two types of HSR tickets: free/non-reserved seats and reserved seats. Passengers who hold a free/non-reserved seat ticket can board any train during the valid time. The passenger flow on each train therefore cannot be calculated from the ticketing data. In these areas, passengers who book tickets in advance, or who purchase round-trip or group tickets, may enjoy discounts. Hence, ticket discounts affect the choices of HSR passengers. In this paper, we focus on the estimation problem of HSR time-varying demand in situations like the case in China: all passengers must book in advance and sit according to the seat numbers marked on their tickets, i.e., the flow of each itinerary can be obtained from its corresponding ticketing volume; and the ticket fare for each itinerary is fixed during the whole pre-sale time window (booking time independent fare).
For the HSR system, the time-varying demand of each OD pair has two features: the total demand volume in one day and its corresponding time-varying distribution over the operation period of this day. At present, the HSR system in China has large transportation capacity and high frequency. HSR passengers do not need to shift to other transport modes due to insufficient capacity (except on some important holidays). Therefore, in our estimation problem, we assume that the HSR system has enough capacity over the service time window to serve all passengers for each OD pair, and flow shifting between transport modes is not considered. Then, the total demand volume of each OD pair can be obtained from the Railway Ticketing System (RTS). In China, HSR passengers must book tickets in advance, so the real departure time of each passenger and the ticketing volume of each itinerary can be obtained from the RTS. However, the real departure time of a passenger may deviate from his or her expected departure time. For instance, if there is no train departing at the expected departure time, or the tickets of that train have sold out, the passenger has to adjust his or her departure time to another train. Hence, the passenger flows at their real departure times from the RTS cannot be regarded directly as the time-varying distribution. In this paper, we tackle the following issue: given the total demand volume of the OD pair and the ticketing volume of each itinerary, how can the discrete ticketing volumes be reversed into a continuous time-varying distribution?
At present, there are few studies on the estimation of HSR passenger time-varying demand. Over the past few decades, studies on time-varying demand estimation have mainly focused on airline demand forecasting, dynamic traffic OD estimation and public transit OD estimation.
In the aviation industry, accurate forecasts of passenger demand are the heart of a successful revenue management system [9]. The objective of revenue management or yield management is "selling the right seats to the right customers at the right prices" [10]. The forecasts are usually based on historical booking data. Thus, one of the main objectives of airline booking data analytics is to estimate unconstrained demand for each fare class using censored historical booking data [11]. The booking data are called censored because, after a booking limit is reached, further booking attempts are rejected and not recorded by the system [9,12]. Weatherford and Pölt [9], McGill [13], Mukhopadhyay et al. [14] and Ratliff et al. [15] have developed various remedial approaches for estimating unconstrained demand. Additionally, low-cost airlines irregularly launch ticket promotions, where fares may differ by day of the week and departure date. The timing of air ticket purchases is thus closely associated with fares. Passengers often do not buy airline tickets immediately when they determine their itinerary, and may choose to wait for fare promotions before making reservations [16]. Therefore, Wen and Chen [16] and Chiou and Liu [17,18] study the advance purchase behavior of air passengers using booking data. The results of Wen and Chen [16] indicate that lower fares increase the number of bookings and that heterogeneous preferences in booking timing are present. Some travelers tend to book flights earlier than other groups: these are the price-sensitive customers. The results of Chiou and Liu [18] indicate that advance purchase timing is associated with airfare, uncertainty of airfare, time of day, day of the week, month of the year and consecutive holidays. Diego [19] uses an original dataset with posted prices and sales to estimate the dynamic demand for airlines. That study finds that consumers become more price sensitive as the time to departure nears, which is consistent with having lower valuations, and that the number of active consumers increases closer to departure.
However, the HSR time-varying demand estimation problem is different from airline demand forecasting. Airline ticket fares change dynamically during the pre-sale period, and the fare changes are associated with demand and with the advance purchase timing of passengers. The main purpose of airline demand forecasting is to obtain the demand corresponding to different fares during the pre-sale period in a segmented market. In contrast, for the estimation problem of time-varying demand in the HSR system, the ticket fare is fixed throughout the pre-sale period and the effect of fare changes on demand does not need to be considered. Passengers usually purchase their tickets as early as possible once they determine their itinerary, in order to obtain tickets as close as possible to their expected departure time. Under the circumstance that HSR has enough capacity over the service time window to serve all passengers for each OD pair, we want to estimate the time-varying demand distribution over the operation period of each OD pair.
Dynamic traffic OD estimation problems are mainly solved with observed information, including link volumes, traffic counts and various forms of exogenous information, in the form of either a priori knowledge or structural assumptions. A common approach is to use an autoregressive process to describe the evolution of demand [20,21]. Along this line, instead of an autoregressive process, Zhou and Mahmassani [22] developed a polynomial trend filter to capture possible structural deviations in real-time demand. To improve the unknowns/equations ratio, Marzano and Papola [23] and Cascetta et al. [24] proposed a "quasi-dynamic" estimation framework. Djukic et al. [25] used principal component analysis to reduce the dimensionality of the estimation problem. In addition to within-day dynamic demand estimation, day-to-day dynamics have received much attention as well. For instance, Zhou and Mahmassani [22] explicitly modelled a day-to-day evolution process using a Kalman filter. Hazelton [26] used statistical estimation theory to estimate day-to-day OD matrices. Shao et al. [27] estimated the mean and covariance of peak hour OD demands from day-to-day traffic counts. However, the HSR time-varying demand estimation problem is different from the above problems. Firstly, HSR trains operate according to a timetable, so the impact of the timetable should be taken into consideration. Secondly, HSR trains operate during time-of-day periods; therefore, it is only necessary to analyze the within-day time-varying demand.
For transit networks, Wang et al. [28] and Chan et al. [29] used the boarding counts at every station from the Automatic Fare Collection system to formulate the estimation problems, and other researchers use boarding and alighting data from Automatic Passenger Count systems, together with some assumptions and principles, to estimate transit station-to-station OD matrices [30][31][32][33][34][35][36]. Although a transit network operates according to a timetable, there are still some differences between transit network OD estimation and HSR time-varying demand estimation. In a transit network, passengers do not need to book in advance; they purchase tickets when they arrive at the station, so the arriving or boarding time can be regarded as their expected departure time. However, in the HSR system, specifically in China, passengers must book tickets in advance, and only passengers who hold tickets are allowed to board. All passengers compete for tickets and occupy the train capacity, and the outcome is affected by several factors including the timetable, travel cost and train capacity. The real departure time of passengers therefore cannot be regarded as their expected departure time. In general, we need a proper method to solve the HSR time-varying demand estimation problem.
The highlights of this paper are as follows. We utilize the 'rooftop' model to determine the relationship between itineraries and expected departure intervals, and then reversely assign the ticketing volume of each itinerary to its corresponding expected departure intervals to obtain the time-varying demand. By simulating the ticket-booking process of HSR, a precedence relationship is introduced to constrain the ticket sold-out order of all itineraries for each OD pair. Based on the precedence relationships of itineraries, we propose two typical situations for the ticket sold-out order of all preferable itineraries: for any itinerary, either its tickets are sold out in its first booking phase, or the sale of its tickets lasts from its first booking phase to its last booking phase. For these two typical situations, two algorithms are proposed to estimate the time-varying demand, respectively. Case analyses of the OD pair Beijing-Shanghai are presented and the validity of the two methods is further examined.
The rest of the paper is organized as follows. Section 2 presents the assumptions and states the details of the HSR time-varying demand estimation problem. Section 3 develops the Single Booking Phase Reverse Assignment (SBPRA) algorithm and presents the corresponding case analysis. Section 4 proposes the Multiple Booking Phases Reverse Assignment (MBPRA) algorithm and gives the corresponding case analysis. Section 5 presents the validity analysis. Finally, Section 6 concludes the paper.

Problem Statement and Overview of Proposed Approach
In this section, we first summarize the major assumptions for the estimation problem of time-varying demand. Then, we describe the estimation problem of time-varying demand. After that, the rooftop model and the simulated ticket-booking process are introduced, respectively. Finally, based on the booking phases, the reverse assignment method is introduced.

Assumptions.
The following assumptions are made for the demand estimation problem. Assumption (A4) reflects the current practice of HSR system operations in China. However, different fares can be readily incorporated in our modeling framework.
(A1) The HSR system has enough capacity over the service time window to serve all passengers for each OD pair, and flow shifting between transport modes is not considered.
(A2) All passengers have the same value of time (homogeneous passengers).
(A3) Each passenger chooses the itinerary to minimize his or her travel cost (rational passengers).
(A4) The ticket fare for each itinerary during the whole pre-sale time window is fixed (booking time independent fare).

Problem of Time-Varying Demand Estimation.
Before moving further, note that the major notations are listed in Appendix A. The time-varying demand estimation problem can be described as follows: given each itinerary flow of each OD pair $(r, s)$, we need to estimate the time-varying demand $q_{rs}(t)$, $t \in [t^0_{rs}, t^1_{rs}]$, where $t$ is the expected departure time for passengers, and $[t^0_{rs}, t^1_{rs}]$ is the operation period of OD pair $(r, s)$.
For OD pair $(r, s)$, let $k^i_{rs}$ denote an itinerary, i.e., a travel scheme adopted by passengers, including the trains and transfer stations from station $r$ to station $s$. Denote $K_{rs}$ as the itinerary set of OD pair $(r, s)$. For any itinerary $k^i_{rs} \in K_{rs}$, its cost is defined as $c^i_{rs}$, which includes the in-train time costs, transfer time costs and ticket fees. The flow of $k^i_{rs}$ is expressed as $f^i_{rs}$, which can be obtained from the RTS. With the large capacity and high-frequency trains of the HSR network, the total demand volume can be obtained by summing all itinerary flows of OD pair $(r, s)$. Therefore, the problem of time-varying demand estimation is how to reversely assign all itinerary flows to $[t^0_{rs}, t^1_{rs}]$ to obtain the time-varying distribution.

Complexity
Next, we describe the simulation of the ticket-booking process, and then reversely assign each itinerary flow to its corresponding expected departure time interval to estimate the time-varying demand.

Rooftop Model and Ticket-Booking Process of HSR Passengers.
Because HSR passengers must book tickets in advance, which distinguishes HSR from the conventional traffic assignment models of public transit, the ticket-booking process needs to be analyzed to model the passenger assignment.
In the HSR system, passengers with each expected departure time book tickets of their preferable itineraries from the set of available itineraries. As the ticket-booking process goes on, the tickets of some itineraries sell out. Then, some passengers have to book tickets of their preferable itineraries from the set of remaining available itineraries. This process is repeated until all passengers have booked their tickets.
From the above ticket-booking process, it follows that the set of available itineraries is updated at the time division points when the tickets of any preferable itinerary sell out. These time division points partition the pre-sale period into several pre-sale time intervals. In each pre-sale time interval, which can also be regarded as a booking phase, passengers choose their preferable itineraries from the current set of available itineraries. Thus, the continuous ticket-booking process can be partitioned into several discrete booking phases. In each booking phase, passengers' choice behavior can be described by a rooftop model, which is introduced below.
The rooftop model [37,38] can be described based on Assumptions (A2) and (A3). If there is no itinerary departing at the expected departure time $t$, the passenger adjusts his/her departure time to that of another itinerary. Let $t^i_{rs}$ denote the departure time of itinerary $k^i_{rs} \in K_{rs}$ at station $r$. Define $\alpha$ as the unit time fee for passengers who adjust their expected departure time; the travel cost of a passenger choosing itinerary $k^i_{rs}$ is then $\alpha |t - t^i_{rs}| + c^i_{rs}$. Therefore, based on Assumptions (A2) and (A3), the preferable itinerary chosen by passengers with the expected departure time $t$ can be expressed as:
$$k_{rs}(t) = \arg\min_{k^i_{rs} \in \bar{K}_{rs}} \left\{ \alpha \left| t - t^i_{rs} \right| + c^i_{rs} \right\}, \quad t \in [t^0_{rs}, t^1_{rs}]. \tag{1}$$
Here, $\bar{K}_{rs}$ is the current set of available itineraries of OD pair $(r, s)$.
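The choice rule of the rooftop model can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the `Itinerary` class, the function names, the three sample itineraries and their costs are all hypothetical, and times are expressed in minutes after midnight.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Itinerary:
    name: str
    departure: float  # departure time, minutes after midnight (assumed unit)
    cost: float       # itinerary cost in RMB (in-train time, transfer, fare)


def travel_cost(t, itinerary, alpha):
    """Cost for a passenger with expected departure time t who takes this itinerary."""
    return alpha * abs(t - itinerary.departure) + itinerary.cost


def preferable_itinerary(t, available, alpha):
    """Rooftop choice: the available itinerary with minimum travel cost at time t."""
    return min(available, key=lambda k: travel_cost(t, k, alpha))


# Hypothetical data: k2 departs exactly at the expected time but is expensive.
available = [
    Itinerary("k1", 480.0, 500.0),
    Itinerary("k2", 540.0, 560.0),
    Itinerary("k3", 660.0, 520.0),
]
alpha = 0.65  # unit time fee, RMB per minute

choice = preferable_itinerary(540.0, available, alpha)
print(choice.name)
```

With these sample values, a passenger who wants to depart at the departure time of `k2` still prefers `k1`, because shifting by 60 minutes costs less than the fare difference.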
A simple example of preferable itineraries calculated by the rooftop model is shown in Figure 1. For a given OD pair $(r, s)$, there are 6 itineraries $k^1_{rs}, k^2_{rs}, \cdots, k^6_{rs}$ with departure times $t^1_{rs}, t^2_{rs}, \cdots, t^6_{rs}$ and costs $c^1_{rs}, c^2_{rs}, \cdots, c^6_{rs}$, respectively; the costs are shown by the heights of the black vertical solid lines in Figure 1. For passengers who want to depart during $[t^0_{rs}, t^1_{rs}]$, the extra cost of adjusting their expected departure times to each itinerary is illustrated by the red dotted lines in Figure 1, whose slopes are $-\alpha$ and $\alpha$. Based on Assumptions (A2) and (A3), passengers with different expected departure times would only book the tickets with their current minimum travel cost, i.e., tickets of itineraries in the set of preferable itineraries calculated by Eq. (1). For instance, a passenger who wants to depart at $t^2_{rs}$ would book tickets of $k^1_{rs}$ rather than $k^2_{rs}$, because the current travel cost of $k^1_{rs}$ is lower than that of $k^2_{rs}$ for him/her.
In addition, for OD pair $(r, s)$, define $K^{*}_{rs} \subset \bar{K}_{rs}$ as the preferable itinerary set, which includes all itineraries calculated by Eq. (1) for any $t \in [t^0_{rs}, t^1_{rs}]$. We sort the preferable itineraries in $K^{*}_{rs}$ by departure time, re-index them, and still express the set as $K^{*}_{rs}$, i.e., $K^{*}_{rs} = \{k^1_{rs}, k^2_{rs}, \cdots, k^n_{rs}\}$. These preferable itineraries divide the total expected departure time period $[t^0_{rs}, t^1_{rs}]$ into $n$ expected departure intervals:
$$T(k^i_{rs}) = [\tau_{i-1}, \tau_i), \quad i = 1, 2, \cdots, n. \tag{2}$$
For Eq. (2), $b_i$ $(1 \le i \le n-1)$ is the division point between the expected departure intervals $T(k^i_{rs})$ and $T(k^{i+1}_{rs})$; its abscissa value is $\tau_i$, and its ordinate value $y_i$ satisfies the following equation:
$$y_i = \alpha (\tau_i - t^i_{rs}) + c^i_{rs} = \alpha (t^{i+1}_{rs} - \tau_i) + c^{i+1}_{rs}. \tag{3}$$
Then, $\tau_i$ can be calculated by the following equation:
$$\tau_i = \frac{t^i_{rs} + t^{i+1}_{rs}}{2} + \frac{c^{i+1}_{rs} - c^i_{rs}}{2\alpha}. \tag{4}$$
Besides, let $\tau_0 = t^0_{rs}$ and $\tau_n = t^1_{rs}$.
The ticket-booking process can also be partitioned into several booking phases. In the example of Figure 1, at the beginning of the pre-sale period, denoted as booking phase I, passengers can book tickets in the preferable itinerary set $\{k^1_{rs}, k^4_{rs}, k^6_{rs}\}$ calculated by Eq. (1). According to Eqs. (2) and (4), $[t^0_{rs}, t^1_{rs}]$ is divided into 3 expected departure intervals, one for each preferable itinerary, and passengers whose expected departure time falls in the interval of a preferable itinerary would book tickets of that itinerary. As the booking process proceeds, when the tickets of any preferable itinerary have sold out, such as those of $k^1_{rs}$, this itinerary becomes unavailable to passengers. The set of preferable itineraries is then updated to $\{k^2_{rs}, k^4_{rs}, k^6_{rs}\}$ by Eq. (1), and the booking process moves on to booking phase II. In booking phase II, $[t^0_{rs}, t^1_{rs}]$ is partitioned by the new preferable itinerary set. The ticket-booking process continues in the same way until all passengers have booked their tickets.
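The division points between neighbouring expected departure intervals follow from equating the travel costs of two consecutive preferable itineraries, as in Eq. (4). The sketch below assumes the input list already contains only preferable itineraries sorted by departure time; the function names and the sample numbers are hypothetical.

```python
def division_points(itineraries, alpha, t0, t1):
    """Return [t0, tau_1, ..., tau_{n-1}, t1] for preferable itineraries.

    `itineraries` is a list of (departure_time, cost) tuples sorted by
    departure time; every entry is assumed to be preferable somewhere
    in [t0, t1], so consecutive rooftops intersect between them.
    """
    points = [t0]
    for (t_a, c_a), (t_b, c_b) in zip(itineraries, itineraries[1:]):
        # Solve alpha*(tau - t_a) + c_a == alpha*(t_b - tau) + c_b for tau.
        points.append((t_a + t_b) / 2 + (c_b - c_a) / (2 * alpha))
    points.append(t1)
    return points


def departure_intervals(itineraries, alpha, t0, t1):
    """Pair each itinerary with its expected departure interval (start, end)."""
    pts = division_points(itineraries, alpha, t0, t1)
    return list(zip(pts[:-1], pts[1:]))


# Hypothetical example: three preferable itineraries, times in minutes.
sample = [(480.0, 500.0), (600.0, 520.0), (720.0, 510.0)]
points = division_points(sample, alpha=0.5, t0=360.0, t1=1200.0)
print(points)
```

At each interior division point the two neighbouring itineraries have the same travel cost; for example, at 560 both the first and the second sample itinerary cost 540 RMB.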

Reverse Assignment Based on the Booking Phases.
As analyzed in Section 2.3, the ticket-booking process can be partitioned into several booking phases, and in each booking phase, passengers book tickets from the set of preferable itineraries. Hence, following the booking phases, we adopt a reverse assignment method to estimate the time-varying demand of the HSR network.
The ticket booking volume of each preferable itinerary is reversely assigned to its corresponding expected departure interval in each booking phase, and the time-varying demand distribution of each booking phase can be calculated.The total time-varying demand can be obtained by summing all the time-varying demand distributions of all booking phases.
In this paper, without the data about the time division points when the tickets of each preferable itinerary had sold out, we only have the data of each itinerary flow for each OD pair to estimate the time-varying demand.Therefore, the key point of this problem is how to partition ticket-booking process into discrete booking phases, i.e., how to get the ticket sold-out order of all preferable itineraries, and how to determine the ticketing volume of each preferable itinerary in each booking phase.
The sold-out order of the preferable itineraries' tickets can vary, and the ticketing volume of each preferable itinerary may differ between booking phases. However, the ticket sold-out order of some itineraries can be determined from their costs under Assumption (A3).
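For itinerary $k^i_{rs} \in K_{rs}$, if itinerary $k^h_{rs} \in K_{rs}$, $h \neq i$, satisfies the following Eq. (5) for any $t \in [t^0_{rs}, t^1_{rs}]$, there is a precedence relationship between them, denoted as $k^i_{rs} \prec k^h_{rs}$:
$$\alpha \left| t - t^i_{rs} \right| + c^i_{rs} < \alpha \left| t - t^h_{rs} \right| + c^h_{rs}. \tag{5}$$
From the precedence relationship constraint, the tickets of $k^i_{rs}$ sell out earlier than those of $k^h_{rs}$. With the formulation of the travel cost of each itinerary, Eq. (5) is equivalent to the following:
$$c^h_{rs} - c^i_{rs} > \alpha \left| t^h_{rs} - t^i_{rs} \right|. \tag{6}$$
Thus, Eq. (6) can be used to easily check the precedence relationship between any two itineraries.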
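The pairwise precedence check can be sketched as follows. The closed-form test below assumes that one itinerary precedes another when it is strictly cheaper for every expected departure time, which reduces to comparing the cost gap with the adjustment cost between the two departure times; the strictness of the inequality and all names and numbers are assumptions made for illustration. A brute-force check over a minute grid is included only to show that the two formulations agree on the sample data.

```python
def precedes(i, h, alpha):
    """Closed-form check: itinerary i precedes h (tuples of (departure, cost))."""
    (t_i, c_i), (t_h, c_h) = i, h
    return c_h - c_i > alpha * abs(t_h - t_i)


def cheaper_everywhere(i, h, alpha, t0, t1):
    """Brute-force check on a one-minute grid over the operation period."""
    (t_i, c_i), (t_h, c_h) = i, h
    return all(
        alpha * abs(t - t_i) + c_i < alpha * abs(t - t_h) + c_h
        for t in range(t0, t1 + 1)
    )


# Hypothetical itineraries: (departure in minutes, cost in RMB).
k1, k2, k6 = (480, 500), (540, 560), (1000, 510)
alpha = 0.65

print(precedes(k1, k2, alpha))  # cost gap 60 exceeds adjustment cost 39
print(precedes(k1, k6, alpha))  # cost gap 10 is far below adjustment cost 338
```

The closed form needs only the two departure times and costs, so checking all pairs of an OD pair's itineraries stays cheap even for a full-day timetable.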
A sequence of $n$ itineraries in which the $j$-th itinerary precedes the $(j+1)$-th itinerary for every $j$ $(1 \le j < n)$ is called a precedence relationship chain. Besides, if an itinerary has no precedence relationship with any other itinerary, this single itinerary can also be regarded as a precedence relationship chain. Hence, there are many precedence relationship chains in $K_{rs}$. The ticket sold-out order of all itineraries is constrained by the precedence relationship. The number of itineraries in a precedence relationship chain is regarded as its length. Denote the longest precedence relationship chain in $K_{rs}$ as $C_{rs}$, and denote the itinerary set and the number of itineraries of $C_{rs}$ as $K^{C}_{rs}$ and $M$, respectively. To further estimate the time-varying demand with the precedence relationship, we propose the following assumptions.
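One way to find a longest precedence relationship chain is dynamic programming over itineraries sorted by cost. This is a sketch under assumptions, not the paper's Algorithm 1 or 2: it assumes the strict pairwise condition "cost gap greater than the adjustment cost between departure times", under which a predecessor always has a strictly lower cost, and the itinerary names and values are hypothetical.

```python
def longest_chain(itineraries, alpha):
    """Return one longest precedence chain as a list of itinerary names.

    `itineraries` maps a name to (departure_time, cost). Because a
    predecessor must be strictly cheaper, processing names in order of
    increasing cost guarantees predecessors are handled first.
    """
    order = sorted(itineraries, key=lambda name: itineraries[name][1])
    best = {}  # name -> longest chain found so far that ends at this name
    for idx, h in enumerate(order):
        t_h, c_h = itineraries[h]
        chain = [h]
        for i in order[:idx]:
            t_i, c_i = itineraries[i]
            if c_h - c_i > alpha * abs(t_h - t_i) and len(best[i]) + 1 > len(chain):
                chain = best[i] + [h]
        best[h] = chain
    return max(best.values(), key=len, default=[])


# Hypothetical six-itinerary example (departure in minutes, cost in RMB).
sample = {
    "k1": (480, 500), "k2": (540, 560), "k3": (600, 620),
    "k4": (700, 505), "k5": (760, 570), "k6": (1000, 510),
}
chain = longest_chain(sample, alpha=0.65)
print(chain, len(chain))
```

The length of the returned chain plays the role of the number of booking phases; in this sample it is 3, while `k6` has no precedence relationship with any other itinerary and forms a chain of length 1.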
(A5) For each expected departure time, the ticket-booking process is continuous and lasts the entire pre-sale time period, and for all expected departure times, the ticket-booking processes are synchronized during the pre-sale time period.
(A6) The booking of tickets of itineraries in the longest precedence relationship chain $C_{rs}$ lasts the entire pre-sale time period. The ticket-booking process is partitioned into only $M$ booking phases by the ticket sold-out time points of the $M$ itineraries in $C_{rs}$.
For Assumptions (A5) and (A6), it should be noted that the tickets of each itinerary that is not in $K^{C}_{rs}$ also sell out at one of the above $M$ ticket sold-out time points. For instance, in Figure 1, $k^1_{rs} \prec k^2_{rs} \prec k^3_{rs}$ is $C_{rs}$. Based on Assumptions (A3), (A5) and (A6), the ticket-booking process is partitioned into only 3 booking phases by the ticket sold-out time points of $k^1_{rs}$, $k^2_{rs}$ and $k^3_{rs}$. For the precedence relationship chain $k^6_{rs}$, the ticket sold-out time point of $k^6_{rs}$ would be at the end of booking phase I, II or III. Hence, the tickets of the $m$-th itinerary $(m = 1, 2, \cdots, M)$ in $C_{rs}$ would be sold only in booking phase $m$. For other itineraries not in $K^{C}_{rs}$, the sale of their tickets may last for more than one booking phase. For instance, in Figure 1, the tickets of itinerary $k^6_{rs}$ may be sold in more than one booking phase, and its ticket sold-out time point may be the end of any booking phase.
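For any $k^i_{rs} \in K_{rs}$, denote the first and the last booking phase in which its tickets can be sold as $\underline{m}^i_{rs}$ and $\overline{m}^i_{rs}$, respectively. In the following, $\underline{m}^i_{rs}$ and $\overline{m}^i_{rs}$ are referred to as the first and the last booking phase of $k^i_{rs}$. Algorithm 1 calculates the first booking phase scheme $\underline{P}_{rs}(1), \underline{P}_{rs}(2), \cdots, \underline{P}_{rs}(M)$, which is used to calculate the value of $\overline{m}^i_{rs}$ for $k^i_{rs} \in K_{rs}$ in Algorithm 2. In Algorithm 2, $\overline{P}_{rs}(m) = \{k^i_{rs} \mid k^i_{rs} \in K_{rs}, \overline{m}^i_{rs} = m\}$ can be calculated for $m = 1, 2, \cdots, M$. We denote the last booking phase scheme of $K_{rs}$ as $\overline{P}_{rs}(1), \overline{P}_{rs}(2), \cdots, \overline{P}_{rs}(M)$.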
The example of the calculations of the first and the last booking phase for each itinerary in Figure 1 can be described as follows.According to the first and the last booking phase partition algorithm, we can obtain the following first and the last booking phase scheme.
The ticket sold-out order of the itineraries in each precedence relationship chain is constrained by the precedence relationship. For instance, in Figure 1, for the precedence relationship chain $k^4_{rs} \prec k^5_{rs}$, since $\underline{m}^4_{rs} = \overline{m}^4_{rs} = 1$, the tickets of $k^4_{rs}$ would only be sold in booking phase I. Since $\underline{m}^5_{rs} = 2$ and $\overline{m}^5_{rs} = 3$, the tickets of $k^5_{rs}$ can be sold in booking phase II only or in booking phases II and III. In conclusion, because the ticket sold-out time point of a preferable itinerary may be at the end of different booking phases, we propose 2 typical situations for the ticket sold-out order of all preferable itineraries, and estimate the time-varying demand for each of them.
Typical Situation 1. For any itinerary $k^i_{rs} \in K_{rs}$, its tickets would be sold out in its first booking phase $\underline{m}^i_{rs}$.
Typical Situation 2. For any itinerary $k^i_{rs} \in K_{rs}$, the sale of its tickets would last from the first booking phase $\underline{m}^i_{rs}$ to the last booking phase $\overline{m}^i_{rs}$.
For each typical situation, the HSR passenger time-varying demand estimation can be described as follows. Firstly, partition the ticket-booking process into several booking phases and determine the set of preferable itineraries in each booking phase. Secondly, in each booking phase, divide the total expected departure time period $[t^0_{rs}, t^1_{rs}]$ into expected departure intervals based on the set of preferable itineraries. Thirdly, reversely assign the ticket booking volume of each preferable itinerary to its corresponding expected departure interval to obtain the time-varying demand distribution in each booking phase. Finally, sum the time-varying demand distributions of all booking phases to obtain the time-varying demand.
In the following, two time-varying demand estimation algorithms are designed based on Typical Situations 1 and 2, respectively.

Single Booking Phase Reverse Assignment Algorithm
where $|T_m(k^i_{rs})| = \tau_{m,i} - \tau_{m,i-1}$. For the example in Figure 1, the single booking phase reverse assignment can be described as follows. Based on Assumptions (A3), (A5), (A6) and Typical Situation 1, the booking phase scheme is expressed as follows.

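In booking phase I, using Eq. (8), evenly assign the flows $f^1_{rs}$, $f^4_{rs}$ and $f^6_{rs}$ to their corresponding expected departure intervals $T_1(k^1_{rs})$, $T_1(k^4_{rs})$ and $T_1(k^6_{rs})$ to obtain the time-varying distribution of booking phase I. In booking phase II, evenly assign the flows $f^2_{rs}$ and $f^5_{rs}$ to their corresponding expected departure intervals $T_2(k^2_{rs})$ and $T_2(k^5_{rs})$ to obtain the time-varying distribution of booking phase II. Then, in booking phase III, evenly assign the flow $f^3_{rs}$ of preferable itinerary $k^3_{rs}$ to its corresponding expected departure interval $T_3(k^3_{rs})$ to obtain the time-varying distribution of booking phase III. Finally, sum the time-varying distributions of all booking phases to obtain the time-varying demand.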
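The even assignment and the summation over booking phases can be sketched as a piecewise-constant demand rate. This is an illustrative sketch assuming that each flow is spread uniformly over its expected departure interval; the data layout, function names and numbers are hypothetical, and times are in minutes after midnight.

```python
def phase_density(t, assignments):
    """Demand rate (passengers per minute) at time t within one booking phase.

    `assignments` is a list of (start, end, flow) tuples: the expected
    departure interval [start, end) of a preferable itinerary and the
    ticket booking volume assigned to it in this phase.
    """
    for start, end, flow in assignments:
        if start <= t < end:
            return flow / (end - start)  # flow spread evenly over the interval
    return 0.0


def total_density(t, phases):
    """Time-varying demand rate at t: sum of the rates of all booking phases."""
    return sum(phase_density(t, assignments) for assignments in phases)


# Hypothetical two-phase example over the period [360, 1200).
phases = [
    [(360, 560, 400), (560, 650, 180), (650, 1200, 550)],  # booking phase I
    [(360, 1200, 84)],                                     # booking phase II
]
rate_at_10am = total_density(600, phases)
recovered_volume = sum(total_density(t, phases) for t in range(360, 1200))
print(rate_at_10am, recovered_volume)
```

Summing the rate over every minute of the operation period recovers the total ticketing volume, which is a quick consistency check on any reverse assignment.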

Case Analysis.
We apply the data (shown in Appendix E) of OD pair Beijing-Shanghai on December 1st, 2015 from the RTS to the SBPRA algorithm. There are 34 itineraries for OD pair Beijing-Shanghai, and the departure time, cost and flow of each itinerary are given in Table 1. The total effective operation period of this OD pair is $[t^0_{rs}, t^1_{rs}]$ = [6:00, 20:00]. The average monthly residential incomes of Beijing and Shanghai in 2015 were 7086 RMB and 6504 RMB, respectively [39,40]. Based on 22 working days in a month and 8 working hours in a day, the average incomes can be expressed as 0.67 RMB per minute and 0.62 RMB per minute, respectively. We use the average residential income to express the unit time fee of adjusting the expected departure time, i.e., $\alpha$ = (0.67 + 0.62)/2 = 0.65 RMB per minute.
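The conversion from monthly income to a per-minute value can be reproduced with a short calculation. The function name and its default arguments are illustrative choices; the income figures and the 22-day, 8-hour convention are taken from the text above.

```python
def income_per_minute(monthly_income_rmb, working_days=22, hours_per_day=8):
    """Convert a monthly income to RMB per working minute."""
    return monthly_income_rmb / (working_days * hours_per_day * 60)


beijing = round(income_per_minute(7086), 2)   # rounds to 0.67
shanghai = round(income_per_minute(6504), 2)  # rounds to 0.62

# Mean of the two rounded values, which the text reports as 0.65 RMB per minute.
unit_time_fee = (beijing + shanghai) / 2
print(beijing, shanghai, unit_time_fee)
```

Note that the mean is taken over the already rounded per-minute values, following the order of operations in the text.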
We calculate the passenger time-varying demand $q_{rs}(t)$, $t \in$ [6:00, 20:00], for OD pair Beijing-Shanghai with the SBPRA algorithm. Firstly, use Algorithm 1 to calculate the first booking phase scheme $\underline{P}_{rs}(1), \underline{P}_{rs}(2), \cdots, \underline{P}_{rs}(M)$. Secondly, set $P_{rs}(m) \leftarrow \underline{P}_{rs}(m)$, $m = 1, 2, \cdots, M$, to obtain the booking phase scheme $P_{rs}(1), P_{rs}(2), \cdots, P_{rs}(M)$. Thirdly, for $m = 1, 2, \cdots, M$, we calculate the expected departure interval $T_m(k^i_{rs})$ for all $k^i_{rs} \in P_{rs}(m)$, shown in Figure 3, by Eqs. (2) and (4); then the ticket booking volume of each preferable itinerary in booking phase $m = 1, 2, \cdots, M$ is reversely assigned to its corresponding expected departure interval, shown in Figure 4; the distribution of the reverse assignment in booking phase $m = 1, 2, \cdots, M$ is illustrated in Figure 5; finally, the accumulated time-varying demand distribution is shown in Figure 6.
[Figure legend: the cost of each preferable itinerary and the travel cost for passengers in booking phases I, II and III.]
Figure 4 illustrates the expected departure interval and travel cost of each preferable itinerary in each booking phase. From booking phase I to III, the number of preferable itineraries decreases rapidly, the expected departure interval of each preferable itinerary becomes wider, and the height of the vertical solid lines, which represents the cost of each preferable itinerary, rises.
Figure 5 shows the change in the ticketing volume of each preferable itinerary. From booking phase I to III, the height of the vertical solid lines, which represents the ticketing volume of each preferable itinerary, declines drastically. Based on this information, the features of the ticket-booking process simulated by the SBPRA algorithm can be described as follows: (1) For OD pair $(r, s)$, passengers choose their preferable itineraries and book the corresponding tickets with the minimum travel cost, so the tickets of the preferable itineraries sell out in order from low cost to high cost.
(2) For OD pair $(r, s)$, the tickets of every itinerary sell out in its first booking phase. As a result, most passengers book tickets in the early booking phases, their travel costs are relatively low and the ranges over which they adjust their expected departure times are relatively narrow. In contrast, a small percentage of passengers who book tickets in the late booking phases have to choose itineraries with higher cost and need to adjust their expected departure times over a wider range.
We reversely assign the ticket booking volume of each preferable itinerary in Figure 5 to its corresponding expected departure interval in Figure 4 to obtain the time-varying demand distribution of each booking phase in Figure 6. Summing the distributions in Figure 6 over booking phases I to III gives the accumulated time-varying demand distribution in Figure 7. The red solid line in Figure 7 is the time-varying demand of the OD pair Beijing-Shanghai calculated by the SBPRA algorithm, and Table 5 shows the numerical results in detail.
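The reverse assignment and the summation over booking phases can be sketched as follows. This is a minimal illustration rather than the SBPRA algorithm of Appendix D: it assumes a rooftop-style generalized cost (itinerary cost plus the unit time fee times the gap between the expected departure time and the itinerary departure time), works on a one-minute grid, and uses hypothetical function names and made-up itinerary data.

```python
# Sketch only: the cost form, the one-minute grid and all data are assumptions.

def expected_minutes(itineraries, theta, t0, t1):
    """For each minute in [t0, t1), find the cheapest itinerary.

    itineraries: list of (name, departure_minute, cost) tuples.
    theta: unit time fee per minute of departure-time adjustment.
    Returns {name: [minutes for which this itinerary is preferred]}.
    """
    minutes = {name: [] for name, _, _ in itineraries}
    for t in range(t0, t1):
        # Generalized cost = itinerary cost + theta * |t - departure time|.
        # Ties go to the itinerary listed first.
        best = min(itineraries, key=lambda it: it[2] + theta * abs(t - it[1]))
        minutes[best[0]].append(t)
    return minutes


def reverse_assign(minutes, volumes, t0, t1):
    """Spread each itinerary's booking volume evenly over its minutes."""
    demand = [0.0] * (t1 - t0)
    for name, ts in minutes.items():
        if not ts:
            continue  # never the cheapest option here, so nothing is assigned
        share = volumes[name] / len(ts)
        for t in ts:
            demand[t - t0] += share
    return demand


def accumulate(phase_demands):
    """Sum the per-phase demand distributions minute by minute."""
    return [sum(values) for values in zip(*phase_demands)]


# Hypothetical booking phase: two itineraries with equal cost.
phase = [("A", 60, 100.0), ("B", 120, 100.0)]
minutes = expected_minutes(phase, theta=1.0, t0=0, t1=180)
demand = reverse_assign(minutes, {"A": 91.0, "B": 89.0}, t0=0, t1=180)
total = accumulate([demand, demand])  # two identical phases, for illustration
```

In this toy case the two itineraries split the period at the midpoint between their departure times, and each booking is spread uniformly over the minutes in which its itinerary is the cheapest choice.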
Polynomial fitting is applied to the above time-varying demand distribution, and the result is shown in Figure 8. Travel demand is relatively low before about 7:30 and after 17:30. There are two demand peaks, from 9:00 to 11:00 and from 14:00 to 16:00. The drop in travel demand around 12:00 is probably due to the approaching lunch time.
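As an illustration of the fitting step, the following sketch fits a polynomial by ordinary least squares using the normal equations. The polynomial degree and the fitting procedure used for Figure 8 are not stated here, so the degree, the function name `polyfit_least_squares` and the sample data are all assumptions; in practice a numerical library would normally be used instead.

```python
# Sketch only: degree, data and function name are illustrative assumptions.

def polyfit_least_squares(xs, ys, degree):
    """Return least-squares polynomial coefficients, lowest degree first."""
    n = degree + 1
    # Normal equations A c = b with A[i][j] = sum(x^(i+j)), b[i] = sum(y * x^i).
    A = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, n):
            factor = A[r][col] / A[col][col]
            for c in range(col, n):
                A[r][c] -= factor * A[col][c]
            b[r] -= factor * b[col]
    # Back substitution.
    coeffs = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(A[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (b[i] - tail) / A[i][i]
    return coeffs


# Made-up points lying exactly on 1 + 2x + 3x^2.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0 + 2.0 * x + 3.0 * x * x for x in xs]
coeffs = polyfit_least_squares(xs, ys, degree=2)
```

The normal equations become ill-conditioned for high degrees or large time values, so rescaling the time axis (for example to hours from the start of the operation period) is advisable before fitting.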
For the sensitivity analysis of the unit time fee parameter in the SBPRA algorithm, we also calculate the number of booking phases and the time-varying demand distribution for different parameter values, shown in Table 6 and Figure 9 respectively. Table 6 shows that with a larger parameter value, passengers are more concerned about the cost of adjusting the expected departure time, which reduces the number of booking phases. Figure 9 shows that the fluctuation of the time-varying demand distribution increases with the parameter.

Multiple Booking Phases Reverse Assignment Algorithm
Since the SBPRA algorithm above is an estimation method based on Typical Situation 1, each itinerary flow is reversely assigned to the corresponding expected departure intervals in its first booking phase. This section presents the estimation method based on Typical Situation 2.
Table 3: The precedence relationship.
Table 4: The first booking phase scheme.

Case Analysis.
We apply the data (shown in Appendix E) of the Beijing-Shanghai HSR on December 1st, 2015 from the RTS to the MBPRA algorithm, with the same parameter settings as in Section 3.2. The first and last booking phase schemes, calculated by Algorithms 1 and 2 respectively, are shown in Tables 4 and 7. Based on the first and last booking phase schemes, the booking phase scheme is obtained by Eq. (10) and shown in Table 8. The continuous ticket-booking process is partitioned into 3 booking phases.
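The step from the first and last booking phase schemes to the booking phase scheme can be sketched as follows. The sketch does not reproduce Eq. (10); it relies only on the description of Typical Situation 2, under which an itinerary is preferable in every booking phase from its first to its last. The function name and the train labels are hypothetical.

```python
# Sketch only: itinerary labels and phase numbers below are made up.

def booking_phase_scheme(first_phase, last_phase):
    """Return {phase: sorted itineraries that are preferable in that phase}.

    first_phase / last_phase: {itinerary: phase number}, phases start at 1.
    """
    n_phases = max(last_phase.values())
    scheme = {k: [] for k in range(1, n_phases + 1)}
    for itinerary, first in first_phase.items():
        # The itinerary stays on sale from its first to its last phase.
        for k in range(first, last_phase[itinerary] + 1):
            scheme[k].append(itinerary)
    return {k: sorted(names) for k, names in scheme.items()}


first = {"G1": 1, "G3": 1, "G5": 2}
last = {"G1": 1, "G3": 3, "G5": 3}
scheme = booking_phase_scheme(first, last)
```

Setting the last phase of every itinerary equal to its first phase recovers the single-phase case used by the SBPRA algorithm.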
Figure 11 illustrates the expected departure interval of each preferable itinerary in each booking phase. From booking phase I to III, the cost of each preferable itinerary rises, and no other obvious trend is observed. Figure 12 shows the change in the ticketing volume of each preferable itinerary. From booking phase I to III, the ticketing volume of each preferable itinerary declines gradually, more slowly than under the SBPRA algorithm.
For a given OD pair, passengers choose their preferable itineraries with the minimum travel cost and book the corresponding tickets, and each itinerary's tickets remain on sale from its first to its last booking phase. Under these conditions, most passengers book tickets in the early booking phases at a relatively low travel cost, and as the booking process goes on, few passengers purchase tickets in the late booking phases at a relatively high travel cost. The decline in ticketing volume and the increase in travel cost simulated by the MBPRA algorithm are gentler than those simulated by the SBPRA algorithm. Comparing the two solutions, passengers are less sensitive to changes in travel cost under the MBPRA algorithm than under the SBPRA algorithm.
Figure 13 shows the time-varying demand distribution in each booking phase, and the accumulated time-varying demand distribution of the MBPRA is illustrated in Figure 14.Table 9 shows the numerical results of time-varying demand by the MBPRA algorithm.
Polynomial fitting is applied to the above time-varying demand distribution to obtain the demand distribution curve of Beijing-Shanghai, shown in Figure 15. The fluctuation trends of the time-varying demand distributions from the MBPRA algorithm and the SBPRA algorithm are similar, which means that the solution space between the two algorithms is relatively narrow.
For the sensitivity analysis of the unit time fee parameter in the MBPRA algorithm, the time-varying demand distributions with different parameter values are shown in Figure 16. The SBPRA and MBPRA algorithms have the same number of booking phases. Figure 16 shows that the trend of the time-varying demand distribution has the same characteristics as in Figure 9.
We apply the data (shown in Appendix G) of the OD pair Beijing-Tianjin on December 1st, 2015 to analyze the validity of the SBPRA algorithm and the MBPRA algorithm. There are 129 itineraries for this OD pair on that day, and its effective operation period is [6:00, 23:00]. The average monthly income of residents in Beijing and Tianjin in 2015 was 7086 RMB and 4944 RMB respectively [40, 41], which can be expressed as 0.67 RMB per minute and 0.47 RMB per minute respectively. We set the unit time fee to (0.67 + 0.47)/2 = 0.57 RMB per minute. The comparison between the hourly transport volumes from the RTS and the results of the SBPRA and MBPRA algorithms is shown in Table 10.
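The conversion from monthly income to a per-minute value is not spelled out. One assumption that reproduces the reported figures is a working month of 22 days of 8 hours each; the sketch below checks the arithmetic under that assumption, which is ours and not stated in the text.

```python
# Assumption (not from the source): 22 working days x 8 hours per month.
WORKING_MINUTES_PER_MONTH = 22 * 8 * 60  # 10560 minutes


def income_per_minute(monthly_income_rmb):
    """Monthly income in RMB converted to RMB per working minute."""
    return round(monthly_income_rmb / WORKING_MINUTES_PER_MONTH, 2)


beijing = income_per_minute(7086)   # Beijing, 2015
tianjin = income_per_minute(4944)   # Tianjin, 2015
# Unit time fee: the mean of the two cities' per-minute incomes.
unit_time_fee = round((beijing + tianjin) / 2, 2)
```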
Table 10 shows that the error rates of the SBPRA algorithm and the MBPRA algorithm are 8.64% and 6.37% respectively, which are relatively low and support the validity of the two algorithms. Moreover, the MBPRA algorithm has a lower error rate than the SBPRA algorithm, which implies that the ticket-booking process of this OD pair on December 1st, 2015 is closer to Typical Situation 2.
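The definition of the error rate is not given in this section. One plausible reading, used here purely as an assumption, is the total absolute deviation between estimated and observed hourly volumes divided by the total observed volume. The hourly volumes below are made up for illustration and are not the values of Table 10.

```python
# Sketch only: the error-rate definition and the data are assumptions.

def error_rate(observed, estimated):
    """Total absolute deviation relative to the total observed volume."""
    if len(observed) != len(estimated):
        raise ValueError("both series must cover the same hours")
    total = sum(observed)
    if total <= 0:
        raise ValueError("observed volumes must sum to a positive value")
    return sum(abs(e - o) for o, e in zip(observed, estimated)) / total


observed = [100, 200, 300, 400]    # hypothetical hourly volumes from ticket data
estimated = [110, 190, 320, 380]   # hypothetical algorithm output
rate = error_rate(observed, estimated)
```

Other definitions, such as the mean of the hourly relative errors, would weight low-volume hours more heavily and could give a different ranking of the two algorithms.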

Conclusion and Further Studies
This paper focuses on the problem of HSR time-varying demand estimation. By simulating the ticket-booking process, we reversely assign the ticketing volume of each preferable itinerary to its corresponding expected departure interval in each booking phase, and then sum the demand distributions of all booking phases to obtain the time-varying demand. Because the tickets of the preferable itineraries can sell out in many different orders and only itinerary flow data are available, the precedence relationship is introduced to constrain the ticket sold-out order of all itineraries for each OD pair. Based on the precedence relationships of itineraries, two typical situations are proposed, and the SBPRA algorithm and the MBPRA algorithm are designed. The case analysis shows that the results of the two algorithms reflect the time-varying characteristics of HSR passenger demand and that the fluctuations of the two distributions are similar, but the SBPRA results are more sensitive to itinerary cost differences. Numerical analysis shows that the error rates of the SBPRA algorithm and the MBPRA algorithm are 8.64% and 6.37% respectively, an estimation accuracy that supports the validity of the two algorithms.
The current research, as a first step toward estimating time-varying demand in HSR, can be extended along several avenues:
(1) This paper only considers the travel cost of passengers with the same unit time fee, but the unit time value may vary among passengers. Further studies can classify passengers into several categories with different socio-economic characteristics (e.g., income level). In addition, different seat classes could be considered.
(2) This paper uses a simulation method to estimate time-varying demand by partitioning the continuous ticket-booking process into discrete booking phases according to two typical situations. If more detailed information is accessible, such as the ticket-booking time of each passenger, the time-varying demand can be estimated from the actual sold-out order of the preferable itineraries.
(3) We will study the estimation of day-to-day dynamic demand for the HSR system in further research.

B. The First Booking Phase Partition Algorithm
See Algorithm 1.

C. The Last Booking Phase Partition Algorithm
See Algorithm 2.

D. Single Booking Phase Reverse Assignment Algorithm
See Algorithm 3.

E. The Ticket Booking Data of Each Itinerary of OD Pair Beijing-Shanghai on December 1st 2015
See Table 11.

F. Multiple Booking Phases Reverse Assignment Algorithm
See Algorithm 4.

G. The Ticket Booking Data of Each Itinerary of OD Pair Beijing-Tianjin on December 1st 2015
See Table 12.

Figure 3: The flow diagram of the SBPRA algorithm.

Figure 4: The cost and the expected departure interval of each preferable itinerary in each booking phase from the SBPRA algorithm.
Figure 5: The ticketing volume of each itinerary in each booking phase from the SBPRA algorithm. Panels: the ticketing volume of each preferable itinerary in booking phases I, II, and III.

Figure 7: The accumulated time-varying demand distribution of the SBPRA algorithm.

Figure 9: The time-varying demand distribution from the SBPRA algorithm with different values of the unit time fee parameter.

Figure 11: The cost and the expected departure interval of each preferable itinerary in each booking phase from the MBPRA algorithm.

Figure 12: The ticketing volume of each itinerary in each booking phase from the MBPRA algorithm.

Figure 16: The time-varying demand distributions from the MBPRA algorithm with different values of the unit time fee parameter.
1  ], then the tickets of  ℎ  wouldn't be sold until the tickets of    had sold out.
Input The effective operation period [ 0 ,  1 ] and the itinerary set   of OD pair (, ); the cost    , and the departure time    of itinerary    ∈   ; the unit time fee  for adjusting expected departure time for passengers Output ; m  , for    ∈ Input The effective operation period [ 0 ,  1 ] and the itinerary set   of OD pair (, ); the cost    , and the departure time    of itinerary    ∈   ; the unit time fee  for adjusting expected departure time for passengers; the first booking phase scheme P

10 Complexity Table 1 :
The data of each itinerary of OD pair Beijing-Shanghai.The   and m  of    are shown in column  = 2 of

Table 2: The precedence relationship.

Table 5: The time-varying demand from the SBPRA algorithm.

Table 6: The unit time fee parameter and its corresponding number of partitioned booking phases.
. As a result, the set of preferable itineraries   () in the booking phase  can be calculated as follows.
Multiple Booking Phases Reverse Assignment Algorithm. According to Typical Situation 2, each itinerary is preferable for passengers in every booking phase from its first booking phase to its last booking phase. The flow of any itinerary is evenly assigned over all of its corresponding expected departure intervals in these booking phases, i.e., the ticket booking volume of a preferable itinerary in a given booking phase is a proportion of the itinerary's flow. The proportion equals the ratio of the time range of its expected departure interval in that booking phase to the total time range of its expected departure intervals over all of these booking phases, where the time range of an expected departure interval is the difference between its two time division points.
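The proportional allocation described above can be sketched in a few lines. The function name `allocate_flow` and the numbers are illustrative assumptions; the interval lengths would come from the expected departure intervals computed for each booking phase.

```python
# Sketch only: names and numbers are illustrative, not taken from the paper.

def allocate_flow(flow, interval_lengths):
    """Split an itinerary's flow across booking phases.

    interval_lengths: {phase: length in minutes of the itinerary's expected
    departure interval in that phase}. Each phase receives a share of the
    flow proportional to its interval length.
    """
    total = sum(interval_lengths.values())
    if total <= 0:
        raise ValueError("the intervals must have positive total length")
    return {k: flow * length / total for k, length in interval_lengths.items()}


# An itinerary with 100 bookings, preferable in phases 1 to 3.
shares = allocate_flow(100.0, {1: 30.0, 2: 10.0, 3: 10.0})
```

Because the share is proportional to the interval length, the resulting demand density (bookings per minute) is the same in every phase in which the itinerary is preferable, which is what assigning the flow evenly means here.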

Table 7: The last booking phase scheme.

Table 8: The booking phase scheme.
The ticketing volume of each preferable itinerary in booking phase I The ticketing volume of each preferable itinerary in booking phase II The ticketing volume of each preferable itinerary in booking phase III Index of OD pair, (, ) ∈   Index of booking phase,  = 1, 2, ⋅ ⋅ ⋅ ,   Index of preferable itinerary,  = 1, 2, ⋅ ⋅ ⋅ ,  , ℎ,  Index of itinerary,    ,  ℎ  ,    ∈   Parameters [ 0  ,  1  ] Effective operation period of OD pair (, )    Itinerary of OD pair (, ),    ∈      Preferable itinerary of OD pair (, ),  Unit time fee for passengers who adjust expected departure time   Time division point between ( The first booking phase of

Table 11: The ticket booking data of each itinerary of OD pair Beijing-Shanghai on December 1st, 2015.
The itinerary number of the longest precedence relationship chain
Time-varying demand of the OD pair over its effective operation period