
Part of the waiting time that passengers spend on boarding platforms in rail stations can be reinvested to optimize their boarding position and save travel time over the rest of their trip. This paper provides a stochastic model in which a user’s journey is decomposed into successive phases: walking in the access station, platform positioning, waiting for boarding, train riding, and walking in the egress station. Walking speed and target position are modeled as individual factors, and in-station distances as random variables. The service timetable is exogenous. Egress times and exit instants are therefore random variables, characterized by cumulative distribution functions and probability density functions (PDFs) in closed form, for both single and distributed walking speeds. Specific statistical distributions are shown to ease computation. The resulting PDF formulae serve as likelihood functions of the model parameters. Maximum likelihood estimation is proposed and applied to a case study of a commuter rail line in Paris: journeys between the stations Vincennes and La Défense along line A of the Regional Express Railways. Based on data from Automated Fare Collection and Automatic Vehicle Location systems pertaining to an individual user, satisfactory results were obtained.

Passenger waiting time is one of the most crucial factors in transit design and planning: it relates to passenger satisfaction and serves to measure the quality of service of public transport (PT) [

Despite the attention paid so far to the reuse of individual waiting time in PT stations, a related issue seems to remain unexplored: the reuse of waiting time for passenger repositioning along the boarding platform. This is of interest for railway-based transit submodes only, since their platform lengths extend from some dozen meters up to some hundred meters, e.g., up to 200 m in Paris. Assuming that individual walking speeds range from 0.8 to 1.7 m/s, the repositioning time from one platform end to the other may reach 0.5 to 3.5 min, a valuable gain for passengers in their daily schedules. Waiting time can thus significantly influence how far an individual walks along the boarding platform. Such an extent makes the “longitudinal” positioning of individual passengers along their trains a matter of significance for their journeys, since it involves walking times along the platform at both access and egress stations, together with waiting time and comfort on the boarding platform as well as in-vehicle, in relation to the distribution of passengers along the train. This pushes frequent passengers (e.g., commuters) to optimize their longitudinal position along the boarding platform with respect to their egress station [

Railway operators, on their side, consider the longitudinal positioning of candidate riders in relation to train dwelling times: higher passenger densities at some spots along the platform will not only require more boarding time but also slow down the egress of alighting passengers. Dynamic information via travel assistants or specific signage such as with variable color panels has recently been implemented on an experimental basis [

The objective of this paper is to estimate passenger longitudinal repositioning distance on boarding platform during train waiting and underlying distributions of individual walking speed and distances. Building upon our previous stochastic models [

Concerning the proposed passenger flow stochastic model, a new branch of knowledge has emerged over the last decade: modern data measurement and data-driven statistical analysis in the PT field. Modern data, such as Automated Fare Collection (AFC) data and Automatic Vehicle Location (AVL) data, have become available either for PT service evaluation and improvement or for passenger flow analysis. Passenger flow studies have given rise to the stochastic modeling and statistical estimation of fine individual passenger travel phenomena, by trip leg or in station, in rail transit systems that otherwise behave as a closed black box. Several relevant studies are reviewed in greater detail below; for other applications of AFC data see [

The stochastic features of passenger journeys were modeled by [

A key element in the transit model is the probability of matching between a given train run serving the boarding station and a given passenger journey. To generalize the previous works in [

Meanwhile, in more recent studies, econometric methods have also been applied to model and estimate the statistical distributions underlying individual passengers’ in-station movements. Inspired by the econometric approach using highway individual vehicle toll data in [

Since studies on passenger longitudinal repositioning distance along the boarding platform during train waiting were scarce, this distance was first modeled in our most recent work [

This paper is organized as follows. Section

Stochastic models on passenger repositioning between a pair of O-D stations (called also a trip leg) are depicted in this section. This study is built upon the time-space diagram of traffic flow theory, the kinematic theory, and basic econometric theory to understand passenger individual in-station movement related to train run choice by trip leg along an urban rail transit line.

We first state the physical model of one individual passenger making a train journey by availing oneself of train runs and waiting time prior to boarding (Section

Notations.

| | |
---|---|
| | |
| | Passenger number and passenger set by station leg |
| | Passengers’ walking speeds on walking links in access and egress stations |
| | Passenger longitudinal repositioning distance |
| | Passenger |
| | Passenger walking-in distance and time from tap-in gate to platform access point in access station |
| | Passenger walking-in distance from platform entrance to platform boarding point in access station |
| | Passenger walking-out distance from platform alighting point to platform exit in egress station |
| | Passenger walking-out distance and time from platform egress point to tap-out gate in egress station |
| | Total walking-in and total walking-out times in access and egress stations |
| | Passenger apparent waiting time |
| | Passenger residual waiting time on waiting (also boarding) point, |
| | Passenger total time cost in access station for boarding train run |
| | |
| | Train run number and train run set along a line |
| | Number and subset of train runs for passenger |
| | Feasible number and subset of train runs for passenger |
| | Train run |
| | Train run speeds between access and egress stations |
| | |
| | Station number and station set along a line |
| | Access station and egress station |
| | |
| | Cumulative Distribution Function (CDF) of variable |
| | Probability Density Function (PDF) of variable |
| | Probability to take train run |
| | Threshold value of repositioning distance |
| | Kernel function of repositioning on platform |
| | Studied period, |
| | Parameter vector and its estimated value |
| | Space of parameter vector |

Consider here a noncyclic urban rail transit line with

Let us consider a transit passenger (user), denoted by index

Time-space diagram: train and passenger trajectories.

In access station, the walking-in distance from tap-in gate O to boarding platform entrance A is denoted as

Let us now consider the longitudinal dimension of platforms. In urban rail transit system, station platforms are long objects of relatively modest width (e.g., several meters), whereas their lengths range from some dozen meters in tramway stations to a couple of hundred meters in stations of metro, urban or suburban train. In the latter case, passengers are expected to walk up to a given position for waiting and boarding, boarding point M with abscissa

Let us denote by

The time available to

Then, on boarding platform, the apparent waiting time of passenger

Assume that the passenger is a rational decision-maker willing to minimize his or her exit time and walks along boarding platform as much as possible, yielding

Thus, the journey time

We model firstly passenger journey with distributed walking distances and constant walking speed; the stochastic model is integrated with respect to walking (positioning) distance

As a general notation, for a Random Variable (RV)

Urban rail transit stations vary from simple stations at grade providing access to one line only, to complex transit hubs connecting several lines and equipped with several platforms on several floors. Most of them are underground. Whatever the case, the distances between station tap gates and platform entrances or exits extend to some dozen meters at least and up to some hundred meters. Since the walking-in distance

A passenger

Train run

The positioning distance on boarding platform depends on the train run that is taken and the walking-in distance

Two cases must be distinguished

either

or

The former case of total positioning can happen only if

In the alternative case of partial positioning

Bringing together the two cases, a CDF

Thus the positioning distance

Of course, the RV does only exist when the matching probability is strictly positive, i.e., iff

The positioning distance

So

The resulting value

On replacing

Thus the total walking-out time

To integrate with respect to

Concerning the tap-out instant

By integrating with respect to

Thus the tap-out instant

From the CDF of the total walking-out time and tap-out instant, either conditioning on

The associated PDFs are

Note that the above stochastic model assumes that a passenger moves at constant speed in the access and egress stations. From a practical point of view, this assumption may be unrealistic and thus too restrictive. The next subsection extends the model to a distributed walking speed.
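The constant-speed journey described in this subsection can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `Journey`, `simulate_journey`, and all numerical values are hypothetical, the passenger is assumed to board the first run that departs after he or she reaches the platform, and the shortfall between the target and the achieved repositioning distance is assumed to be walked on the egress platform.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Journey:
    run_index: int          # index of the boarded train run
    reposition_m: float     # distance actually walked along the platform (m)
    residual_wait_s: float  # waiting time left once repositioning is over (s)
    tap_out_s: float        # exit instant at the egress gate (s)


def simulate_journey(
    tap_in_s: float,
    speed: float,       # constant walking speed (m/s)
    walk_in_m: float,   # tap-in gate to platform entrance (m)
    walk_out_m: float,  # platform exit to tap-out gate (m)
    target_m: float,    # target (threshold) repositioning distance (m)
    runs: List[Tuple[float, float]],  # (departure at access, arrival at egress), sorted
) -> Optional[Journey]:
    platform_arrival = tap_in_s + walk_in_m / speed
    for k, (dep, arr) in enumerate(runs):
        if dep < platform_arrival:
            continue  # this run leaves before the passenger reaches the platform
        available = dep - platform_arrival
        # Total positioning if time allows, partial positioning otherwise.
        x = min(target_m, speed * available)
        residual = available - x / speed
        # Assumption: the shortfall (target_m - x) is walked on the egress platform.
        walk_out_time = (walk_out_m + (target_m - x)) / speed
        return Journey(k, x, residual, arr + walk_out_time)
    return None  # no run left in the studied period


# Hypothetical timetable and passenger attributes (seconds, meters, m/s).
runs = [(60.0, 1260.0), (180.0, 1380.0), (300.0, 1500.0)]
total = simulate_journey(0.0, 1.25, 100.0, 75.0, 100.0, runs)
partial = simulate_journey(0.0, 1.25, 100.0, 75.0, 200.0, runs)
print(total)
print(partial)
```

In the first call the available time is long enough to reach the target (total positioning, with some residual waiting time); in the second the target is too far and the passenger walks until the train departs (partial positioning, no residual waiting time).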

When passenger walking speed

The walking speed

In addition to this intraindividual diversity, there is an even larger diversity between individual passengers, since people differ in their walking abilities. Young adults can walk faster than elderly people and are likely to be more hurried. People with luggage, or with a young child either walking or in a stroller, walk more slowly than the average adult. As a rough indication, the statistical distribution of walking speeds for a typical population of transit users in the urban setting was found in the Appendix to be close to a normal distribution with a mean of about 0.90 m/s and a standard deviation of about 0.20 m/s, or to a uniform distribution ranging from 0.58 to 1.24 m/s; in both cases waiting time is integrated into walking time.
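As a quick consistency check on the two candidate speed distributions quoted above, the following sketch compares their first two moments. It uses only the standard formulas for a uniform law and the normal CDF; the function names are ours, not the paper's.

```python
import math


def uniform_moments(a: float, b: float):
    """Mean and standard deviation of a uniform distribution on [a, b]."""
    return (a + b) / 2.0, (b - a) / math.sqrt(12.0)


def normal_cdf(x: float, mu: float, sigma: float) -> float:
    """CDF of a normal distribution, written with the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


# Values quoted in the text (m/s).
mean_u, sd_u = uniform_moments(0.58, 1.24)
print(round(mean_u, 2), round(sd_u, 2))  # 0.91 0.19

# Share of the normal(0.90, 0.20) mass that falls inside the uniform range.
share = normal_cdf(1.24, 0.90, 0.20) - normal_cdf(0.58, 0.90, 0.20)
print(share)
```

The uniform range thus implies a mean and a standard deviation that are very close to those of the normal specification, so the two descriptions are broadly compatible.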

Assuming that passengers’ walking speeds on walking links in access and egress stations obey the same statistical distribution, it is easy to extend the stochastic model of passenger repositioning to a diversity of walking speeds, by considering walking speed in a given population

We still denote

To integrate with respect to

Then, conditioning on

Conditioning on

Denoting

The exit instant

Thus, both the total walking-out time and the tap-out instant are endowed with closed-form CDF that involve the tap-in instant

The analytical formulae obtained so far involve the integration of specific functions along one or two scalar dimensions, namely one dimension of space (with respect to distance

In this section, we firstly put forward specific distributions that are suitable to our purpose (Section

The obtained general PDF functions constitute the likelihood functions of all assumed parameters. To prepare for further work on the MLE of those parameters, some hints of PDF computation under ad hoc selection of distributions are provided.

For a given pair of access and egress stations, we take

Individual speed follows a normal (Gaussian) distribution with mean

Walking-in distance

Walking-out distance

Let us establish a property for a normal RV

As

So the final result is

The PDF formulae (

As

The left part is computed firstly and produces the two terms:

So the product of the two terms is ready to compute the left part in

To obtain

The first one (L-a) must be integrated for

The second term (L-b) is dealt with similarly, yet with distinction between two subdomains depending on which function is greater between

If

In the other case where

Concerning the right part, there is

where

As for integration over the distribution of walking speeds, the right part breaks into two bricks to which the lemma applies.

By taking into account the previous distributions of

With respect to tap-out instant: (a) bricks of PDFs and (b) bricks of CDFs.

Previous analytical formulae constitute the cores of our stochastic models. They derive analytical closed-form formulae that provide likelihood functions, which are tractable under a specification of basic statistical distributions. The stochastic models are theoretical constructs that involve human behaviors of trip-making in relation to the dynamic process of train runs. Such a theoretical model can be applied to particular cases, notably so by estimating the values of its parameters so as to make its outcomes replicate observed values well.

As reported by [

In this section, we put forward an approach of MLE (Section

Let us assume here that TITO

The joint observation of a sample

The MLE consists in setting the value of the parameter vector

The estimator of MLE can be applied to our stochastic models for either an individual passenger observed on several journeys or a set of passengers to differentiate ‘intra-’ versus ‘inter-’ individual cases. In the former case [

The estimator of MLE is endowed with powerful statistical properties that are well known in econometrics [

In practice, the estimator searches for the estimate

Modeling and optimization schema based on MLE.
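The estimation principle can be sketched in a few lines of Python. This is a toy illustration only: a plain normal density stands in for the closed-form PDFs of the model, an exhaustive grid search over a bounded parameter box stands in for whatever optimizer was actually used, and the sample of speeds is hypothetical.

```python
import math
from itertools import product


def normal_pdf(x, theta):
    """Toy likelihood kernel: normal density with theta = (mu, sigma)."""
    mu, sigma = theta
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def log_likelihood(theta, sample, pdf):
    """Sum of log-densities over independent observations."""
    return sum(math.log(pdf(x, theta)) for x in sample)


def linspace(lo, hi, n):
    """n evenly spaced values from lo to hi inclusive."""
    step = (hi - lo) / (n - 1)
    return [lo + i * step for i in range(n)]


def grid_mle(sample, pdf, grids):
    """Return the grid point of the admissible space with the largest log-likelihood."""
    best_theta, best_ll = None, -math.inf
    for theta in product(*grids):
        ll = log_likelihood(theta, sample, pdf)
        if ll > best_ll:
            best_theta, best_ll = theta, ll
    return best_theta, best_ll


# Hypothetical observations (m/s); not taken from the case study.
speeds = [1.10, 1.25, 0.95, 1.30, 1.05, 1.20, 1.15, 1.00]
admissible = [linspace(0.5, 2.0, 151), linspace(0.05, 0.60, 56)]  # mu, sigma
theta_hat, ll_hat = grid_mle(speeds, normal_pdf, admissible)
print(theta_hat, ll_hat)
```

Replacing `normal_pdf` with the PDF of the tap-out instant conditional on the tap-in instant would give the same structure for the actual model, at the cost of the integrations discussed earlier.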

The available AFC dataset was exploited by a specific dynamic O-D matrix inference scheme devised in [

Based on parameter estimates, we can model each journey in the observed sample. The outcomes fall into three categories: (i) user’s attributes of walking speed

Matching probabilities associated with a given journey may be analyzed in three steps: (i) to identify the number

Furthermore, based on the parameter estimates, we can make inference about some journey items that are not observed per se. Such ex-post analysis in a given journey of a passenger can be performed along the passenger’s trajectory in the following way. In this model, assume that in each trip

mean of truncated exponential distribution,

repositioning distance,

residual waiting time, the waiting time in excess of repositioning time,

mean of truncated exponential distribution,

Hence, tap-out time

The models are applied to a real case study, RER A, the busiest urban rail transit line in the Paris area, France, on the basis of AFC data provided by IdFM (ex STIF) and AVL data provided by RATP.

After introducing the case, observations of trip-making, and train traffic (Section

There are two main systems of urban rail transit in Paris area [

The line RER A is the busiest urban rail transit line in the Paris area and possibly in Europe, carrying more than one million passengers every workday [

The train time headways on the central trunk range from 2 min at peak to 10 min off peak. Our study focused on the O-D pair between Vincennes and La Défense on the central trunk. Each station has a number of entrances in relation to its importance, from 2 at Vincennes to 6 at La Défense. The topological structures of O-D pair Vincennes and La Défense are detailed in Figure

Sketch of stations’ topological structures: Vincennes and La Défense.

AVL and AFC datasets were made available to us by the line operator RATP and the mobility authority IdFM, respectively, for a period in March 2015 from the 16th to the 29th, excluding the 21st, 22nd, and 23rd. AVL data of RATP includes trains’ arrival and departure times in stations. Out of the AFC data pertaining to the O-D pair Vincennes and La Défense in either direction, we selected one sample per direction, both for a given passenger with maximum number of such journeys during the period. As it turns out, the two busiest cards are identical, with 15 trips in Case 1 from La Défense to Vincennes and 16 trips in Case 2 from Vincennes to La Défense. Figure

Trips made by the busiest passenger during nine days in March 2015.

The admissible spaces of scalar parameters were specified on the basis of field measurement or literature on urban mobility. As for in-station walking distances, a range of

Parameter estimation for a single passenger by MLE.

| | | | | | | |
---|---|---|---|---|---|---|
| | | 1.70 | 0.054 | | 1.19 | 0.21 |
| | | 0.33 | 0.059 | | 0.07 | 0.01 |
| | | 0.017 | 0.015 | | 0.73 | 0.03 |
| | | 126.86 | 14.72 | | 106.29 | 10.14 |
| | | 0.91 | 0.042 | | 0.10 | 0.01 |
| | | 93.49 | 2.78 | | 70.52 | 28.33 |
| | | 101.46 | 8.28 | | 76.14 | 6.35 |
| Log-likelihood function | | -47.48 | | | -49.01 | |

When the log-likelihood function of the model reaches its maximum value, the parameter estimates and their approximate Standard Deviations (SDs) are obtained, as reported in Table

In all, the consideration of an individual passenger enabled us to recover meaningful information about his or her trip-making behavior. The target repositioning distances are significant, at about 101 or 76 m depending on the direction. Combined with the respective estimates of average speed, these distances correspond to a repositioning time of about 60 s in both cases. This value is close to half of a time headway at peak hours. The in-station distances exhibit a mirror effect at Vincennes station (similarity between

Mean walking speed and mean walking distances are calculated based on parameter estimates and illustrated in Table

Indicators’ mean values.

| | | | | | |
---|---|---|---|---|---|
| | 1.70 | 0.33 | | 1.19 | 0.07 |
| | 186.75 | 59.89 | | 107.66 | 1.37 |
| | 94.59 | 1.10 | | 80.52 | 10.00 |
| | 101.46 | 8.28 | | 76.14 | 6.35 |
| NA | 1.83 | NA | NA | 1.51 | NA |
| NA | 0.93 | NA | NA | 1.28 | NA |
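The mean indicators can be checked against the parameter estimates with a short worked computation. The sketch below assumes that each in-station distance follows an exponential law shifted by a minimum distance, so that its mean is the shift plus the inverse of the rate and its standard deviation is the inverse of the rate. This is our reading of the rate and shift rows of the tables, whose labels are lost here, and the pairing of rows with the walking-in and walking-out distances is likewise an assumption.

```python
def shifted_exp_moments(rate: float, shift: float):
    """Mean and standard deviation of shift + Exponential(rate)."""
    return shift + 1.0 / rate, 1.0 / rate


# (rate in 1/m, shift in m), speed in m/s, target distance in m, as printed above.
estimates = {
    "Case 1": {"speed": 1.70, "target": 101.46,
               "distances": [(0.017, 126.86), (0.91, 93.49)]},
    "Case 2": {"speed": 1.19, "target": 76.14,
               "distances": [(0.73, 106.29), (0.10, 70.52)]},
}

summary = {}
for case, est in estimates.items():
    moments = [shifted_exp_moments(rate, shift) for rate, shift in est["distances"]]
    # Time needed to walk the target repositioning distance at the mean speed.
    reposition_time_s = est["target"] / est["speed"]
    summary[case] = {"moments": moments, "reposition_time_s": reposition_time_s}
    print(case, [(round(m, 2), round(s, 2)) for m, s in moments],
          round(reposition_time_s, 1))
```

Under this assumption the computed means and standard deviations agree with the tabulated indicators up to the rounding of the printed rates (the rate 0.017 of Case 1 is too coarsely rounded to reproduce 186.75 exactly), and the repositioning times come out close to the 60 s quoted in the text.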

In the Appendix, estimation results for a former model neglecting the repositioning behavior are recalled, for the same O-D pair but a single day of observation and a population set of passengers. The journey is from Vincennes to La Défense, which corresponds to Case 2 here. There is much agreement between the intraindividual estimates of Case 2 and the interindividual estimates of model M1 (normal-distributed walking speed) in the Appendix, except for the shift parameter of in-station distances on the egress side. In fact,

In a manner similar to Bayesian analysis, the parameter estimates are used to reanalyze train run occurrences and passenger trip attributes. This serves as a check on the plausibility of the previous results.

Based on the sampled data and parameter estimates, the journey elements are inferred following the lines given in Section

Run number and run probabilities per trip.
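The logic behind per-trip run probabilities can be sketched as follows. This is a deliberately simplified illustration, not the model of the paper: it ignores platform repositioning and speed dispersion, assumes shifted-exponential walking times with hypothetical parameters, and assumes that the passenger boards the first run departing after he or she reaches the platform. The function names and the timetable are ours.

```python
import math


def shifted_exp_cdf(t, rate, shift):
    """CDF of shift + Exponential(rate)."""
    return 1.0 - math.exp(-rate * (t - shift)) if t > shift else 0.0


def shifted_exp_pdf(t, rate, shift):
    """PDF of shift + Exponential(rate)."""
    return rate * math.exp(-rate * (t - shift)) if t >= shift else 0.0


def run_posterior(tap_in, tap_out, runs, walk_in, walk_out):
    """Posterior probability of each feasible run given tap-in and tap-out instants.

    runs: sorted list of (departure at access station, arrival at egress station).
    walk_in, walk_out: hypothetical (rate, shift) of walking-in and walking-out times.
    """
    weights = {}
    prev_dep = -math.inf
    for k, (dep, arr) in enumerate(runs):
        # Feasible run: departs after tap-in and arrives before tap-out.
        if dep > tap_in and arr < tap_out:
            # Prior: platform reached after the previous departure, before this one.
            lo = max(prev_dep - tap_in, 0.0)
            prior = shifted_exp_cdf(dep - tap_in, *walk_in) - shifted_exp_cdf(lo, *walk_in)
            # Likelihood: density of the walking-out time implied by this run.
            like = shifted_exp_pdf(tap_out - arr, *walk_out)
            if prior * like > 0.0:
                weights[k] = prior * like
        prev_dep = dep
    total = sum(weights.values())
    return {k: w / total for k, w in weights.items()} if total > 0.0 else {}


# Hypothetical timetable and tap instants (seconds).
timetable = [(60.0, 1260.0), (180.0, 1380.0), (300.0, 1500.0), (420.0, 1620.0)]
posterior = run_posterior(0.0, 1600.0, timetable, walk_in=(0.05, 70.0), walk_out=(0.05, 50.0))
print(posterior)
```

With these made-up values, the first run is ruled out because it departs before the minimum walking-in time has elapsed, the last one because it arrives after the tap-out instant, and the probability mass is shared between the two remaining runs.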

A disaggregate analysis of passenger individual trip attributes is proposed by using the passenger trip model proposed in Section

Inferred results about: (a) tap-out time; (b) repositioning distance; and (c) residual waiting time.

This section assesses the outreach and limitations of our model and points to directions for further research.

This paper provided a stochastic model of passenger trip-making along a transit journey by urban rail line, with explicit representation of individual positioning along the boarding platform and the optimizing behavior to save on travel time for the rest of the trip.

The behavioral postulate was appropriate for passengers well aware of the trip conditions at their egress station. This fits well commuters—hence the vast majority of transit users at peak periods—and also customers availing themselves of “travel assistant” applications on their smartphones.

The stochastic model was easy to use in the perspective of simulation, as it followed the physical sequence of phases in a journey path (walking in, platform positioning and waiting, train riding, and walking out). It could readily be applied as a submodel in the frame of a traffic assignment model to a transit network.

While the simulation ability was demonstrated in the case study, the paper was primarily oriented to the estimation perspective: analytical formulae were given to characterize the statistical distributions of egress times and exit instants that stem from the set of modeling assumptions. The CDF and PDF formulae conveyed the influences of individual attributes and behavior (speed and target relocation distance), along with those of local conditions, i.e., in-station distances for the walking phases.

In the estimation perspective, we used the PDF formulae as likelihood functions for the model parameters. We put forward particular yet realistic enough specifications for the statistical distributions, so as to make numerical computation more tractable.

An application was carried out to an O-D pair of stations along a busy rail line in Paris. AFC and AVL data were extracted for the trips of an individual user over a two-week period. Valuable information was recovered from statistical estimation and posterior inference, notably a time saving of about 1 min owing to platform repositioning along 70 or 100 m, depending on the journey direction. This indicated that the estimation scheme was able to capture fine phenomena and also that the repositioning phenomenon had a limited importance on the journey travel time of this individual user.

This also demonstrated once again the positioning behavior of train users along boarding platforms, which is of interest to railway operators for passenger traffic management under severe crowding. The consequences for station layout and flow orientation were traced out in related work [

Where AFC and AVL datasets are available, our model can easily be applied to estimate the distance and time components of users’ journeys in a gate-to-gate setting which represents quality of service better than just the service quality of the train runs. The identification of positioning will make the estimation of the remaining time components more realistic and reliable. The individual behavior, as postulated and estimated on the basis of empirical data, can easily be simulated for users’ journeys whatever the availability status of observations is, because the behavioral structure is endowed with replicability.

Despite its finesse, the stochastic model in its current version does not capture congestion phenomena: neither in-vehicle crowding and the restrictions it may entail (i.e., the probability of fail-to-board and the related issue of passengers “left behind” by trains), nor the crowding of platforms and its potential influence on individual positioning, nor the crowding of platform access points and especially egress points, which may entail queuing and delay among exiting passengers.

So the consideration of crowding phenomena makes up a first direction for further research on passenger behavior along a rail journey.

A second direction is to devote more attention to the in-station phases, especially so for vertical pedestrian elements that influence individual speed under congestion as well as free-flow conditions. While these issues are well known in the micro-simulation of pedestrian traffic (cf. the Legion and Viswalk modeling software, among others), their estimation on the basis of AFC and AVL data is an open issue.

A third direction for research is to extend the stochastic model with platform repositioning to more complex trip patterns that involve transfers: this is our next objective.

Lastly, more detailed data on users’ trajectories are available from smartphones, owing to applications that monitor geolocation data from one or several sources: GPS, GSM, or Wi-Fi or Bluetooth beacons. Indeed, location data collected, say, every second and with fine GPS or Galileo accuracy constitute ideal material for the refined analysis of passenger trip-making. Such research remains to be done for large underground transit stations, where satellite or beacon signals are impeded or modified by the local layout (corridors, walls, floors, and ceilings).

The results of previous models without passenger longitudinal walking distance in [

Estimated parameters’ values by MLE in M1 and M2.

| | | | ||
---|---|---|---|---|---|

| | | | ||

M1: | | 0.90 | 0.0023 | 0.58 | 0.07 |

M1: | | 0.20 | 0.0007 | 1.24 | 0.13 |

| | 0.78 | 0.0048 | 0.32 | 0.18 |

| | 83.49 | 0.7314 | 72.32 | 6.81 |

| | 0.13 | 0.0002 | 0.18 | 0.02 |

| | 204.30 | 0.7998 | 264.46 | 26.66 |

Log-likelihood function | -109.85 | -138.18 |

Indicators’ mean values in M1 and M2.

| | | | | |
---|---|---|---|---|
| | 0.90 | 0.04 | 0.91 | 0.19 |
| | 84.77 | 1.29 | 75.41 | 3.09 |
| | 212.22 | 7.93 | 270.32 | 5.66 |

The authors declare that there are no conflicts of interest regarding the publication of this paper.

This research was supported by the Research and Education Chair on “the Socio-Economics and Modeling of Urban Transit,” operated by Ecole des Ponts ParisTech (ENPC) in partnership with the Mobility Authority in the Paris area (IdFM, Île-de-France Mobilités, ex STIF), to whom the authors are grateful. The authors also thank the Autonomous Operator of Parisian Transit (RATP) who provided the train data.