How Do Individual Walk Lengths and Speeds, Together with Alighting Flow, Determine the Platform Egress Times of Train Users?

Egress times of railway passengers from train alighting up to station exit typically amount to some tens of seconds, but with much variability even at the train level. Here, we first model the egress time as the ratio of the walk length to the preferred walk speed, under free-flow conditions. Then, we model the possible occurrence of congestion among the users alighting from a train as a traffic bottleneck affecting those passing at a “queue focal point” during a “queued time interval.” Analytical formulas are provided for the CDF and PDF of egress times, covering the free-flow case and the congested case. Their computation is straightforward for a bivariate Gaussian pair of walk length and walk speed. A maximum-likelihood method is developed, together with a quick estimation procedure. A case study of four contrasted trains serving an urban mass transit station in Paris is reported. One train experienced free-flow alighting conditions, whereas each of the other three had its own bottleneck. The MLE method enabled us to recover all parameters but one, owing to an identifiability issue: the solution was to take the mean walk speed as exogenous.


Introduction
The rail mode is the best suited to the mass transit of passengers in big cities, as it can provide a high level of service to very large numbers of passengers (TCQSM, 2013). The busiest lines can carry up to 100,000 passengers per hour and per direction on their trunk links: this is achieved for instance by the RER A line in the Greater Paris Area (RER for Regional Express Railways), owing to a peak frequency of 30+ trains per hour combined with large train capacities of about 3,000 passengers (using duplex trains about 210 m long).
Train passenger loads give rise to proportional flows of alighting and boarding passengers at the stations along the line. The boarding flows may experience specific congestion, notably when some users are not able to board the first train that serves the station just after their arrival, thus being "left behind" and having to wait for the next train [1]. User exposure to such boarding congestion can be mitigated by selecting one's waiting position along the platform: this position determines the user's longitudinal position on board. Indeed, the train length is meant to supply passenger capacity all along the train, ideally in a homogenous way to limit crowding and make the best use of seats. The resulting spreading of passengers along the train also has some less direct consequences: it influences the length to walk in the alighting station and in turn the platform egress time up to the station exit point. Furthermore, as the alighting passengers all egress from the train at about the same instant, specific congestion is likely to occur and to increase the platform egress times.
Up to now, the alighting traffic has been studied in two ways. First, in the perspective of network planning and passenger route choice, the egress times of individual passengers from the train alighting position to the station access point have been modeled in macroscopic models of traffic simulation, static or dynamic, as an average time exogenously specified for each station platform and train service (Cf. [2]). Second, in the perspective of railway operations, the flow volume and platform clearance times have been studied using microscopic simulation models of pedestrian traffic (e.g., [3]).
This article is focused on the platform egress times of train passengers. We are interested in the egress time as a physical variable involving both a space length to walk and a pedestrian speed. The physical variable is subject to important variations among the train alighting passengers, due to the different alighting positions along the long platform as well as to the distribution of pedestrian speeds [4]. The article deals with the following three research questions. First, what are the influences of the on-board position and the pedestrian walking speed on the passenger egress time? Second, when congestion occurs among the alighting passengers, what are its specific effects on their respective egress times? Third, what information on on-board positions and pedestrian speeds can be gained from the observation of platform egress times?
To answer the questions, we put forward a physical and stochastic model of platform egress times for transit users, which involves a statistical distribution of alighting positions and a statistical distribution of free-flow walking speeds. From these assumptions and the alighting flow volume, we derive the possibility of crowding and analyze its consequences on the individual egress times. Analytical properties are established for the statistical distribution of the egress times of a given train at a given station. By assuming either Gaussian or log-normal distributions for the alighting position and walk speed pair, closed-form formulas are established for the probability density function (PDF) of the egress time.
Turning to the issue of traffic observation, we apply the stochastic model to the estimation of train alighting positions and pedestrian walk speeds on the basis of egress time data collected mostly from smart cards (automated fare collection or AFC system) together with train arrival times collected by an automated vehicle location (AVL) system. The PDF is used to build a likelihood function from the egress times observed for a statistical population at the train level: by maximum-likelihood estimation, the parameters of the ex-ante distributions of alighting positions and walk speeds can be recovered. As an instance of application, we study the case of the Noisy-Champs station on the eastern part of the RER A line in Paris.
The rest of the article is organized into six sections. Section 2 reviews the related academic literature. Section 3 introduces the physical and stochastic model. Section 4 provides some distributional assumptions and derives specific formulas to compute the PDF and CDF of the Gaussian and log-normal models. Section 5 develops the estimation methodology, from a simple scheme to maximum-likelihood estimation. Then, Section 6 addresses the case study: after describing the traffic scene and the datasets, we provide the estimation results for four contrasted trains. Lastly, Section 7 concludes by stating the article's contribution and pointing to further developments.

Related Work
The topics of train length, station platform paths for train users, and passenger egress times have been dealt with for three purposes: train and platform design together with the management of platform pedestrian traffic (§2.1), traffic modeling of pedestrian paths and egress times (§2.2), and the stochastic modeling and statistical analysis of users' transit travel times (§2.3).

Train Length: From Principle to Effects.
According to the Transit Capacity and Quality of Service Manual (TCQSM) [5], the passenger capacity of a railway line involves two factors in a multiplicative relation: line capacity and train capacity [6]. Line capacity is the maximum number of trains that can be operated on a line during a given period: it is typically measured in trains per hour and per track. Train capacity is the maximum number of passengers that can be accommodated with sufficient comfort on board a given train. The longer the train, the higher its passenger capacity: the respective capacities of all cars making up the train add up to the overall train capacity, in the same way as their respective lengths add up to the overall train length (up to that of connecting elements). Railway operators are accustomed to scheduling single or double trains depending on the expected traffic load.
Not only does the train length enable it to carry a proportional number of passengers, but it is also convenient to provide a proportional number of doors to be used as channels for passengers boarding and alighting. The respective boarding and alighting throughput capacities depend on doorway width (TCQSM). They add up along the doors and the cars to the overall train boarding and alighting capacities. The larger these capacities, the shorter the time required for train dwelling; thus, the larger the line capacity in trains and in turn in passengers [7].
To make the best use of the train capacities (on board, at boarding, and at alighting), it is desirable to split the passenger flows evenly along the train, both on board and on the platforms prior to boarding. In fact, each train making a particular run will have a particular passenger load at each station it serves, both in volume and in longitudinal distribution. A high volume together with a density peak in that distribution gives rise to the critical door issue, that is, the door putting the highest requirement on the dwelling time.
This is why Liu et al. [8] modeled the number of waiting passengers at each of the 24 waiting positions of line 4 in the Pinganli metro station, Beijing. Using a multinomial logit discrete choice model, these authors postulated that the utility of a given position stems from the expected on-board density and the length from the station entry point to that position. Hoogendoorn et al. [9] further analyzed the effect of spatial densities on passengers' behaviors on the basis of a macroscopic fundamental diagram for pedestrian traffic. The heterogeneity of passenger loads and crowding conditions across train cars, that is, across longitudinal positions, has motivated the design of traffic management schemes (TMSs) to spread the flow of users waiting along the platform in a shape adapted to that of the passenger distribution along the arriving train. Zhang et al. [10] tested the provision of crowding information to the users of a metro line in Stockholm, Sweden. Their results indicate that users can react to the crowding information and adapt their positioning strategy while waiting. Christoforou et al. [3] designed alternative information strategies and assessed their respective effects using a pedestrian microsimulation model. Related pedestrian TMSs for railway platforms include the choreography of alighting and boarding flows: while most stations use the same platform and train side for both movements, a dual-sided platform enables the operator to clear out the alighting flow more quickly, as its side is not impeded by boarding candidates, and reciprocally to start the boarding phase sooner and make it smoother and quicker since it is not impeded by the alighting flow. For a one-sided platform in a metro transfer station in Santiago, Chile, Muñoz et al. [11] described the design and simulation assessment of a strategy compelling the users transferring there to come out of the first part of the train, by implementing a one-way gate across the platform width.
There again the primary objective was to minimize the train dwelling time. The platform clearing time was also considered as a complementary performance indicator.

On Passenger Paths and Walk Times in Traffic Models.
While the platform clearing time can be seen as a maximum time for passenger alighting and exiting the platform, it is of interest mainly to the line operator, as are the train critical door and the train dwell time. On an individual basis, the train users are more interested in their own egress times. These times have long been modeled on an average basis in traffic assignment models for transit networks. Such traffic assignment models are meant to simulate individual users along their transit network paths between their origin and destination points [2]. Such a path is composed as a sequence of nodes and links along the network, and its travel time is decomposed accordingly. In the first and second generations of transit traffic assignment models (from [12] to [13]), "in-vehicle links" typically go from one station to the next one along the transit line and there is one "in-vehicle node" per station and line direction to depict the dwelling operation; passenger paths involve such transit links for their in-vehicle rides, together with walk links for boarding, alighting, transfers, and station access and egress. By modeling the alighting path as one link, only the average egress time has been modeled. Walk lengths and speeds could be modeled as underlying random variables yielding distributed egress times in some kind of stochastic traffic assignment model, but to our knowledge, there has not been any such modeling attempt, neither for alighting nor for boarding.
In contrast, the influence of congestion on the wait-for-boarding times has attracted several research contributions in the field of transit assignment modeling. In their macroscopic dynamic assignment models, Poon et al. [14] and Hamdouch and Lawphongpanich [15] addressed capacitated boarding as a traffic bottleneck under FIFO queuing discipline, yielding some delay when the boarding flow volume is in excess of the train residual capacity. The latter authors also suggested modeling the alighting flow in relation to some platform exit capacity, yet without providing an associated mathematical formulation. In both contributions, the platform lengths have not been considered explicitly. Longitudinal distribution remained implicit in the macroscopic theory of dynamic transit traffic assignment up to Hänseler et al. [16], who introduced longitudinal detail of train platforms and other "pedestrian elements" in a macroscopic framework that is consistent from the station level to the line level and up to the network level. As their model deals with longitudinal positions of transit users in trains and on platforms in an endogenous way, the on-board positions as well as the alighting walk paths and the associated egress times are both distributed and endogenous. The model can also encompass different kinds of traffic congestion: as an instance, the authors considered macroscopic fundamental diagrams of pedestrian traffic to relate local walk speeds along the platform to the local pedestrian densities.
Such fine representation of platform issues in the frame of network traffic assignment bridges much of the gap between the previous generation of macroscopic models and the stream of dynamic microsimulation of transit traffic. At the platform level, Zhang et al. [17] devised a cellular automata microsimulation model of the alighting and boarding processes of passengers, revealing the potential mutual influences between passengers, such as the desire to board and pressure from behind. Haghani and Sarvi [18] used an error-component mixed logit model to analyze the differences in passengers' route choice between an emergency case and a base case. Ji et al. [19] studied the pedestrian choice between stairway and escalator in a transfer station by using a logit model, taking into account quantitative and nonquantitative factors. Christoforou et al. [3] modeled the Noisy-Champs railway station in eastern Paris using a crowd dynamics model so as to simulate the effects of passenger orientation strategies on the wait-for-boarding positions. These microsimulation models deal with a specific traffic issue of pedestrian movement on platforms and capture the different influencing factors; they represent the access and exit points, each waiting position, and its corresponding door through which passengers can board or alight from the train. At the network level, the microscopic model "BusMezzo" of Cats [20] is meant to simulate the bus and train events and on-board crowding for all lines in a network, but it considers neither the length of train vehicles nor the issue of multiple doors. Specialized microsimulation traffic models including VISSIM, Legion, and MassMotion have been developed on a commercial basis and allow for microscopic detail in both time and space, over a whole network.

Stochastic Models and Statistical Analysis.
Thus, some recently developed models of transit traffic simulation consider spatial detail explicitly along platforms and trains. Assuming a fine spatial description in both macro- and microsimulation, a salient feature still differentiating macro- and micromodels pertains to the congestion model, based on either a macroscopic law or the dynamic simulation of interactions between entities such as passengers, vehicle elements, and platform elements. Stochastic modeling constitutes another bridge between micro- and macromodeling: in a stochastic traffic model, physical traffic variables such as length, speed, and time are modeled as random variables with specific distributions. Stochastic modeling therefore lays the ground for the statistical analysis of traffic data.
Journal of Advanced Transportation
Stochastic modeling of transit paths was pioneered by Sun et al. [21]: following the path topological decomposition in network traffic assignment models, they analyzed the user time along a transit path as a four-tier sequence of (i) access, (ii) wait, (iii) ride, and (iv) egress. By postulating a specific statistical distribution for each tier time depending on its own physical conditions, the authors provided an estimation method for the parameters of all distributions. Their method was applied to a dataset of individual travel times observed between two validation gates (AFC records of tap-in and tap-out pairs), complemented by the related times of train arrival at and departure from the stations of access and egress (AVL data). Further on, Zhu et al. [4] modeled the access and egress times as the ratios of walk lengths to walk speeds, so as to estimate the distribution of pedestrian walk speeds in railway stations. A key element in their model is the passenger-to-train assignment probability. In a parallel work, Leurent and Xie [22] related the walk speed at the individual level on both sides of the ride (access and egress): they succeeded in estimating the length distributions together with the speed distribution owing to specific distributional assumptions of shifted exponential lengths together with uniformly distributed speeds. Gaussian distributed walk speeds were also considered in Xie and Leurent [23].
In this stream of passenger traffic stochastic modeling, boarding congestion was addressed by Zhu et al. [24], who modeled the left-behind phenomenon as the failure to board one or more trains serving the station: the number of missed trains was modeled as a random variable composed at two levels. Leurent and Jasmin [25] provided a physical model of failure to board, postulating FIFO among the awaiting users. Hörcher et al. [26] addressed the influence of on-board crowding conditions on the line choice of individual users under a specific subnetwork configuration: they focused on the passenger egress times first to assign different usage probabilities to the successive trains on each line, in a Bayesian way based on a postulated PDF for egress times, and then to differentiate between the two lines, again on the basis of Bayesian probabilities. This Bayesian approach involves the time of user exit and that of train departure to obtain the egress time conditionally to that train.
Up to now, no consideration has been paid in this stream to egress congestion or to traffic bottlenecks on either the boarding or alighting sides. Leurent and Xie [27] modeled the on-board positions in relation to both the platform entry point, in the access station, and, in the egress station, the platform exit point, together with an individual walk speed maintained on both platforms. This corresponds to free-flow walking conditions unaltered by any kind of congestion on the access and egress sides.
Overall, the stochastic modeling of individual egress times, possibly influenced by congestion in bottleneck form, with distributed walk lengths and speeds, is an original research topic.
The notation table is as follows:
(i) $u$: an individual user
(ii) $w$: free-flow walk speed, with CDF $W$
(iii) $\ell'$: walk length of individual user from train alighting point to platform intermediary point
(iv) $\ell_e$: length from platform intermediary point to station exit point
(v) $\ell$: individual walk length, with CDF $S_w$ conditionally to $w$
(vi) $\tau$: walk egress time along $\ell$
(vii) $\ell^*$: queue focal point
(viii) $[\tau_1^*, \tau_2^*]$: time interval of queuing at $\ell^*$
(ix) $v^*$: queue moving speed
(x) $t^*$: queued time from $\ell^*$ to station exit point
(xi) $k \in \{1, 2, 3\}$: index of user subset $U_k$, respectively, before, during, and after the queuing episode at $\ell^*$, with associated probability $P_k$
(xii) $T_k$: CDF of effective egress time conditionally to $k$, where $T$ denotes the unconditional CDF
(xiii) $A$: number of alighting users, for a train-PIP-SAP triple

Platform Geometry and Walk Lengths.
Each station platform on a given railway line has its own geometry. As a spatial object of area type, it has a long, rectangular shape, mirroring the train lengths and the straightness of the infrastructure track. Its longitudinal dimension, typically in the range from 100 m to 200 m, gives rise to relative longitudinal positions on the platform and in turn on the trains that dwell there. By contrast, the platform is relatively narrow, with a width typically between 5 m and 10 m: it is designed to accommodate the flows of alighting passengers and of incoming passengers that wait for train arrivals and constitute waiting stocks, yet sparingly so as to save urban space. Let us define an intermediary point along the platform, say PIP for platform intermediary point, typically at the dwelling point of the train head (or tail) endpoint. The platform is endowed with its own points for pedestrian access and egress, each one with a specific longitudinal abscissa with respect to the origin point. These points may be called pedestrian flow injectors, or platform funnels. Let us call them "platform egress points" (PEPs) to focus on the alighting flow.
Considering now the station, it has its own points of passenger access from, and egress to, the outer world: let us call them station access points (SAPs). As for SAPs, we typically consider a point equipped with ticket and card validation gates.
Each platform egress point is connected to one or several SAPs by way of a pedestrian path.
To an alighting passenger, the length to walk from train alighting to SAP, say $\ell$, is the sum of the length from PIP to SAP, say $\ell_e$, and the length $\ell'$ on the platform from the alighting position to the PIP. We shall denote this decomposition as
$$\ell = \ell_e + \ell'. \tag{1}$$
As railway station platforms have long, narrow shapes, for any pair of points along the platform, we shall identify the walked length with the longitudinal difference in abscissas between the points (Figure 1). A typical PIP situation is at one endpoint of the platform. For a PIP situated at some intermediary point, the correspondence between $\ell$ and train alighting positions would be 1 : 2 instead of 1 : 1. For ease of discussion, we shall hereafter assume a PIP situation at a platform endpoint, so as to interpret $\ell'$ as the position along the train.

Free-Flow Pedestrian Speeds and Egress times.
Given the PEP and the SAP, the egress length $\ell$ still depends on the alighting position. As a walk length, it gives rise to the egress time of the train user from his train alighting point to the SAP, denoted by $t$. The factor linking $\ell$ to $t$ is the walking pace or its inverse the walk speed denoted by $w$: notionally,
$$t = t_e^0 + \ell / w.$$
This formula is an idealization. The time lag $t_e^0$ accounts for specific delay such as that of taking an escalator. As for the walk time $\tau \equiv \ell/w$, we take $w$ to be about constant along the length: in other words, it is a cruising speed at the individual level. Such walk speeds are distributed among the transit users, according to physical condition, age, luggage, etc. (TCQSM, 2013). Under this interpretation, $w$ is a free-flow speed: the individual user is postulated not to be impeded by other pedestrians.
Let us denote by $W$ the CDF of unimpeded pedestrian speeds for a statistical population of transit users and by $\dot W$ its PDF. Denote similarly by $S$ and $\dot S$ the CDF and PDF of walk lengths, respectively. Further statistical description involves the stochastic dependencies between $\ell$ and $w$ as random variables. Conditionally to walk speed $w$, we shall denote by $S_w$ the CDF of walk lengths $\ell$.
To sum up, under free-flow conditions, the walk egress time of an individual user is modeled as
$$\tau = \ell / w.$$
Let also $h_0$ denote the instant of train arrival to dwell on the platform, taken as homogenous among the alighting flow. For every user $u$, the instant $h_u$ of passing the SAP and the egress time $\tau_u$ are straightforwardly related as follows:
$$h_u = h_0 + \tau_u.$$
In practice, some caution must be exerted when comparing instants $h_u$ and $h_0$. Presumably, instants $h_u$ of user exit are measured at the validation gates, with respect to the station clock say, while instants $h_0$ of train arrivals are measured by the automated vehicle location system, say the train clock. Between the two timing systems, there may be some time lag, especially if the AVL time is measured at a fixed sensor located upstream of the station. We might denote by $h_0' \equiv h_0 - \Delta h_0$ a corrected train arrival instant. Analogously, to focus on walk times, we would tend to decrease every $h_u$ by $t_e^0$. In the rest of the article, we take such corrections as given and we consider that $\tau_u$ corresponds to the walk egress time $\tau \equiv \ell/w$. Under free-flow pedestrian traffic conditions, walk speed $w$ is a free-flow one and $\tau$ has a statistical distribution (CDF $T$ and PDF $\dot T$) that stems from the joint distribution of walk lengths and walk speeds.
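As a minimal sketch of the timing corrections just described, the snippet below derives walk egress times from gate exit instants and a train arrival instant. The function name, the argument names, and all numerical values are hypothetical and chosen only for illustration; times are expressed in seconds on an arbitrary common origin.

```python
def walk_egress_time(h_u, h_0, delta_h0=0.0, t_e0=0.0):
    """Walk egress time tau_u = (h_u - t_e0) - (h_0 - delta_h0), in seconds.

    h_u      : instant of passing the SAP (gate validation), station clock
    h_0      : instant of train arrival as reported by the AVL system
    delta_h0 : assumed lag, so that h_0 - delta_h0 is the corrected arrival
    t_e0     : assumed fixed delay (e.g. an escalator) removed from h_u
    """
    corrected_arrival = h_0 - delta_h0
    return (h_u - t_e0) - corrected_arrival


# Hypothetical records: three tap-out instants for one train arrival.
tap_out_instants = [45.0, 62.0, 80.0]
train_arrival = 10.0
egress_times = [
    walk_egress_time(h, train_arrival, delta_h0=-5.0, t_e0=8.0)
    for h in tap_out_instants
]
print(egress_times)
```

In an application, the two correction terms would have to be calibrated for each station and sensor layout; here they are simply taken as given, as in the text.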
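Let us express the free-flow distribution functions of egress times:
$$T(x) \equiv \Pr\{\tau \le x\} = \Pr\{\ell \le w\,x\} = \int S_w(w\,x)\,\mathrm{d}W(w). \tag{4a}$$
This formula stands as the stochastic version of the physical model (2), $\tau = \ell/w$. The PDF is obtained by partial derivation with respect to $x$, denoting by $\dot S_w$ the PDF of walk lengths conditionally to $w$:
$$\dot T(x) = \int w\,\dot S_w(w\,x)\,\mathrm{d}W(w). \tag{4b}$$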
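To make the free-flow CDF concrete, the following sketch evaluates the integral of $S_w(w\,x)$ against the speed distribution by a midpoint rule and compares it with a direct simulation of the ratio $\ell/w$. It assumes, purely for illustration, independent Gaussian lengths and speeds with hypothetical parameters (so that $S_w = S$); none of the numbers come from the case study.

```python
import random
from statistics import NormalDist

# Hypothetical parameters, for illustration only (not from the case study).
LENGTH = NormalDist(mu=100.0, sigma=30.0)  # walk length l in metres, CDF S
SPEED = NormalDist(mu=1.3, sigma=0.2)      # free-flow speed w in m/s, CDF W


def free_flow_cdf(x, n_steps=2000):
    """Approximate T(x) = integral of S(w * x) dW(w) by the midpoint rule."""
    lo = SPEED.mean - 6.0 * SPEED.stdev    # speeds outside +/- 6 sigma are neglected
    hi = SPEED.mean + 6.0 * SPEED.stdev
    dw = (hi - lo) / n_steps
    total = 0.0
    for i in range(n_steps):
        w = lo + (i + 0.5) * dw
        total += LENGTH.cdf(w * x) * SPEED.pdf(w) * dw
    return total


def simulated_cdf(x, n_users=50_000, seed=0):
    """Share of simulated users whose free-flow egress time l / w is at most x."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_users):
        length = rng.gauss(LENGTH.mean, LENGTH.stdev)
        speed = rng.gauss(SPEED.mean, SPEED.stdev)
        if length <= speed * x:            # same event as l / w <= x when w > 0
            hits += 1
    return hits / n_users


for x in (60.0, 80.0, 120.0):
    print(x, round(free_flow_cdf(x), 4), round(simulated_cdf(x), 4))
```

The two columns should agree up to sampling noise. With correlated lengths and speeds, the conditional CDF $S_w$ would replace `LENGTH.cdf` in the integrand.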

On Pedestrian Traffic Conditions and Queuing Phenomena.
On exiting the train, the alighting passengers become pedestrians willing to get first to a PEP and then to an SAP. On their walk paths, they may be hindered by other pedestrians, be it due to conflicting directions or to different walk speeds and the inability to overtake a slower walker for lack of available width. Three kinds of conflicting directions may arise. First, between alighting passengers and boarding candidates just out of the train: under severe platform crowding, the width available in front of a train door for train passengers to alight may be very scarce, leading to a train exit bottleneck: here we do not address that severe kind of congestion.
Second, between alighting passengers and incoming pedestrians just arriving on the platform to board the train or cross the platform from one point to another: as the alighting phase is concentrated in time, while the incoming flow is relatively homogenous over time, the likelihood of such conflicts is small and we shall neglect their effects. The third kind of conflicting directions would arise among alighting passengers that would cross one another to go to different platform egress points. The potential outcomes will depend on whether the pedestrian density on the platform is low or high. Under low densities, such crossings are easy and the associated elemental delays are negligible. But under high densities, walking is slowed down and it would be very tedious for an individual pedestrian to manage a large number of directional conflicts. Then, the following collective behavior is likely to arise: the alighting flow "naturally" splits with respect to the closest egress points, so that each egress point will have its own "catchment area" along the sequence of train doors. In such a case, a targeted PEP will have a large number of egressing passengers and queuing is likely to occur among them. We shall model neither the light kinds of pedestrian traffic hindrance nor the sharpest kind of overcrowding when train alighting is delayed for lack of space. We only model two traffic regimes: either free-flow conditions at the individual level or a queuing episode upstream of the SAP from some focal point $\ell^*$.
As will be reported in the case study, alighting passengers targeting a given PEP, when their number is high, will constitute a pedestrian queue affecting all of its members. We shall model that kind of queue as a traffic bottleneck.
Here are the main assumptions:
(A1) There is some focal point $\ell^*$ upstream of the SAP, at which the queue begins at instant $h_1^*$ soon after the instant $h_0$ of train arrival at the station. To instant $h_1^*$ corresponds an egress time $\tau_1^*$.
(A2) The queue lasts at $\ell^*$ up to some instant $h_2^*$, to which corresponds an egress time $\tau_2^*$. Users involved in it, after passing at $\ell^*$, will walk from $\ell^*$ to the SAP (located at null length) at queued speed $v^*$, spending time $t^* = \ell^*/v^*$. Their set is denoted as $\tilde U_3$. Their first and last egress times are $\tau_1 \equiv \tau_1^* + t^*$ and $\tau_2 \equiv \tau_2^* + t^*$.
(A3) Users unaffected by queuing are of two kinds, depending on whether they pass the SAP before $\tau_1$ or after $\tau_2$. Their respective sets are denoted $U_1$ and $U_2$, respectively.
(A4) Every un-queued user of the first kind $U_1$ has free-flow walk speed $w$ from their alighting position $\ell$ up to the SAP, hence free-flow egress time $\tau = \ell/w$. The $(\ell, w)$ pair also satisfies $\ell - \ell^* \le w\,\tau_1^*$; that is, either they alight downstream of $\ell^*$ if $\ell < \ell^*$, or upstream of it but pass at $\ell^*$ before the beginning of queuing. Thus,
$$U_1 \equiv \{\ell \le w\,\tau_1\} \cap \{\ell \le \ell^* + w\,\tau_1^*\}. \tag{5a}$$
(A5) Every un-queued user of the second kind $U_2$ has free-flow walk speed $w$ from their alighting position $\ell$ up to the SAP, hence free-flow egress time $\tau = \ell/w$. The $(\ell, w)$ pair also satisfies $\ell - \ell^* > w\,\tau_2^*$; that is, they pass at $\ell^*$ after the queue has vanished from it. Thus,
$$U_2 \equiv \{\ell > w\,\tau_2\} \cap \{\ell > \ell^* + w\,\tau_2^*\}. \tag{5b}$$
Consequently, the set of users affected by the queue amounts to the complementary set:
$$U_3 \equiv \overline{U_1 \cup U_2}. \tag{5c}$$
It holds that $\tilde U_3 \subset U_3$: set $U_3$ also contains users passing at $\ell^*$ before $\tau_1^*$ or after $\tau_2^*$ but who join the queue at some point from $\ell^*$ to the SAP due to their own free-flow speed in relation to $v^*$.

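The Effective Distribution of Platform Egress Times.
For a train and a PEP giving rise to a queuing episode, within the statistical population of train-alighting users, the proportions of the three user groups are, respectively, $P_k \equiv \Pr\{u \in U_k\}$ for $k \in \{1, 2, 3\}$, with $P_3 = 1 - P_1 - P_2$ since $U_3$ is the complementary set. Users in $U_1 \cup U_2$ enjoy free-flow egress times $\tau = \ell/w$. From the definition of their sets and the non-negativity of walk speeds, we have that $\tau \le \tau_1$ in $U_1$ and $\tau > \tau_2$ in $U_2$. Then, users with $\tau \in [\tau_1, \tau_2]$ belong to $U_3$. Among that set, the users are subjected to some queuing effect. We further assume the following: (A6) Most of the queued users are involved in a traffic bottleneck originating at $\ell^*$, and the bottleneck exit flow rate is about constant (at some capacity value).
Then, among those queued users the egress time up to $\ell^*$ is distributed uniformly from $\tau_1^*$ to $\tau_2^*$. In turn, as queued walk time $t^*$ is assumed from $\ell^*$ to the SAP, the egress time from alighting to the SAP will be distributed uniformly from $\tau_1$ to $\tau_2$.
This applies strictly to the queued users in $\tilde U_3$, and we further assume that: (A7) It extends to all users involved in $U_3$. Thus, among $U_3$, the CDF of egress time is
$$T_3(x) = \frac{x - \tau_1}{\Delta\tau} \quad \text{for } x \in [\tau_1, \tau_2], \tag{8a}$$
where $\Delta\tau \equiv \tau_2 - \tau_1$. By differentiation, the associated PDF is
$$\dot T_3(x) = \frac{1}{\Delta\tau} \quad \text{for } x \in [\tau_1, \tau_2]. \tag{8b}$$
For users in $U_1$, the egress times are distributed with the following conditional CDF, for $x \le \tau_1$:
$$T_1(x) = \frac{1}{P_1} \Pr\{\ell \le w\,x,\ \ell \le \ell^* + w\,\tau_1^*\}.$$
As $P_1$ does not depend on $x$, we get the associated PDF by straightforward differentiation:
$$\dot T_1(x) = \frac{1}{P_1} \int \mathbf{1}\{w\,x \le \ell^* + w\,\tau_1^*\}\, w\,\dot S_w(w\,x)\,\mathrm{d}W(w).$$
Let us distinguish two cases depending on whether $x \le \tau_1^*$ or $x > \tau_1^*$. In the former case, as walk speeds are positive and so is $\ell^*$, the condition $w\,x \le \ell^* + w\,\tau_1^*$ holds true for every $w \ge 0$. In the latter case, the condition is
$$w \le \frac{\ell^*}{x - \tau_1^*}. \tag{10b}$$
Analogously, for users in $U_2$, taking the CDF from above, for $x \ge \tau_2$:
$$1 - T_2(x) = \frac{1}{P_2} \Pr\{\ell > w\,x,\ \ell > \ell^* + w\,\tau_2^*\}.$$
As $P_2$ does not depend on $x$, we get the associated PDF by straightforward differentiation:
$$\dot T_2(x) = \frac{1}{P_2} \int \mathbf{1}\{w\,x > \ell^* + w\,\tau_2^*\}\, w\,\dot S_w(w\,x)\,\mathrm{d}W(w). \tag{11b}$$
We are now able to express the overall CDF and PDF, denoted as $T_U(x)$ and $\dot T_U(x)$, respectively, by combining the conditional distributions according to the three cases:
$$T_U(x) = P_1\,T_1(x) + P_2\,T_2(x) + P_3\,T_3(x),$$
and analogously,
$$\dot T_U(x) = P_1\,\dot T_1(x) + P_2\,\dot T_2(x) + P_3\,\dot T_3(x).$$
Substituting, we obtain that […] (12b)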

An Incomplete Congestion Model.
In a preliminary approach, we defined the sets of users egressing before or after the queuing episode on the basis of incomplete conditions, denoted here with a superscript 0, as follows:
$$U_1^0 \equiv \{\ell \le w\,\tau_1\}, \qquad U_2^0 \equiv \{\ell > w\,\tau_2\}.$$
Both definitions make no reference to focal point $\ell^*$. They are consistent with the definition of $\tilde U_3$, which is equivalent to […]. On the other side, $U_2 = \{\ell > w\,\tau_2\} \cap \{\ell > \ell^* + w\,\tau_2^*\}$ imposes a stronger condition on lengths $\ell$ than does $U_2^0$: it may occur that $\ell^* + w\,\tau_2^* > w\,\tau_2$ when $w\,(\tau_2 - \tau_2^*) < \ell^*$, that is, when $w\,t^* < \ell^*$, hence when $w < v^*$. So, users in $U_2^0$ but with $\ell^* + w\,\tau_2^* \ge \ell > w\,\tau_2$ and $w < v^*$ do not belong to $U_2$. In the incomplete congestion model, we would have the following:
$$\dot T_U^0(x) = \dot T(x) \ \text{for } x \notin [\tau_1, \tau_2], \qquad \dot T_U^0(x) = \frac{P_3^0}{\Delta\tau} \ \text{for } x \in [\tau_1, \tau_2],$$
with $P_3^0 \equiv 1 - \Pr\{U_1^0\} - \Pr\{U_2^0\}$. It is easy to compare the incomplete congested PDF to the free-flow one in (4b): the two coincide out of the queued interval, on which the free-flow density $M_x(0)$ is replaced by its queued counterpart $P_3^0/\Delta\tau$. Between the incomplete and full congested models, the respective PDFs have similar queued parts up to the definition of $P_3$ instead of $P_3^0$, whereas the free-flowing parts are subjected to a subdomain restriction in the full model, which considers only those speeds delimited by the threshold $\ell^*/(x - \tau_k^*)$. These differences vanish when $\ell^* = 0$: the incomplete congestion model can be seen as a restricted version of the full congestion model such that $\ell^* = 0$.

Capacity Issues.
Let us denote $A$ as the total number of alighting passengers, for that train and PEP and SAP. Then, there are $A\,P_3$ queued users that exit in time length $\Delta\tau$. Defining the flow rate capacity for that train, $K_A$, it holds that
$$K_A = \frac{A\,P_3}{\Delta\tau}.$$
In the incomplete congestion model where queuing originates at point $\ell^* = 0$, we would expect the flowing regime to change from free-flow to queued at $\tau_1$ in a smooth way, yielding equivalent flow rates on both sides of $\tau_1$, between $A\,\dot T_U(\tau_1)$ from below and $K_A$ from above.
This condition is equivalent to $\dot T(\tau_1) = P_3/\Delta\tau$. Since $P_3 = T(\tau_2) - T(\tau_1)$ when $\ell^* = 0$, it reads
$$\dot T(\tau_1) = \frac{T(\tau_2) - T(\tau_1)}{\tau_2 - \tau_1}. \tag{16}$$
Under the bottleneck postulate, formula (16) therefore constitutes a characteristic condition associated with $\ell^* = 0$. By contrast, a significant flow rate discontinuity at time $\tau_1$ indicates that $\ell^* > 0$. Figure 2 depicts the statistical distribution of egress times either free-flow or including a queuing episode at $\ell^* = 0$. The free-flow distribution corresponds to users' arrivals at the potential bottleneck, whereas the effective one corresponds to their departures from the potential bottleneck.

Distributional Assumptions
Basically, free-flow egress time τ is modeled as the ratio of space length ℓ and cruising walk speed w. At first glance, statistical independence between ℓ and w looks a reasonable assumption. On second thought, however, there may be correlation between them: for instance, hurried train users would both walk faster and position themselves on board so as to alight closer to their platform egress point.
In (4a) and (4b), we expressed the free-flow CDF and PDF of egress times. Let us now put forward specific distributional assumptions of two kinds: either a bivariate Gaussian distribution for (ℓ, w) or a bivariate Gaussian distribution for (ℓ̃, w̃), where ℓ̃ ≡ ln(ℓ) and w̃ ≡ ln(w), called the log-normal model.

Basic Definitions and Free-Flow Properties.
We may assume Gaussian distributions for lengths and speeds, or more precisely that the (ℓ, w) pair is a bivariate Gaussian vector with ℓ ∼ N(m_ℓ, s_ℓ²), w ∼ N(m_w, s_w²) and χ = cov(ℓ, w). Given x, the difference ℓ − w·x is then a Gaussian variable, so that the free-flow CDF in (4a) takes on the specific form as follows: […] (18a), where Φ denotes the CDF of the reduced Gaussian variable. The associated reduced PDF will be denoted as ϕ. The resulting distribution of τ is not Gaussian, because the influence of x on the CDF (both through x and y_x) is a complex one.
By straightforward differentiation, we get the free-flow PDF of the egress times as […] (18b). Both (18a) and (18b) are easy to compute. The complex influence of x is obvious in (18b), both through ϕ and outside it, as a quotient of functions whose denominator involves an exponentiation to power 3/2.
Of course, the Gaussian postulate is somewhat farfetched for walk speeds as it gives support to some negative values: in each model estimation, we will have to check ex-post that the estimated parameters give rise to "almost certain" positive speeds.
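As a minimal sketch of the Gaussian case, assume that ℓ − w·x is Gaussian with mean m_ℓ − x·m_w and variance s_ℓ² − 2xχ + x²s_w², which follows from the bivariate Gaussian postulate. The free-flow CDF is then Φ evaluated at the reduced value of zero, and the PDF follows by differentiation, with a denominator raised to power 3/2. The function names and the parameter values used for checking are illustrative assumptions, not the paper's notation or estimates.

```python
import math

def std_normal_pdf(z):
    """Reduced Gaussian PDF."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def std_normal_cdf(z):
    """Reduced Gaussian CDF, written with the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def ff_cdf(x, m_l, s_l, m_w, s_w, chi=0.0):
    """Pr{l <= w*x} when (l, w) is bivariate Gaussian with covariance chi."""
    var_x = s_l**2 - 2.0 * x * chi + (x * s_w)**2  # variance of l - w*x
    return std_normal_cdf((x * m_w - m_l) / math.sqrt(var_x))

def ff_pdf(x, m_l, s_l, m_w, s_w, chi=0.0):
    """Derivative of ff_cdf with respect to x."""
    var_x = s_l**2 - 2.0 * x * chi + (x * s_w)**2
    y = (x * m_w - m_l) / math.sqrt(var_x)
    numerator = m_w * (s_l**2 - x * chi) + m_l * (x * s_w**2 - chi)
    return std_normal_pdf(y) * numerator / var_x**1.5

def share_of_negative_speeds(m_w, s_w):
    """Ex-post check: probability mass that the Gaussian puts on w < 0."""
    return std_normal_cdf(-m_w / s_w)
```

The last helper supports the ex-post check mentioned above: with a mean speed four standard deviations above zero, the share of negative speeds is of the order of 0.003%.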

Properties for Trains with Alighting Queues.
It is shown in the appendix that the law of ℓ conditionally to w is Gaussian, with mean m_ℓ + (χ/s_w²)·(w − m_w) and variance s_ℓ² − χ²/s_w². These formulas enable us to calculate the PDF of egress times in the congested model.
As for the subset probabilities P_k, we have to calculate P_1 and P_2 numerically: […] The computation needs to be done only once for each set of parameters.
In the incomplete model, the probabilities P̄_1 and P̄_2 are easy to calculate from the free-flow CDF in (18a), evaluated at τ_1 and τ_2. A similar approach yields fairly good approximations for P_1 and P_2.
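The following sketch illustrates the incomplete model under the assumption that the sets of users egressing before and after the queuing episode are {ℓ < w·τ_1} and {ℓ > w·τ_2}, so that their probabilities are the free-flow CDF at τ_1 and its complement at τ_2. It also returns the flow rate capacity implied by A·P_3/Δτ. The function names and the numerical inputs in the usage line are hypothetical.

```python
import math

def ff_cdf(x, m_l, s_l, m_w, s_w, chi=0.0):
    """Free-flow CDF Pr{l <= w*x} for a bivariate Gaussian (l, w)."""
    var_x = s_l**2 - 2.0 * x * chi + (x * s_w)**2
    z = (x * m_w - m_l) / math.sqrt(var_x)
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def incomplete_model_shares(tau1, tau2, n_alighting, m_l, s_l, m_w, s_w, chi=0.0):
    """Return (P1, P2, P3, capacity) with the queue focal point at length 0."""
    p1 = ff_cdf(tau1, m_l, s_l, m_w, s_w, chi)        # egress before the queue
    p2 = 1.0 - ff_cdf(tau2, m_l, s_l, m_w, s_w, chi)  # egress after the queue
    p3 = 1.0 - p1 - p2                                # queued users
    capacity = n_alighting * p3 / (tau2 - tau1)       # users per second
    return p1, p2, p3, capacity

# Hypothetical inputs: 200 alighting users, queued interval [55 s, 90 s].
p1, p2, p3, k = incomplete_model_shares(55.0, 90.0, 200, 100.0, 20.0, 1.2, 0.3)
```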

Log-Normal Model.
When two positive real variables are involved in a product or quotient relationship, it is convenient to model them as bivariate log-normal, because the composed variable will be log-normal, too. We shall mark the variables with tildes to denote their natural logarithms concisely: let then ℓ̃ ≡ ln(ℓ), w̃ ≡ ln(w), and τ̃ ≡ ln(τ). As for parameterization, let ℓ̃ ∼ N(μ_ℓ, σ_ℓ²), w̃ ∼ N(μ_w, σ_w²) and ξ = cov(ℓ̃, w̃).
As ln(1/w) = −w̃, the log-egress time τ̃ satisfies that τ̃ = ℓ̃ − w̃. Thus, the egress time is distributed log-normal with parameters μ_τ = μ_ℓ − μ_w and σ_τ² = σ_ℓ² + σ_w² − 2ξ. Its CDF and PDF are, respectively, T(x) = Φ((ln x − μ_τ)/σ_τ) and Ṫ(x) = ϕ((ln x − μ_τ)/σ_τ)/(x·σ_τ).
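A short sketch of the log-normal model may help here. It composes the parameters of the log-egress time from those of the log length and log speed, then evaluates the standard log-normal CDF and PDF. Function names are illustrative assumptions.

```python
import math

def log_egress_params(mu_l, sigma_l, mu_w, sigma_w, xi=0.0):
    """Mean and standard deviation of ln(tau) = ln(l) - ln(w)."""
    mu_tau = mu_l - mu_w
    var_tau = sigma_l**2 + sigma_w**2 - 2.0 * xi
    return mu_tau, math.sqrt(var_tau)

def lognormal_cdf(x, mu, sigma):
    """CDF of a log-normal variable at x > 0."""
    z = (math.log(x) - mu) / sigma
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def lognormal_pdf(x, mu, sigma):
    """PDF of a log-normal variable at x > 0."""
    z = (math.log(x) - mu) / sigma
    return math.exp(-0.5 * z * z) / (x * sigma * math.sqrt(2.0 * math.pi))
```

The median egress time is exp(μ_τ), so a quick sanity check on any parameter set is that the CDF equals one half at that point.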

Estimation Methodology at the Train Level
We shall first establish some basic properties and provide a simple, empirical estimation method to recover alighting positions from free-flow egress times, taking as exogenous the distribution of free-flow walk speeds. Then, we devise a "train likelihood function" of observed egress times, enabling the maximum-likelihood estimation of model parameters: queue focal point ℓ* and time bounds τ*_1 and τ*_2, queued walk speed v*, and the capacity flow rate K*, as well as the parameters in the joint distribution of the free-flow walk speeds and walk lengths.

Simple Properties for Independent Lengths and Speeds under Free Flow.
Let us establish some properties for free-flow egress times under the simplifying assumption of statistical independence between lengths and speeds.

Signal Analysis.
Let us focus on the influence of length ℓ on egress time τ = ℓ/w: we take this influence as the "signal," as opposed to the influence of walk speed w, which is taken as the "noise." Here, we want to assess the importance of the signal in the phenomenon and measure the ratio between the signal and the noise. To do that, we shall decompose the variance V[τ] with respect to V[ℓ] and V[1/w]. Postulating here the independence of ℓ and w, it holds that V[τ] = V[ℓ]·V[1/w] + V[ℓ]·E[1/w]² + E[ℓ]²·V[1/w]. The latter decomposition, after division by E[τ]² = E[ℓ]²·E[1/w]², gives the following relationship between the squared relative dispersions: c_τ² = c_ℓ² + c_{1/w}² + c_ℓ²·c_{1/w}².
As first-guess assumptions on the lengths and speeds, let us take c_ℓ² = 1/3 and c_{1/w} = 1/4, so that c_τ² = 1/3 + 1/16 + 1/48 = 5/12. The signal share is c_ℓ²/c_τ² = (1/3)/(5/12) = 4/5 = 80%, and the signal-to-noise ratio is c_ℓ²/(c_τ² − c_ℓ²) = 80%/20% = 4. This quick numerical application encourages us to look for the influence of the length signal in the egress time phenomenon, and conversely to utilize observed egress times to infer the associated lengths and the alighting positions behind them.
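The numerical application above can be replayed with exact fractions. The sketch assumes independence of ℓ and 1/w, for which the squared relative dispersion of the product is the sum of the two squared dispersions plus their product. The value 1/16 taken for the squared relative dispersion of 1/w is an assumption chosen because it reproduces the figure of 5/12 quoted in the text.

```python
from fractions import Fraction

def squared_dispersion_of_ratio(c2_length, c2_inverse_speed):
    """Squared relative dispersion of tau = l * (1/w) for independent factors."""
    return c2_length + c2_inverse_speed + c2_length * c2_inverse_speed

c2_length = Fraction(1, 3)          # first-guess value for walk lengths
c2_inverse_speed = Fraction(1, 16)  # assumed: relative dispersion of 25% for 1/w

c2_tau = squared_dispersion_of_ratio(c2_length, c2_inverse_speed)
signal_share = c2_length / c2_tau
signal_to_noise = c2_length / (c2_tau - c2_length)

print(c2_tau, signal_share, signal_to_noise)  # 5/12 4/5 4
```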

A Quick Estimation
Similarly, from the variance formula of the quotient variable, the variance of alighting positions can be recovered as […]
This estimation procedure is particularly straightforward for log-normal variables. In this case, from E[τ] and V[τ], we obtain c_τ² = V[τ]/E[τ]² and the log-normal parameters: the variance of log-egress times is σ_τ² = ln(1 + c_τ²) and the average log-egress time is μ_τ = ln(E[τ]) − (1/2)·σ_τ². If the exogenous distribution of speeds is log-normal, too, then we similarly get σ_w² and μ_w. Next, in the independent case, the log length is distributed Gaussian with mean μ_ℓ = μ_τ + μ_w and variance σ_ℓ² = σ_τ² − σ_w² (the minus sign comes from (19)). It remains to derive first the squared relative dispersion c_ℓ² = exp(σ_ℓ²) − 1, then the average E[ℓ] = exp(μ_ℓ + (1/2)·σ_ℓ²), and lastly the variance V[ℓ] = c_ℓ²·E[ℓ]².
The quick estimation scheme pertains to alighting positions on the basis of prior knowledge of the distribution of walk speeds. Furthermore, it is based on the assumption of independence between lengths and speeds, and it is restricted to free-flow egress times.
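The log-normal version of the quick estimation procedure can be sketched as follows, assuming independent log-normal lengths and speeds and free-flow conditions. Function names are illustrative assumptions, and the procedure raises an error when the speed dispersion alone exceeds that of the egress times, in which case the independence assumption cannot hold.

```python
import math

def lognormal_params(mean, variance):
    """Return (mu, sigma^2) of ln(X) from the mean and variance of X."""
    c2 = variance / mean**2
    sigma2 = math.log(1.0 + c2)
    mu = math.log(mean) - 0.5 * sigma2
    return mu, sigma2

def quick_length_estimate(mean_tau, var_tau, mean_w, var_w):
    """Recover the mean and variance of walk lengths from egress-time moments."""
    mu_tau, s2_tau = lognormal_params(mean_tau, var_tau)
    mu_w, s2_w = lognormal_params(mean_w, var_w)
    mu_l = mu_tau + mu_w
    s2_l = s2_tau - s2_w
    if s2_l <= 0.0:
        raise ValueError("speed dispersion exceeds egress-time dispersion")
    c2_l = math.exp(s2_l) - 1.0
    mean_l = math.exp(mu_l + 0.5 * s2_l)
    var_l = c2_l * mean_l**2
    return mean_l, var_l
```

A round trip gives a simple check: building the egress-time moments from known length and speed moments and feeding them back should return the initial length moments.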

The Likelihood Function of a Train Sample of Egress Times.
Let us consider the sample of all users alighting from a train at a given station; its size A is the number of alighting users. We index the users with u ∈ {1, 2, …, A} in the order of increasing egress times τ_u: thus, our observation dataset is O = {τ_u : u ∈ {1, 2, …, A}}, and it is an exhaustive sample for that train.

Free-Flow Case.
The users that egress under free-flow conditions can be considered independent of the other ones. When all of the alighting users enjoy free flow, their egress times contain no information about any queuing episode. The set of parameters that can be recovered by statistical estimation then pertains to the joint distribution of walk speeds and lengths, say Θ_FF ≡ Θ_(ℓ,w). The PDF of any observed egress time, Ṫ(τ_u | Θ_FF), contains information on Θ_FF and constitutes an elementary likelihood function L_u(Θ_FF | τ_u), with associated log-likelihood function Λ_u(Θ | τ_u) ≡ ln L_u(Θ | τ_u) = ln Ṫ(τ_u | Θ). As free-flowing egress times are statistically independent, the train sample gives rise to a train likelihood function under product form, L(Θ_FF | O) = Π_u L_u(Θ_FF | τ_u), with associated log-likelihood function under additive form, Λ(Θ_FF | O) = Σ_u Λ_u(Θ_FF | τ_u).

Train with Queuing Episode.
When queuing occurs among the users alighting from the train, we expect that the egress times of queued users contain some information on the queuing parameters, Θ_Q. Then, the overall vector of parameters that may be estimated is Θ ≡ (Θ_(ℓ,w), Θ_Q). All users egressing under free-flow pedestrian conditions can still be taken as mutually independent and independent from the other ones, so that their joint likelihood function is under product form.
It remains to state the likelihood function of queued users. Statistical independence between them is not obvious since some FIFO rule applies within the queue. However, we will also take Ṫ(τ_u | Θ) as a likelihood function of Θ and assume that the joint likelihood of queued users is under product form, and further that the queued users are independent of the free-flowing ones. Then, overall, L(Θ | O) = Π_u Ṫ(τ_u | Θ), and the associated log-likelihood function is under additive form: Λ(Θ | O) = Σ_u ln Ṫ(τ_u | Θ). Given Θ_Q, hence τ_1 and τ_2, the set of alighting users, O, can be split into three subsets O_1 = {τ_u < τ_1}, O_2 = {τ_u > τ_2}, and O_3 = {τ_u ∈ [τ_1, τ_2]}, with respective sizes A_k that add up to A. From the formula of Ṫ_U, the log-likelihood of the queued users amounts to A_3·ln(P_3/Δτ), while for those users in O_1 and O_2, we have that, respectively, […]
In fact, the involvement in the queue tends to erase the information on Θ_(ℓ,w) in τ_u, analogously to the absence of information on Θ_Q in the egress times of free-flowing users. If the queuing episode is long and involves a vast majority of alighting users, then we expect the sample to carry information mostly on Θ_Q but little if any on Θ_(ℓ,w). Thus, a wise estimation strategy could be to set up Θ_(ℓ,w) on the basis of prior knowledge and to focus on Θ_Q as the "active" set of parameters for that train.
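The splitting of the train sample and the queued part of the log-likelihood can be sketched as follows. The sketch assumes that every queued user contributes the constant density P_3/Δτ, as in the uniform distribution of egress times over the queued interval; the free-flow contributions of O_1 and O_2 are left out. Function names and the sample values are hypothetical.

```python
import math

def split_sample(egress_times, tau1, tau2):
    """Split observed egress times into (O1, O2, O3) given the queued interval."""
    before = [t for t in egress_times if t < tau1]
    queued = [t for t in egress_times if tau1 <= t <= tau2]
    after = [t for t in egress_times if t > tau2]
    return before, after, queued

def queued_loglik(n_queued, p3, tau1, tau2):
    """Log-likelihood of the queued users: A3 * ln(P3 / (tau2 - tau1))."""
    return n_queued * math.log(p3 / (tau2 - tau1))

# Hypothetical sample of egress times, in seconds.
sample = [31.0, 44.0, 52.0, 58.0, 61.0, 66.0, 73.0, 80.0, 95.0, 110.0]
o1, o2, o3 = split_sample(sample, tau1=55.0, tau2=85.0)
```

Because the assignment of each observation to a subset changes with τ_1 and τ_2, the total log-likelihood built from these pieces is not continuous in the queuing parameters.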

Maximum-Likelihood Estimation.
Maximum-likelihood estimation is a fundamental method for statistical estimation, owing to both theoretical properties and tractability. Given a sample of observations, it consists in maximizing numerically the likelihood function associated with the sample (or, more conveniently, its logarithm, called the log-likelihood function), with respect to the vector of parameters.
The train log-likelihood function is quite tractable for standard optimization algorithms. Yet some caution is in order about discontinuities in the function: changing the queuing parameters may change the assignment of observed times from the free-flow regime to queuing and conversely, thereby changing the associated elementary log-likelihood function from one specification to another, at the risk of discontinuity.

MLE of the Model with Gaussian Components.
In the model with Gaussian lengths and speeds, both the CDF and PDF of τ, and in turn L_u in the free-flow case as well as in the queued case, depend on the five parameters: m_w, s_w², m_ℓ, s_ℓ², and χ. These are involved together starting from relationship (4a), that is, from the event {ℓ ≤ w·x}. As this relation puts ℓ and w on the same level, in a linear way, it will make the (ℓ, w) joint distribution identifiable only up to some scale factor that will affect the mean parameters at order 1 and the variance-covariance parameters at order 2. These influences are easy to trace out in both CDF and PDF formulas (18a) and (18b).
In turn, the system of five first-order optimality conditions for likelihood maximization with respect to the five parameters will be underdetermined: only four out of five parameters may be identified. Our intuition here is to take the average walk speed as given and to restrict the application of MLE to the other four parameters.
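The scale indeterminacy can be checked numerically. The sketch below assumes the Gaussian free-flow CDF Pr{ℓ ≤ w·x} and multiplies both means by a factor k and all variance-covariance terms by k²; the CDF is unchanged at every x, so the likelihood cannot distinguish the two parameter sets. Parameter values and function names are illustrative assumptions.

```python
import math

def ff_cdf(x, m_l, s_l, m_w, s_w, chi=0.0):
    """Free-flow CDF Pr{l <= w*x} for a bivariate Gaussian (l, w)."""
    var_x = s_l**2 - 2.0 * x * chi + (x * s_w)**2
    z = (x * m_w - m_l) / math.sqrt(var_x)
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def rescale(params, k):
    """Scale means and standard deviations by k, and the covariance by k^2."""
    return dict(
        m_l=k * params["m_l"], s_l=k * params["s_l"],
        m_w=k * params["m_w"], s_w=k * params["s_w"],
        chi=k**2 * params["chi"],
    )

base = dict(m_l=100.0, s_l=20.0, m_w=1.2, s_w=0.3, chi=1.0)
scaled = rescale(base, 2.5)

for x in (40.0, 80.0, 120.0):
    print(x, ff_cdf(x, **base), ff_cdf(x, **scaled))  # the two columns coincide
```

Fixing the mean walk speed pins down k, which is why the remaining four parameters become identifiable.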

MLE of the Log-Normal Model under Free Flow.
In the absence of queuing, the bivariate log-normal specification of pair (ℓ, w) yields a simple log-normal model of the egress time τ = ℓ/w. The associated pair of logs, (ℓ̃, w̃), is bivariate normal with μ_ℓ, σ_ℓ², μ_w, σ_w², and ξ as parameters. Then, the log-egress time τ̃ = ℓ̃ − w̃ is normal with mean μ_τ = μ_ℓ − μ_w and variance σ_τ² = σ_w² + σ_ℓ² − 2ξ. In the application of MLE to a sample of free-flow egress times τ_u, hence of τ̃_u, only two parameters μ_τ and σ_τ² are identifiable. The optimality conditions of the MLE in that case are well known: μ_τ is estimated by the sample mean of the τ̃_u, and σ_τ² by their mean squared deviation from that mean.
At that stage, the line of attack in the quick estimation procedure is appropriate: it is a wise strategy to take μ_w and σ_w² as given and to focus on the estimation of walk length parameters, namely, μ_ℓ and composite parameter σ_ℓ² − 2ξ, as the respective influences of σ_ℓ² and ξ would be hard to disentangle.
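The closed-form estimators for the free-flow log-normal model can be sketched in a few lines, followed by the recovery of the walk length parameters under exogenous speed parameters. Function names are illustrative assumptions.

```python
import math

def lognormal_mle(egress_times):
    """Maximum-likelihood estimates of (mu_tau, sigma_tau^2) from egress times."""
    logs = [math.log(t) for t in egress_times]
    n = len(logs)
    mu_tau = sum(logs) / n
    var_tau = sum((v - mu_tau) ** 2 for v in logs) / n  # divisor n, not n - 1
    return mu_tau, var_tau

def length_parameters(mu_tau, var_tau, mu_w, var_w):
    """Return mu_l and the composite parameter sigma_l^2 - 2*xi."""
    return mu_tau + mu_w, var_tau - var_w
```

Only the composite parameter is returned for the variance, since σ_ℓ² and ξ enter the log-egress time variance through their combination only.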

Site Location and Platform Geometry.
Line A of the Regional Express Railways (RER) is the busiest urban rail transit line in the Paris region "Ile-de-France" and maybe in Europe, carrying more than 1 M passengers on every workday. The line is serviced by duplex trains about 210 m long, each with 10 cars (each car having 3 doors, each 2 m wide) and a nominal capacity of 2,800 passengers (assuming 4 p/m² of standing space). Along that line, the Noisy-Champs station (48°50′34.55″ N, 02°34′55.06″ E) is located in the eastern part of the Paris conurbation on the edge of the Noisy-le-Grand and Champs-sur-Marne communes. It was used by 4.4 million travelers in 2015. On weekdays, there are significant flows of travelers, especially workers and students coming to work in the "Cité Descartes," a district of offices, high schools, and activity parks. Figure 3(a) shows the outline of the station and the facilities around the Noisy-Champs train station. To the west of the station (Noisy-le-Grand side) is a residential area, and to the east (Champs-sur-Marne side) there are offices and residences as well as the Cité Descartes with its many high schools and universities. During peak periods of weekdays, large flows of passengers arrive at, and exit from, the east side of the station, which provides direct access to park and ride facilities and to bus stops.
Regarding the shape of the station, the platform extends over 230 m along an east-west axis on each side of the two rail tracks: each side has a width of about 5 meters. The northern platform is utilized by train runs in the direction from Paris to Marne-la-Vallée, whereas the southern platform serves the inverse direction. Per platform side, there are two PEPs (platform egress points) at the east and west endpoints. On the northern platform, there are stairs and escalators that connect the PEPs to the SAPs (station access points) with validation gates for tap-in and tap-out. On the southern platform, there are only stairs to connect the PEPs to the SAPs.
This study is focused on the northern platform and more specifically on the station access point to the Cité Descartes. From the geographical situation of the station within the metropolitan area, employment and activities are relatively scarce eastwards compared to those westwards and even to local opportunities (jobs and schools): thus, on the northern platform (from Paris to Marne-la-Vallée EuroDisneyland), the boarding flow is much lower than the alighting flow and the alighting users are not impeded by the boarding candidates.
Figure 3(b) details the geometry of the eastern station access point. The platform is located on level −1 and the validation gates on level 0. There is an escalator for one or two people abreast and a parallel staircase about 2.5 meters wide. As the two vertical elements are parallel and of limited height (about 4 meters), their respective pedestrian times are quite similar (from field observation, also in accordance with Wardrop's first principle): this is why no distinction will be made between them. When a train arrives and disembarks the alighting flow, passengers destined to the Cité Descartes use the exit, and a traffic jam may arise on the platform in front of the escalator (Figure 3(d)). Once the travelers have climbed the escalator or the staircase, the validation gates stand about five meters further on. The distance from train head to gates is from 25 to 30 meters depending on the validation gate. There are three entrance gates and six exit gates. During peak hours, passengers who took the escalator and stairs may queue in front of the six exit gates. The different positions of those validation gates may induce further differences in the egress times.

AFC and AVL Datasets.
We made use of two datasets obtained from two systems of automated fare collection (AFC) and automatic vehicle location (AVL), respectively. Both datasets were constituted for all weekdays from the 16th to the 29th of March 2015, that is, 10 days in total. The time stretch enabled us to pinpoint peak periods of weekdays unaffected by disturbances.
In the Paris region, the AFC information system (named SIDV) records all validations at fare gates in stations for rail modes or on board for buses and trams. Every gate has a particular index, and every card has one unique number. Thus, per validation, the AFC record contains spatiotemporal attributes of line, station, gate, time, user, etc. Our AFC dataset contains 4,675,672 validations along line A, including 83,740 validations in Noisy-Champs; it pertains to a total of 723,185 travelers, among whom 21,822 passed by Noisy-Champs.
The AVL system uses track circuits to detect events of train passage at given points: the resulting train timestamp (geolocation and instant) is transmitted by radio. Three kinds of train and track events are monitored, either outside stations or in them (the ARR and DEP types). ARR and DEP provide the exact instants of arrival and of departure of trains in stations. Our AVL dataset records 6,608 train runs on the RER A line, among which 2,954 stop at Noisy-Champs.
Based on the within-day variations of validations at Noisy-Champs, five daily subperiods were identified: morning peak (

Empirical Evidence from Field Survey.
We also designed a specific field survey, which was carried out by four engineering students on Friday 25/09/2020 from 8 a.m. to 9 a.m. The purpose was to measure passenger egress times according to different train cars, to describe congestion events, and to understand their mechanisms. A specific fourfold protocol was set up as follows. The first part consisted in observing the distribution of egress times. Students were posted near the validation gates in order to count the number of passengers tapping out at each second after the opening of doors. Four trains were observed, yielding a sample of 405 individual times. The average headway of these 4 trains was 4.1 minutes. Among the observed egress times, from the shortest value of 19 s to the longest one of 242 s, the mode was 60 s, the average 73.9 s, and the standard deviation 35 s (see Figure 4). The second part consisted in following randomly selected alighting passengers from their car to the validation gates so as to measure their walk time. The results were consistent with the first part. The third part was devoted to counting the number of alighting passengers passing by a specific point along the platform, namely, between cars 4 and 5 from train head (out of ten). The fourth and last part focused on queuing phenomena. Two queues were observed: one at the foot of the escalator (PEP) and the other in front of the gates (SAP). At the PEP, queuing occurred around 15 s from train arrival and lasted 32 s on average (among the trains). At the gates, some queuing occurred around 50 s from train arrival [28].

An Ex-Ante Convention to Identify Queuing Time Ranges.
Based on both field observation and the analysis of AFC and AVL data, we estimated on a provisional basis an exit flow capacity at validation gates of about 10 people in 5 seconds, that is, 2 people per second. Once the flow rate reaches this threshold, the queuing phenomenon appears: all the gates are used, and waiting lines appear at both the gates and the escalator. Then, based on the provisional exit flow capacity, a "provisional queuing interval" was determined in the following way: from AFC data, the entire egress period was sliced into 5 s subintervals, and we took as "initial queuing instant" the start of the first 5 s slice containing 10+ validations and as "final queuing instant" the end of the last 5 s slice with 10+ validations. From these educated guesses, we derived times τ_1, τ_2, and τ_2 − τ_1 (duration of congestion) for every train in the AVL and AFC datasets. The resulting distribution of τ_1 has a mean of 57.6 s and a standard deviation of 8.7 s. That of τ_2 has a […]

[…] Figure 5(a), exhibits one primary statistical mode around 60 s and a second, minor mode around 100 s, with an associated subpopulation probability of about 15% (from the 0.007 density level times the 80-110 s range, minus the tail of the distribution associated with the primary mode). Such a secondary mode involving about 10 passengers may correspond to a specific group, for instance a bunch of students coming back to their residences. To circumvent the randomness of such events, we decided to focus on the primary mode and subpopulation in the following way. All egress times up to threshold τ̄ ≡ 80 s are assumed to belong to the primary subpopulation, and their individual values are kept in the sample: there are A′ ≡ 46 of them. As for values above τ̄, only one-third of them are taken to belong to the primary subpopulation: their number is A″ ≡ (A − A′)/3. While this number is known, we do not consider the individual values and keep to the information that these values are greater than τ̄. The associated log-likelihood amounts to A″·ln[1 − T(τ̄ | Θ)].
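The treatment of the values above the threshold is a right-censoring device, which can be sketched generically as follows. The function takes any PDF and CDF as callables, so it does not depend on a particular distributional assumption; the function name and argument names are illustrative assumptions.

```python
import math

def censored_loglik(egress_times, threshold, n_censored, pdf, cdf):
    """Log-likelihood of a sample whose values above `threshold` are censored.

    Values up to the threshold contribute ln pdf(t); the censored ones are
    only known to exceed the threshold and contribute ln(1 - cdf(threshold)).
    `n_censored` may be fractional, e.g. one-third of the values above it.
    """
    observed = sum(math.log(pdf(t)) for t in egress_times if t <= threshold)
    censored = n_censored * math.log(1.0 - cdf(threshold))
    return observed + censored
```

With the convention above, the call would pass the individual times up to 80 s, the threshold of 80 s, and one-third of the count of larger values as `n_censored`.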
Then, the total log-likelihood function of the primary sample amounts to […] Looking for parameter vector Θ […] sample strongly supports the assumption of statistical independence between walk lengths and speeds among the alighting passengers. Based on the estimated values together with the independence property, we can recover some properties of the egress time RV and analyze the signal-to-noise ratio (cf. §5.1.1). As c_w ≈ 25%, we can safely approximate […]
Let us now comment on the estimates in the FC model. That of the standard deviation of walk speeds is fairly standard. About lengths, estimate m_ℓ of 102 m minus shift s_e of 30 m yields an average alighting position of ℓ′ ≡ m_ℓ − s_e = 72 m, which corresponds to the middle third of the fourth car along the train. Then, taking ℓ′ ± 2s_ℓ as a quick estimate for a 95% confidence interval of alighting positions, the resulting interval [41 m, 103 m] corresponds broadly to cars 3 to 5 out of the 10-car train. This may correspond to the first two cars being occupied mostly by other passengers destined downstream of Noisy-Champs, as well as to some relocation behavior before boarding the train by Noisy-Champs users in order to benefit from better on-board comfort (more available seats and less crowded standing spaces). Coming to the queuing characteristics, the queue focal point at 4 meters from the validation gates corresponds to the exit point of the vertical element (escalator and stairways), while queuing would arise at its entry point according to our field survey: this difference may be linked to the fact that people have little if any possibility to overtake one another on the vertical element, therefore making its exit point a replica of its entry point. The queue moving speed of 0.92 m/s is about one-fourth less than the average speed, which looks consistent.
By depicting the CDFs of the travel time distributions in the FF, IC, and FC models, respectively (in the right-hand part of Figure 6), a typical bottleneck pattern arises: the free-flow CDF mimics the cumulated flow of user arrivals at a bottleneck, whereas the IC and FC CDFs each mimic a cumulated flow of bottleneck exits. From the PDF curves (on the left-hand side of the figure), it is obvious that the congested models are much better fits to the empirical observations than the free-flow one. From either congested model, it is possible to estimate exit capacity based on the train data rather than on the provisional basis. From the FC model, the bottleneck flow rate amounts to A × 0.015 s⁻¹ = 2.94 people per second.
From the free-flow to the full congestion estimates of Θ_(ℓ,w), the FC length parameters correspond to alighting positions closer to train head (by 6 m on average) and more concentrated (SD reduced from 20.2 m to 15.6 m). Both walk speed distributions comply with the same exogenous average of 1.2 m/s; under null covariance, the estimated SD of FC is 10% higher than the FF one, but the FF estimation with nonzero covariance yields the same speed SD estimate as the FC. This suggests that nonzero covariance in the FF model may point to the occurrence of queuing. Coming to the incomplete congestion model, […], with a log-likelihood value of −738.638. Again, the estimate of Θ_(ℓ,w) is very close to those in the FF and IC models. The nonzero value of covariance χ is close to zero, and it may be neglected with no loss of statistical significance (log-likelihood level at −738.77). As for queuing parameters, the queuing focal point is located at 3.2 m from the validation gates and the average queue moving speed of 0.798 m/s is both plausible and consistent with the estimation for the 18:59 train.
Applying the ex-post analysis of §5.1.1 to the FC outcomes, the distribution of 1/w has a relative dispersion of 23% and a mean of 0.88 s/m. Then, the underlying distribution of free-flow egress times has a relative dispersion of 0.32, a mean of 84 s, and an SD of 26 s. […]
Overall, the 18:32 train yields a large alighting flow giving rise to some queuing, in a lighter way than the 18:59 train. Modeling the light queuing by the IC and FC models, rather than neglecting it by keeping to the FF model, provides an improvement of 2 or 3 points in log-likelihood. This falls in a gray area between "little significance" and "marked significance." The length parameters are consistent with our previous findings, whereas the speed SD is higher by one-fourth. The non-null though small estimate of the covariance parameter suggests the occurrence of congestion, which is further substantiated by the empirical PDF profile in Figure 8(a).
We […] gates, and the queue moving speed is close to 0.9 m/s. The queuing interval of [56 s, 87 s] is much shorter than under our provisional convention. As for the walking parameters, the length parameters are consistent with our previous findings both for that train and the other ones; the speed SD is similar to its FF and IC counterparts. Applying the ex-post analysis in §5.1.1 to the FC outcomes, the underlying free-flow distribution of 1/w has a relative dispersion of 29% and a mean of 0.90 s/m. Then, the underlying distribution of free-flow egress times has a relative dispersion of 0.38, a mean of 92.25 s, and an SD of 35 s. Length variations contribute to 39% of egress time variations, yielding a signal-to-noise ratio of 33%; that is, speed variations would be twice as influential as length variations in the variations of free-flow egress times.

Synthesis.
We presented the estimation of the three traffic models FF/IC/FC for four trains with different levels of alighting flow yet taken from the same time period: half an hour in the evening peak on a given working day. The presentation was ordered so as to demonstrate first the FF model on the basis of the 18:35 train and then the congested models using the 18:59 train, for which it is most advantageous to model congestion in an explicit way. These two trains constitute the extreme points of a range, which encompasses the two other trains at 18:32 and 18:46.
On MLE computation. For every train, we applied the traffic model in a progressive way, from FF to IC and then to FC. This enabled us to compare the resulting estimates for that train and to characterize the queuing phenomenon progressively. The application of MLE to the FF model is easy: using the standard optimization algorithms provided in Excel as well as in Python libraries, the convergence was straightforward. Also endowed with straightforward convergence is the application of MLE to the incomplete congested model under exogenous queuing parameters, that is, under the ex-ante determination of the [τ_1, τ_2] interval based on the educated guess of a 2 p/s exit capacity. But the endogenization of the queuing parameters introduces discontinuities in the log-likelihood function, thereby making the numerical optimization a much more demanding task. In practice, using an Excel spreadsheet for each train, we resorted to a heuristic alternation of automated search (using the Excel solver) and manual adjustment to get to the "optimal points," where local optimization was obtained. We were satisfied with the resulting estimations because they induced fairly good reproduction of the PDF and CDF profiles of the empirical distributions of the egress times.
At the train level, the model estimation enables us to recover the underlying distribution of walking lengths and speeds, under preset mean speed to ensure identifiability. The estimates of walk speed SD range from 0.28 to 0.38 m/s, with some interplay with the covariance parameter. From the free-flow model applied to the fluid train at 18:35, the statistical independence between walk speeds and lengths is strongly supported. Conversely, a nonzero covariance estimate would capture some part of the queuing phenomenon. This may be useful when applying the free-flow model in order to detect the occurrence of congestion, therefore calling for the application of congested models.
As for the distribution of walk lengths, the mean and SD estimates of the congested models are consistent at the train level: they differ from the corresponding free-flow estimates by a reduction in both the mean and SD, meaning that neglecting the queuing phenomenon by applying only the FF model will lead to biased results. The estimates of the mean length parameters seem to vary in a systematic way depending on the train: larger alighting flow comes along with larger mean length, meaning alighting positions farther from the train head in the platform configuration at Noisy-Champs. This suggests that on-board positions are influenced both by the exit conditions and the passenger load conditions along the train.
Coming to the queuing phenomenon, the bottleneck behavior has been evidenced by three out of the four trains in the half-hour period under study. The time range of queuing depends on the train: a larger alighting flow brings the queue beginning time forward and delays its end time. While empirical data exhibit significant instantaneous variations in the exit flow rate, the bottleneck postulate enables us to identify the average exit capacity in a straightforward way. From the two most congested trains, this capacity falls in the range of [2.5, 3] p/s; that is, it is much higher than our ex-ante convention of 2 p/s. The notion of queuing focal point ℓ* is supported by the estimation results of the FC model. The estimated positions at about 4 m from the validation gates correspond to the exit point of the vertical element, which combines an escalator and a stairway. The queue moving speed v* was estimated consistently between the three congested trains, around 0.9 m/s, that is, one-fourth less than the mean free-flow walk speed. The queue focal point and the queue moving speed are strongly complementary parameters: their ratio t* = ℓ*/v* is a propagation time to transport the queuing time range from ℓ* to length 0.

Summary
Physical Theory and Stochastic Model.
Individual egress times from train alighting to station exit constitute a statistical population at the train level, with much variability across the individuals. We modeled the magnitude and variations of egress times as a random variable and captured its dependencies on the underlying factors of (i) walk length, (ii) pedestrian speed, and (iii) possibly congestion among the alighting passengers (in the form of a traffic bottleneck). As the train is long, the alighting positions are stretched out over it, giving rise to distributed walk lengths. Also distributed among the individual users are the walk speeds in the "free-flow" regime. In turn, so is the egress time under free flow (FF). We provided a stochastic model of FF egress times, with explicit analytical formulas for its CDF and PDF depending on the joint distribution of walk lengths and FF speeds. As for congestion, we modeled it in the form of a traffic bottleneck based on a queue focal point ℓ*, a congested interval [τ*_1, τ*_2] at that point, and a queue moving speed v* up to the station exit. Decomposing the train population of egress times depending on whether the user would pass at ℓ* before, after, or during the queued interval, an explicit analytical formula was obtained for the PDF of the egress time in the so-called full congestion (FC) model. Thus, both the FF and FC models are endowed with analytical formulas. Between them, an intermediary model called "incomplete congestion" (IC) assimilates the queue focal point to the station egress point.
By postulating a bivariate Gaussian distribution for the walk pairs of length and FF speed, straightforwardly computable formulas are available for the CDF and PDF of the FF egress time and for the PDF of the IC/FC egress time.
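The Gaussian case can be sketched in a few lines of Python. For a positive speed, the event L/V ≤ t is the same as L − tV ≤ 0, and L − tV is itself Gaussian, so the FF CDF reduces to a single evaluation of the standard normal CDF. This is a standard derivation that ignores the tiny probability mass at nonpositive speeds; it is offered as an illustration rather than as the exact formulas of the model, and the function names and parameter values (mean length 100 m, and so on) are assumptions.

```python
import math
import random


def std_normal_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def ff_egress_cdf(t, mu_len, sd_len, mu_spd, sd_spd, rho=0.0):
    """P(L / V <= t) for t > 0, with (L, V) bivariate Gaussian.

    Uses P(L - t*V <= 0) and neglects the probability that V <= 0.
    """
    mean = mu_len - t * mu_spd
    var = sd_len ** 2 - 2.0 * t * rho * sd_len * sd_spd + (t * sd_spd) ** 2
    return std_normal_cdf(-mean / math.sqrt(var))


def empirical_cdf(t, mu_len, sd_len, mu_spd, sd_spd, rho=0.0, n=20000, seed=0):
    """Monte Carlo estimate of the same probability, for cross-checking."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        length = mu_len + sd_len * z1
        speed = mu_spd + sd_spd * (rho * z1 + math.sqrt(1.0 - rho ** 2) * z2)
        if speed > 0 and length / speed <= t:
            hits += 1
    return hits / n


if __name__ == "__main__":
    # Hypothetical walk-pair parameters: lengths in metres, speeds in m/s.
    args = (100.0, 40.0, 1.2, 0.2)
    for t in (40.0, 80.0, 160.0):
        print(t, round(ff_egress_cdf(t, *args), 3), round(empirical_cdf(t, *args), 3))
```

The closed form and the simulation should agree up to sampling noise, which gives a cheap sanity check before the formula is used inside a likelihood.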

Estimation Methodology. The physical and stochastic model of egress times pertains to a triple of train, platform exit point, and station access point, since the distribution of walk lengths depends on the positions on board the train of the alighting users, as well as on the walk pathway topology. Under free flow, we devised a simple estimation method to recover the length distribution from that of egress times, under an exogenous distribution of walk speeds. More generally, a maximum-likelihood estimation (MLE) method was devised on the basis of the PDF and CDF formulas to construct the likelihood function of the model parameters given an observed egress time.
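A minimal Python sketch of the free-flow likelihood idea is given below, under stated assumptions. The PDF is obtained by differentiating the Gaussian-ratio CDF Φ(z(t)), with z(t) = (tμ_V − μ_L)/s(t) and s(t)² = σ_L² − 2tρσ_Lσ_V + t²σ_V², which neglects nonpositive speeds. The speed distribution (mean and standard deviation) is held exogenous, and the length parameters are found by a plain grid search instead of a numerical optimizer. All function names, grids, and parameter values are hypothetical, and the data are simulated, not the Noisy-Champs records.

```python
import math
import random


def ff_egress_pdf(t, mu_len, sd_len, mu_spd, sd_spd, rho=0.0):
    """Density of T = L / V as the derivative of Phi(z(t)) (V <= 0 neglected)."""
    s2 = sd_len ** 2 - 2.0 * t * rho * sd_len * sd_spd + (t * sd_spd) ** 2
    s = math.sqrt(s2)
    num = t * mu_spd - mu_len                        # numerator of z(t)
    ds = (t * sd_spd ** 2 - rho * sd_len * sd_spd) / s  # derivative of s(t)
    dz = (mu_spd * s - num * ds) / s2                # derivative of z(t)
    z = num / s
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi) * dz


def log_likelihood(times, mu_len, sd_len, mu_spd, sd_spd, rho=0.0):
    """Sum of log densities, with a floor to avoid log(0)."""
    return sum(math.log(max(ff_egress_pdf(t, mu_len, sd_len, mu_spd, sd_spd, rho), 1e-300))
               for t in times)


def grid_mle(times, mu_spd, sd_spd, mu_grid, sd_grid):
    """Grid search for (mean, sd) of walk length, speed parameters held exogenous."""
    best = max((log_likelihood(times, m, s, mu_spd, sd_spd), m, s)
               for m in mu_grid for s in sd_grid)
    return best[1], best[2]


def simulate_ff_times(n, mu_len, sd_len, mu_spd, sd_spd, seed=0):
    """Simulated free-flow egress times with independent length and speed."""
    rng = random.Random(seed)
    times = []
    while len(times) < n:
        length, speed = rng.gauss(mu_len, sd_len), rng.gauss(mu_spd, sd_spd)
        if length > 0 and speed > 0:
            times.append(length / speed)
    return times


if __name__ == "__main__":
    # Hypothetical "true" parameters: lengths in metres, speeds in m/s.
    sample = simulate_ff_times(500, 100.0, 30.0, 1.2, 0.2, seed=1)
    print(grid_mle(sample, 1.2, 0.2, range(70, 131, 2), range(16, 45, 2)))
```

Fixing the speed parameters reflects the identifiability issue noted in the paper: egress times alone constrain ratios of lengths to speeds, so scaling both by a common factor leaves the distribution of egress times unchanged.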
Case Study. The model and its MLE method were applied to the train station "Noisy-Champs" in Paris. The four trains serving the eastbound platform during half an hour of the evening peak on a typical workday were studied on an individual basis. As their respective alighting flows are contrasted, one gives rise to free-flow conditions, whereas the other three experience a queuing episode. The datasets of egress times were built from AFC records of passenger station exit times, related to AVL records of train platform arrival times. Estimation results were reported and discussed, together with the applicability of the MLE method.

Outreach, Limitations, and Further Research. The model is sensitive to train characteristics: notably, the probability distribution of the alighting positions and also the alighting flow volume. It is also sensitive to platform and station features, through the distribution of walk lengths as well as the walk pathway and its width available to pedestrians. FF walk speeds are featured as preferred speeds at the individual level, that is, cruising speeds rather than instantaneous speeds. The traffic theory in the model pertains to a specific kind of congestion among the train-alighting users, with little or no disruption by other users waiting for boarding or staying on board. The model involves a simple topological configuration for the triple of train dwelling position, platform exit point, and station access point (validation gates). When dealing with a platform exit point situated at an intermediary position on the dwelling length, the alighting positions must be considered with respect to that point.
As for the estimation methodology, it requires the identification of the abovementioned triple in the dataset of train egress times. Such a dataset enables one to identify all but one of the components in the parameter vector of the walk length-speed pair distribution, and all of the queuing parameters when queuing occurs.
Our four cases of model application demonstrate the model's ability to simulate the empirical distribution of egress times in an efficient way. Both the walking features and the queuing characteristics were uncovered, with plausible outcomes. The assumption of a bivariate Gaussian distribution of walk lengths and free-flow speeds is merely instrumental: it allows for a straightforward interpretation of the estimation results as well as for easy computation. More general distributions may also be considered, yet at a higher computational cost. The discontinuity of the likelihood function for the congested model is more problematic. It may be attacked from three different sides: by refining the physical theory to smooth out the contours of the queuing episode, by refining the stochastic theory to allow for some speed variability at the individual level, or by utilizing a more sophisticated optimization algorithm to deal with discontinuity and local optima (e.g., a genetic algorithm).
Other directions for further research include the following:
(i) The critical review of the likelihood function, with special attention to the queuing part and the associated assumptions of independence
(ii) The consideration of nonparametric estimation for the walk length and speed distributions
(iii) The theorization of other kinds of passenger congestion, involving the alighting users in interaction with other platform users that wait for boarding or simply go through the platform from an entry point to another exit point, and maybe also with those train users remaining on board