Calibrating Path Choices and Train Capacities for Urban Rail Transit Simulation Models Using Smart Card and Train Movement Data



Introduction
Urban rail systems are important components of the urban transportation system. Given their high reliability and large capacity, they have attracted high passenger demand. However, high demand also leads to problems such as overcrowding and disruptions, which decrease the level of service and impact passengers. To maintain service reliability and develop efficient response strategies, it is crucial for operators to better understand passenger demand and flow patterns in the network.
Transit network loading (or simulation) models for metro systems, powered by automatically collected data, provide a useful instrument for network performance monitoring. They enable operators to characterize the level of service and make decisions accordingly. A typical network loading model requires an origin-destination (OD) matrix, supply information, and path choice fractions as input. The supply information includes the transit network topology, actual vehicle movement data, and vehicle capacity. Thanks to the wide deployment of automated fare collection (AFC) and automated vehicle location (AVL) systems, the OD demand and train movement data can be directly obtained. However, obtaining the corresponding path choices and quantifying a reasonable vehicle capacity remains a challenge. According to Liu et al. [1] and Preston et al. [2], train capacity, defined as the maximum train load beyond which remaining passengers on the platform are denied boarding, may vary depending on the crowding levels in trains and on platforms and on passenger attitudes. The calibration of path choices and train capacity can improve the accuracy of network loading models for performance monitoring. Thus, these models can provide better information to operators to adjust operating strategies, relieve congestion, and improve efficiency.
Traditionally, path choices are inferred with data from on-site surveys that are used to estimate path choice models. However, surveys are time-consuming and labor-intensive, limiting their real-world usage. To overcome these disadvantages, path choice estimation methods based on AFC data have been proposed in the literature. AFC systems provide the exact locations and times of passengers' entry and exit transactions, which can be used to extract OD demand and passengers' journey times. They provide rich information for analyzing passenger behavior [3].
In an urban rail system operated near its capacity, five critical parameters are correlated with each other: OD demand, journey time, left behind (or denied boarding), path choices, and train capacity.
The relationship among these parameters is illustrated in Figure 1. OD demand is the input and journey time is the output (OD exit flow is a combination of the two); both can be observed from the AFC data. Path choices, train capacity, and left behind are not observable in the AFC data. Journey time is directly affected by path choices and left behind (left behind can increase the waiting time). Left behind is directly affected by path choices and train capacity.
This figure indicates the complexity of path choice estimation using AFC data. The dependencies among different parameters (e.g., path choices vs. train capacity) should be captured.
In the context of path choice estimation, the AFC data-based methods can be categorized into two groups: path-identification methods [4][5][6][7] and parameter-inference methods [8][9][10][11][12]. The former aim to identify the exact path chosen by each user and even the train they boarded. Path attributes are used to evaluate how likely a path is to be chosen for a passenger's trip from the observed origin to the observed destination. The latter formulate probabilistic models to describe passengers' decision-making behavior. Bayesian inference is usually used to estimate the corresponding parameters and thus derive the path choice fractions. Despite using different methods, the key components of these AFC data-based studies are similar. They all attempt to match the model-derived journey times with the observed journey times from AFC data. However, many of these studies either assume a known constant train capacity or specify a known link-impedance function. As shown in Figure 1, journey times depend on both path choices and train capacity. An unreasonable setting of train capacity may bias the calibration of path choices. Simultaneous calibration of both parameters is more reasonable.
Train capacity is a vague concept. Normally, trains may not reach their designed physical capacity for various reasons (e.g., passengers may decide not to board due to crowding [1]). Therefore, assuming a fixed physical capacity or a fixed link-impedance function (as in many previous studies) may not be reasonable in real-world situations. Only a few studies have explored the calibration of actual train capacity in the rail system. Liu et al. [1] proposed the concept of "willingness to board" (WTB) to describe the varied capacity in a bus system and estimated passengers' WTB using a least squares method. Xu and Yong proposed a passenger boarding model which revealed that the number of passengers actually boarding a crowded train was closely related to the number of queuing passengers and the train load. Mo et al. [13] proposed an effective capacity model that recognized that train capacity may vary across stations depending on the corresponding number of queuing passengers and the train load. The calibration of train capacity or WTB usually requires AFC data with passengers' boarding and journey time information. However, this information may also be affected by path choices, which were neglected in previous studies.
To fill these research gaps, we propose a simulation-based optimization (SBO) model to calibrate path choices and train capacity simultaneously and also explore the efficiency of typical SBO solution algorithms. The calibration problem is formulated as an optimization problem using AFC and AVL data. The formulation can capture the interaction among these variables and their impact on journey times. Seven optimizers (solving algorithms) from four branches of SBO-solving methods are implemented for comparative analysis. They include the genetic algorithm (GA), simulated annealing (SA), the Nelder-Mead simplex algorithm (NMSA), mesh adaptive direct search (MADS), simultaneous perturbation stochastic approximation (SPSA), Bayesian optimization (BYO), and constrained optimization using response surfaces (CORS). We compare these SBO solving algorithms within a limited computational budget, defined by the number of function evaluations. Data from the Hong Kong Mass Transit Railway (MTR) system provide the foundation for a realistic case study. The major contribution of this paper is twofold: (i) proposing an optimization model to simultaneously estimate path choices and train capacities using AFC and AVL data, which addresses the typical assumption of fixed and known train capacities in existing path choice estimation studies using smart card data; (ii) validating the model using a busy urban rail network and analyzing the performance of SBO solution algorithms using systematic experiments that represent different degrees of users' randomness in path choice and their sensitivity to crowding. The remainder of the paper is organized as follows. In Section 2, we illustrate the SBO problem formulation.

Journal of Advanced Transportation

Section 3 briefly describes the various SBO methods used in this study. The proposed framework is used in a case study with data from the Hong Kong MTR network in Section 4. The results are used to compare the performance of different algorithms. Section 5 concludes the paper by summarizing the main findings and discussing future research directions.

Methodology
The paper aims to simultaneously calibrate the train capacity and path choices using readily available data in closed fare payment systems (which require ticket validation at both tap-in and tap-out stations). To capture the interaction among the different variables in Figure 1, we use a schedule-based network loading model with capacity constraints (described in Section 2.1). It outputs a list of performance metrics given a set of inputs including OD demand, timetables/AVL, network, train capacity, and path choices. The calibration of path choices and capacity is formulated as an optimization model that attempts to minimize the error between network loading model outputs (e.g., journey time, which is a function of path choices and train capacity) and the corresponding quantities directly observed from the AFC data.

Transit Network Loading Model.
Transit network loading (TNL) models aim to assign passengers over a transit network given the (dynamic) OD entry demand and path choices. In this study, we adopt an event-driven schedule-based TNL model proposed by Mo et al. [13]. The model takes OD entry demand (number of tap-in passengers by time), path choices, train arrival and departure times at stations, train capacity, and infrastructure information (e.g., network topology) as inputs and outputs the passengers' tap-out times, train loads, waiting times, and other network performance indicators of interest.
Figure 2 illustrates the main functions of the TNL model [13]. Three objects are defined: train, waiting queue (on the platform), and passengers. An event is defined as a train arrival at, or departure from, a station. Events are ordered chronologically. New and transferring passengers join the waiting queue on the platform and board a train based on a first-come-first-board (FIFB) discipline.
The number of passengers who successfully board depends on the available train capacity.
The TNL model works by generating a train event list (arrivals and departures) based on the actual train movement data (AVL) and then sequentially processing the ordered events until all events are processed for the time period of interest. The processing of an individual event is based on the following rules: (i) If the event is an arrival (Figure 2(a)), the train offloads passengers and updates its state (e.g., train load and in-vehicle passengers). Alighting passengers who need to transfer are assigned to the waiting queues on the corresponding transfer platforms (e.g., passengers transferring to platform B in Figure 2(a)). Passengers who tap out are removed from the system. New tap-in passengers who entered the station between two events are added to the queue (e.g., new tap-in passengers on platform A in Figure 2(a)). Then, the waiting queue objects for all platforms are updated accordingly. (ii) If the event is a departure (Figure 2(b)), passengers board trains based on the FIFB priority rule. If the onboard passengers have reached the train capacity, the remaining passengers on the platform are denied boarding and wait for the next available train. Finally, the state of the train (train load and in-vehicle passengers) and the waiting queue at the platform are updated accordingly.
More specifically, for each passenger in the simulation model, we first calculate his/her probability of choosing each available path based on the path's attributes and path choice parameters (see Section 2.2 for details). Path attributes include in-vehicle time, number of transfers, and transfer walking time.
Then, each passenger is assigned a specific path based on the choice probability. Based on the path information, the passenger walks to a specific platform, joins the waiting queue, and waits for an available train to board. The boarding and alighting behaviors are as described above.
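As an illustration, the departure-event boarding rule can be sketched as follows; this is a minimal toy example with hypothetical passenger IDs, not the authors' implementation:

```python
from collections import deque

def process_departure(queue, train_load, capacity):
    """FIFB boarding at a departure event: waiting passengers board in
    arrival order until the train reaches its capacity; the rest are
    left behind to wait for the next train."""
    boarded = []
    while queue and train_load < capacity:
        boarded.append(queue.popleft())  # first come, first board
        train_load += 1
    return boarded, train_load, len(queue)

# Toy example: 5 waiting passengers, but only 2 free spaces on the train.
platform = deque(["p1", "p2", "p3", "p4", "p5"])
boarded, load, left_behind = process_departure(platform, train_load=98,
                                               capacity=100)
```

Here `left_behind` is exactly the unobserved "left behind" quantity of Figure 1 that links path choices, train capacity, and journey times.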

Problem Formulation.
Consider a general urban rail network in a specific time period T, represented as G = (S, A), where S is the set of stations and A is the set of directed links. We divide T into several time intervals with equal length τ (e.g., τ = 15 min). Denote the set of all time intervals as T = {1, 2, ..., T/τ}. Define a time-space (TS) node as i_m, where i ∈ S and m ∈ T; i_m represents station i in time interval m.
For an OD pair (i, j) (i, j ∈ S), the OD entry flow (q_{i_m,j}) represents the number of passengers entering station i during time interval m and exiting at station j. Let the set of all OD entry flows be q_e. The OD exit flow (q_{i,j_n}) represents the number of passengers who exit at station j in time interval n with origin i. q_{i_m,j} and q_{i,j_n} are inputs and outputs of the TNL model, respectively.
Let the set of all paths between (i, j) be R(i, j). We assume that the path choice behavior can be formulated as a C-logit model [14], which is an extension of the multinomial logit (MNL) model that corrects for the correlation among paths due to overlapping [15]. The path choice fraction for path r ∈ R(i, j) in time interval m (p_r^{i_m,j}) is formulated as follows:

p_r^{i_m,j} = exp(μ(β_X · X_{r,m} + β_CF · CF_r)) / Σ_{r' ∈ R(i,j)} exp(μ(β_X · X_{r',m} + β_CF · CF_{r'})),    (1)

where μ is the scale parameter of the Gumbel distribution of the error term [16], which is usually normalized to 1. A larger (smaller) μ means the choice behavior is more deterministic (random). X_{r,m} is the vector of attributes for path r in time interval m (e.g., in-vehicle time, number of transfers, and transfer walking time). CF_r is the commonality factor of path r, which measures the degree of similarity of path r with the other paths of the same OD pair. β_X and β_CF are the corresponding coefficients to be estimated. Let β be the vector that combines β_X and β_CF (i.e., β = [β_X, β_CF]). CF_r is defined as follows:

CF_r = ln Σ_{r' ∈ R(i,j)} (D_{r,r'} / (√(D_r) · √(D_{r'})))^γ,    (2)

where D_{r,r'} is the number of common stations of paths r and r', D_r and D_{r'} are the numbers of stations of paths r and r', respectively, and γ is a fixed positive parameter. Let the set of all path choice fractions be p. The values of β can be bounded from above and below. The bounds can be obtained from prior knowledge and previous survey results. Denote the upper bound as U_β and the lower bound as L_β (L_β ≤ β ≤ U_β), where U_β and L_β are both vectors with the same cardinality as β.
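As an illustration, the C-logit choice probabilities and commonality factors can be computed as in the following sketch (a made-up three-path OD pair, systematic utilities β_X · X_{r,m} chosen arbitrarily, γ = 1, μ = 1; none of these values come from the case study):

```python
import math

def commonality_factor(paths, gamma=1.0):
    """CF_r = ln sum_{r'} (D_{r,r'} / (sqrt(D_r) * sqrt(D_{r'})))^gamma,
    where D_{r,r'} is the number of stations shared by paths r and r'."""
    cfs = []
    for r in paths:
        total = 0.0
        for rp in paths:
            d_common = len(set(r) & set(rp))
            total += (d_common / (math.sqrt(len(r)) * math.sqrt(len(rp)))) ** gamma
        cfs.append(math.log(total))
    return cfs

def c_logit_probs(utilities, cfs, beta_cf=-1.0, mu=1.0):
    """C-logit choice probabilities: a logit over (utility + beta_cf * CF),
    so overlapping paths are penalized relative to the plain MNL."""
    v = [mu * (u + beta_cf * cf) for u, cf in zip(utilities, cfs)]
    vmax = max(v)  # subtract the max for numerical stability
    exps = [math.exp(x - vmax) for x in v]
    total = sum(exps)
    return [e / total for e in exps]

# Paths 1 and 2 have equal utility, but path 1 overlaps more with path 3.
paths = [("s1", "s2", "s3"), ("s1", "s4", "s3"), ("s1", "s2", "s5", "s3")]
probs = c_logit_probs([-10.0, -10.0, -12.0], commonality_factor(paths))
```

With a negative β_CF, the more overlapping path 1 receives a lower probability than path 2 despite their equal utilities.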
According to Mo et al. [13], the actual train capacity utilized by passengers is determined by three factors: (a) the waiting passenger distribution on the platform, (b) the train load and its distribution across the train, and (c) passengers' willingness to board a crowded train. Thus, train capacity is not constant. Instead, it is dynamic and changes across stations and trains depending on the crowding levels of the train and the platform. Mo et al. [13] model the capacity of train k at station i (C_{k,i}) as

C_{k,i} = θ_0 · n_i + θ_1 · H_{k,i} + θ_2 · Q_{k,i},    (3)

where n_i is the number of cars of the trains serving station i, H_{k,i} is the load of train k when it arrives at station i, Q_{k,i} is the number of passengers waiting on the platform when train k arrives at station i, and θ_0, θ_1, and θ_2 are parameters to be estimated (θ_0, θ_1, θ_2 > 0). Specifically, θ_0 is a measure of the service standard. θ_0 · n_i can be seen as the base capacity, that is, the train load that represents an acceptable service standard. At uncongested stations, passengers are assumed not to board when the train load is greater than θ_0 · n_i. At congested stations, passengers may still board a train even if it is already crowded [1], which makes the effective train capacity higher than θ_0 · n_i. θ_1 captures the effect that the effective capacity is higher when the train load is higher. This is because passengers may worry that if they do not board this crowded train, they will not be able to board the following trains either [1]. θ_2 captures the effect that more waiting passengers on the platform may push more passengers to board, leading to higher effective capacity.
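Assuming the linear form C_{k,i} = θ_0 · n_i + θ_1 · H_{k,i} + θ_2 · Q_{k,i} described above, the effective capacity can be sketched as follows (using the crowding-sensitive parameter values θ_0 = 225, θ_1 = 0.2, θ_2 = 0.2 from the scenarios in Section 4 and a hypothetical 8-car train):

```python
def effective_capacity(n_cars, train_load, queue_len, theta0, theta1, theta2):
    """Effective train capacity: a base capacity theta0 * n_cars plus
    crowding effects of the arriving train load (H) and the platform
    queue (Q), reflecting passengers' willingness to board."""
    return theta0 * n_cars + theta1 * train_load + theta2 * queue_len

# Uncrowded case: only the base capacity matters.
base = effective_capacity(8, 0, 0, 225, 0.2, 0.2)
# Crowded train (load 1900) arriving at a crowded platform (queue 400).
crowded = effective_capacity(8, 1900, 400, 225, 0.2, 0.2)
```

The crowded case yields a higher effective capacity, matching the observation that passengers squeeze onto crowded trains at congested stations.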
In the discussion that follows, let θ be the vector of these three parameters. We assume that the values these parameters can take satisfy L_θ ≤ θ ≤ U_θ, where L_θ and U_θ are the corresponding lower and upper bounds, respectively. The goal is to calibrate the θ and β vectors (used by the TNL model) based on indirect observations. Two sets of observations are used for the calibration: observed OD exit flows and observed journey time distributions (JTD). Both can be obtained from the AFC data.
Let the ground truth (observed) OD exit flow be q̂_{i,j_n}. Let f_t^{i,j}(x) be the model-derived JTD of passengers with origin i who exit at station j during time interval t. Let f̂_t^{i,j}(x) be the corresponding observed JTD extracted from the AFC data. Since f_t^{i,j}(x) and f̂_t^{i,j}(x) are estimated from passengers' journey time observations, only the OD pairs with more than E passengers exiting in a specific time interval are considered, where E is a predetermined threshold that ensures a sufficient sample size. Denote the set of corresponding OD pairs and exit time intervals as E, where E = {(i, j_n): q̂_{i,j_n}, q_{i,j_n} > E, ∀i, j ∈ S, n ∈ T}.
The calibration problem is formulated as an optimization problem:

min_{β, θ}  w_1 · Σ_{(i,j_n) ∈ E} (q_{i,j_n} − q̂_{i,j_n})² + w_2 · Σ_{(i,j_n) ∈ E} D_KL(f̂_n^{i,j} ‖ f_n^{i,j})
s.t.  (q_{i,j_n}, f_n^{i,j}) = TNL(p, q_e, θ),  L_β ≤ β ≤ U_β,  L_θ ≤ θ ≤ U_θ,    (4)

where p is computed from β via equation (1). The objective function (equation (4)) has two parts: the squared error between model-derived OD exit flows and the corresponding observations, and the difference between model-derived and observed JTDs. w_1 and w_2 are weights used to balance the scale and the importance of the two parts. The difference between the two distributions is expressed using the Kullback-Leibler (KL) divergence (D_KL):

D_KL(f̂ ‖ f) = Σ_x f̂(x) · ln(f̂(x) / f(x)).    (5)

TNL(p, q_e, θ) is the black-box function that corresponds to the TNL model, which outputs the model-derived OD exit flows and JTDs for a given set of path choices and train capacity parameters. Since the TNL model has no analytic form, equation (4) is an SBO problem with upper and lower bound constraints. In the following section, we discuss seven different algorithms appropriate for the solution of SBO problems.
These algorithms belong to four general approaches of SBO solving methods.
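The structure of the objective can be sketched as follows (toy exit flows and two-bin JTDs; in the actual model, exit_model and jtd_model would come from the black-box TNL evaluation):

```python
import math

def kl_divergence(f_obs, f_model, eps=1e-12):
    """Discrete KL divergence D_KL(f_obs || f_model) between an observed
    and a model-derived journey time distribution."""
    return sum(p * math.log(p / max(q, eps)) for p, q in zip(f_obs, f_model) if p > 0)

def objective(exit_obs, exit_model, jtd_obs, jtd_model, w1=1.0, w2=1.0):
    """Calibration objective: weighted sum of the squared OD exit-flow
    error and the KL divergence between observed and model JTDs."""
    flow_err = sum((q - qh) ** 2 for q, qh in zip(exit_model, exit_obs))
    jtd_err = sum(kl_divergence(fo, fm) for fo, fm in zip(jtd_obs, jtd_model))
    return w1 * flow_err + w2 * jtd_err

z = objective(exit_obs=[100, 50], exit_model=[90, 55],
              jtd_obs=[[0.5, 0.5]], jtd_model=[[0.5, 0.5]])
```

In this toy case the JTDs match exactly, so the objective reduces to the squared flow error.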
It is worth noting that X_{r,m} (i.e., the path attributes vector) is known and fixed in this study. It is assumed to represent the historical path conditions based on which passengers make their habitual choices. Different from typical transit/traffic assignment problems, where path choices are estimated by assuming user equilibrium (for planning purposes), the AFC data-based estimation aims to find the actual realized path choices based on real-world observations (i.e., OD entry-exit flows). Since passengers make decisions before knowing the actual travel or waiting times, X_{r,m} should reflect passengers' historical perceptions of path attributes and should not change within the model estimation process. Therefore, though C_{k,i} captures the actual crowding information, it should not be included in the path choice formulation because passengers make decisions before knowing the actual crowding.

Simulation-Based Optimization Algorithms
There are four major classes of methods for solving SBO problems: heuristic methods, direct search methods, gradient-based methods, and response surface methods (Osorio and Bierlaire [17]; Amaran et al. [18]). Heuristic methods are partial search algorithms that may provide a sufficiently good solution to an optimization problem, especially with incomplete or imperfect information or limited computation capacity. Direct search methods are derivative-free methods that are based on the sequential examination of trial points generated by a certain strategy. They are attractive as they are easy to describe and implement. More importantly, they are suitable for objective functions where gradients do not exist everywhere. Gradient-based approaches (or stochastic approximation methods) attempt to optimize the objective function using estimated gradient information.
These methods aim to imitate the steepest descent methods of derivative-based optimization. Finite difference schemes can be used to estimate gradients, but they may involve a large number of expensive function evaluations if the number of decision variables is large. Response surface methods are useful in the context of continuous optimization problems. They focus on learning input-output relationships to approximate the underlying simulation by a predefined functional form (also known as a metamodel or surrogate model). This functional form can then be used for optimization, leveraging powerful derivative-based optimization techniques.
In this study, we use seven representative algorithms belonging to these four classes of SBO methods to address the aforementioned path choice and train capacity calibration problem. Table 1 summarizes the main characteristics of these algorithms.
In the discussion that follows, let Θ be the combined vector of β and θ (i.e., Θ = [β, θ] is the vector of all coefficients to be estimated). Let N be the dimension of Θ (i.e., Θ ∈ R^N).

Genetic Algorithm (GA).
GA is a heuristic method for solving both constrained and unconstrained optimization problems, which belongs to the larger class of evolutionary algorithms inspired by natural selection, the process that drives biological evolution. The GA repeatedly modifies a population of individual solutions in an evolutionary process [26]. The GA can be used to solve a variety of optimization problems that are not well suited for standard optimization algorithms, such as SBO problems where the objective function (or constraints) is nondifferentiable and highly nonlinear.
The evolution starts from a population of randomly generated individuals and is an iterative process, with the population in each iteration called a generation. In each generation, the genetic algorithm selects individuals at random from the current population to be parents and uses them to produce the children for the next generation. Over successive generations, the population "evolves" toward an optimal solution. The genetic algorithm uses three main procedures at each step to create the next generation from the current population: (1) selection: select the individuals, called parents, that contribute to the population of the next generation; individuals with better objective function values are more likely to be selected; (2) crossover: combine two parents to form children for the next generation; (3) mutation: apply random changes to individual parents to form children.
In this study, we adopted blend crossover and Gaussian mutation methods. The probability of crossover is set to 0.8 and the probability of mutation is set to 0.4. The population size is set to 6 given the limited computational budget. The algorithm is implemented using the Python deap package [19].
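A minimal GA sketch with the operators described above (elitist selection, blend crossover, Gaussian mutation, population of 6), shown on a toy quadratic rather than the actual calibration objective; this is an illustrative reimplementation, not the deap-based code used in the study:

```python
import random

def genetic_algorithm(obj, bounds, pop_size=6, generations=100,
                      p_cross=0.8, p_mut=0.4, sigma=0.5, seed=0):
    """Minimal GA with elitism, blend (BLX-style) crossover, and
    Gaussian mutation, clipped to the box bounds."""
    rng = random.Random(seed)
    clip = lambda v: [min(max(x, lo), hi) for x, (lo, hi) in zip(v, bounds)]
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=obj)
        children = pop[:2]                 # elitism: keep the two best
        while len(children) < pop_size:
            a, b = rng.sample(pop[:3], 2)  # parents drawn from the fittest
            if rng.random() < p_cross:     # blend crossover (extrapolating)
                alpha = rng.uniform(-0.5, 1.5)
                child = [alpha * x + (1 - alpha) * y for x, y in zip(a, b)]
            else:
                child = list(a)
            if rng.random() < p_mut:       # Gaussian mutation
                child = [c + rng.gauss(0, sigma) for c in child]
            children.append(clip(child))
        pop = children
    return min(pop, key=obj)

# Toy quadratic with known minimum at (1, -2).
best = genetic_algorithm(lambda x: (x[0] - 1) ** 2 + (x[1] + 2) ** 2,
                         bounds=[(-5, 5), (-5, 5)])
```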

Simulated Annealing (SA)
SA is a heuristic method for solving optimization problems [27]. The method is based on the physical process of heating a material and then slowly lowering the temperature to decrease defects, thus minimizing the system energy.
At each iteration of the SA algorithm, a new point is randomly generated. The distance of the new point from the current point, or the extent of the search, is based on a probability distribution with a scale proportional to the temperature. A distorted Cauchy-Lorentz visiting distribution is used in this study [20]. The algorithm accepts not only all new points that lower the objective function but also, with a certain probability, points that raise the objective function. By accepting points that raise the objective function, the algorithm avoids being trapped in local minima. An annealing schedule is selected to systematically decrease the temperature as the algorithm proceeds. As the temperature decreases, the algorithm reduces the extent of its search to converge to a minimum.
In this study, the SA algorithm in the Python SciPy package is adopted for the implementation, with all model parameters set to their defaults [28].
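The accept/reject logic of SA can be sketched as follows (a minimal version with a Gaussian rather than Cauchy-Lorentz visiting distribution and a simple geometric cooling schedule; illustrative only, not the SciPy implementation used in the study):

```python
import math
import random

def simulated_annealing(obj, x0, bounds, t0=1.0, cooling=0.99,
                        iters=1000, seed=0):
    """Minimal SA: propose a neighbor whose step scale is proportional
    to the temperature; always accept improvements and accept worse
    points with probability exp(-delta / T)."""
    rng = random.Random(seed)
    x, fx = list(x0), obj(x0)
    best, fbest = list(x), fx
    t = t0
    for _ in range(iters):
        cand = [min(max(xi + rng.gauss(0, t), lo), hi)
                for xi, (lo, hi) in zip(x, bounds)]
        fc = obj(cand)
        if fc < fx or rng.random() < math.exp(-(fc - fx) / max(t, 1e-12)):
            x, fx = cand, fc  # accept the (possibly uphill) move
            if fx < fbest:
                best, fbest = list(x), fx
        t *= cooling  # geometric annealing schedule
    return best, fbest

best, fbest = simulated_annealing(lambda v: (v[0] - 2) ** 2,
                                  x0=[-4.0], bounds=[(-5, 5)])
```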

Nelder-Mead Simplex Algorithm (NMSA).
NMSA is a simplex method for finding a local minimum [29]. NMSA in N dimensions maintains a set of N + 1 test points arranged as a simplex. Denote the initial value of Θ as Θ_ini. The initial simplex set (N + 1 points) is generated as {Θ_ini, Θ_ini + σe_1, ..., Θ_ini + σe_N}, where e_i ∈ R^N is the unit vector in the ith coordinate and σ is the step size, which is set to 0.05 in this study [21].
Based on the initial simplex, the model evaluates the objective function for each test point, in order to find a new test point to replace one of the old test points.
The new candidate can be generated through simplex centroid reflections, contractions, or other means depending on the function values of the test points. The process generates a sequence of simplexes for which the function values at the vertices get smaller and smaller. The size of the simplex is reduced, and finally, the coordinates of the minimum point are found.
Four possible operations, reflection, expansion, contraction, and shrink, are associated with the corresponding scalar parameters: α_1 (reflection), α_2 (expansion), α_3 (contraction), and α_4 (shrink). In this study, we set the values of these parameters as {α_1, α_2, α_3, α_4} = {1, 2, 0.5, 0.5}, as suggested in [21]. The algorithm is implemented using the Python SciPy package with all parameters set to their defaults. Since NMSA is designed for unconstrained problems, we turned the bounds on Θ into a large penalty term in the objective function for this algorithm. For more details regarding NMSA, one can refer to [21].
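The penalty treatment of the bounds can be sketched as follows, using SciPy's Nelder-Mead on a toy objective whose unconstrained minimum (3, 0.5) lies outside hypothetical bounds [0, 2]^2 (illustrative only):

```python
import numpy as np
from scipy.optimize import minimize

def penalized(obj, lb, ub, penalty=1e6):
    """Wrap a bound-constrained objective for the unconstrained NMSA:
    points outside [lb, ub] incur a large linear penalty."""
    lb, ub = np.asarray(lb, float), np.asarray(ub, float)
    def wrapped(x):
        violation = np.sum(np.maximum(lb - x, 0.0) + np.maximum(x - ub, 0.0))
        return obj(x) + penalty * violation
    return wrapped

obj = lambda x: (x[0] - 3.0) ** 2 + (x[1] - 0.5) ** 2
res = minimize(penalized(obj, lb=[0.0, 0.0], ub=[2.0, 2.0]),
               x0=[1.8, 0.6], method="Nelder-Mead")
# The penalized minimum sits on the boundary, near (2, 0.5).
```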

Mesh Adaptive Direct Search (MADS).
The MADS algorithm is a direct search framework for nonlinear optimization [30]. It seeks to improve the current solution by testing points in the neighborhood of the current point (the incumbent).
The neighborhood points are generated by moving one step in each direction from the incumbent on an iteration-dependent mesh. Each iteration of MADS consists of a SEARCH stage and an optional POLL stage. The SEARCH stage evaluates a finite number of points proposed by the search strategy (e.g., moving one step around the current point). Whenever the SEARCH step fails to generate an improved mesh point, the POLL step is invoked. The POLL step conducts local exploration near the current incumbent, which also intends to find an improved point on the mesh. Once an improved point is found, the algorithm updates the current point and constructs a new mesh. According to [30], the mesh size parameters approach zero as the number of iterations approaches infinity, which demonstrates the convergence of the MADS algorithm.
In this paper, we use a variant of the MADS method called ORTHO-MADS, which leverages a special orthogonal positive spanning set of polling directions. More details regarding the algorithm can be found in [22]. NOMAD 3.9.1 [31] with the Python interface is used for the MADS implementation. The hyperparameters are tuned based on the NOMAD user guide. The direction type is set to orthogonal, with N + 1 directions generated at each poll. Latin hypercube search is not applied.

Simultaneous Perturbation Stochastic Approximation (SPSA). SPSA is a descent direction method for finding local minima. It approximates the gradient with only two measurements of the objective function, regardless of the dimension of the optimization problem. Denote the objective function in equation (4) as Z(Θ).
The estimated parameter vector at the kth iteration is denoted as Θ^(k). Then, one SPSA iteration is performed as

Θ^(k+1) = Θ^(k) − a_k · ĝ_k(Θ^(k)),    (6)

where the ith element of the gradient estimate ĝ_k and the gain sequences are

ĝ_{k,i}(Θ^(k)) = [Z(Θ^(k) + c_k Δ_k) − Z(Θ^(k) − c_k Δ_k)] / (2 c_k Δ_{k,i}),
a_k = a / (A + k + 1)^α,  c_k = c / (k + 1)^γ,    (7)

where Δ_k is a random perturbation vector whose elements are obtained from a Bernoulli distribution with probability parameter equal to 0.5. {α, γ, a, c, A} are tuned as {0.602, 0.101, 0.001, 0.007, 0.1M} in this study according to numerical tests and guidelines from prior empirical studies of Gomez-Dans [32]. M is the maximum number of iterations.
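A minimal sketch of the two-measurement gradient approximation and parameter update described above (toy quadratic objective; the gain values a, c, and A below are illustrative, not the tuned values reported in the text):

```python
import random

def spsa(obj, theta, a=0.16, c=0.1, A=10, alpha=0.602, gamma=0.101,
         iters=300, seed=0):
    """SPSA: estimate the gradient from only two objective evaluations
    per iteration using a simultaneous +-1 random perturbation."""
    rng = random.Random(seed)
    theta = list(theta)
    for k in range(iters):
        ak = a / (A + k + 1) ** alpha        # decaying step size
        ck = c / (k + 1) ** gamma            # decaying perturbation size
        delta = [rng.choice([-1, 1]) for _ in theta]  # Bernoulli(0.5) +-1
        plus = [t + ck * d for t, d in zip(theta, delta)]
        minus = [t - ck * d for t, d in zip(theta, delta)]
        g = (obj(plus) - obj(minus)) / (2.0 * ck)
        theta = [t - ak * g / d for t, d in zip(theta, delta)]
    return theta

# Toy quadratic with minimum at (1, -1).
theta_hat = spsa(lambda v: (v[0] - 1) ** 2 + (v[1] + 1) ** 2, [4.0, 4.0])
```

Note that each iteration costs only two evaluations of Z regardless of the dimension of Θ, which is what makes SPSA attractive for expensive simulators.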

Bayesian Optimization (BYO)
BYO constructs a probabilistic model of the objective function and exploits this model to determine where to evaluate the objective function next. The philosophy of BYO is to use all of the information available from previous evaluations, instead of relying only on local gradient and Hessian approximations. This enables BYO to find the minimum of difficult nonconvex functions with relatively few function evaluations.
BYO assumes a prior distribution for the objective function values and uses an acquisition function to determine the next point to evaluate. In this study, we use a Gaussian process as the prior distribution for the objective function due to its flexibility and tractability. For the acquisition function, we tested three common criteria: probability of improvement (POI), expected improvement (EI), and upper confidence bound (UCB) [24]. The EI criterion is used in this calibration problem due to its best performance on our problem. BYO is implemented in Python with the bayes_opt package. More details regarding BYO can be found in [24].
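The GP-prior-plus-EI loop can be sketched from scratch as follows (1-D toy function, RBF kernel with an arbitrary unit length scale; this illustrates the mechanism, not the bayes_opt implementation used in the study):

```python
import numpy as np
from scipy.stats import norm

def gp_posterior(X, y, Xq, length=1.0, noise=1e-6):
    """Gaussian process regression with an RBF kernel: posterior mean
    and standard deviation at the query points Xq."""
    k = lambda a, b: np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / length ** 2)
    K_inv = np.linalg.inv(k(X, X) + noise * np.eye(len(X)))
    Ks = k(X, Xq)
    mu = Ks.T @ K_inv @ y
    var = 1.0 - np.sum(Ks * (K_inv @ Ks), axis=0)
    return mu, np.sqrt(np.maximum(var, 1e-12))

def expected_improvement(mu, sigma, best):
    """EI acquisition for minimization: expected amount by which a
    candidate improves on the best observed value."""
    z = (best - mu) / sigma
    return (best - mu) * norm.cdf(z) + sigma * norm.pdf(z)

def bayes_opt(obj, lo, hi, n_init=3, n_iter=15, seed=0):
    rng = np.random.default_rng(seed)
    X = rng.uniform(lo, hi, n_init)            # initial random design
    y = np.array([obj(x) for x in X])
    grid = np.linspace(lo, hi, 200)            # candidate evaluation points
    for _ in range(n_iter):
        mu, sigma = gp_posterior(X, y, grid)
        x_next = grid[np.argmax(expected_improvement(mu, sigma, y.min()))]
        X, y = np.append(X, x_next), np.append(y, obj(x_next))
    return X[np.argmin(y)], y.min()

x_best, y_best = bayes_opt(lambda x: (x - 0.7) ** 2, lo=0.0, hi=2.0)
```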

Constrained Optimization Using Response Surfaces (CORS).
CORS is a response surface method for global optimization. In each iteration, it updates the response surface model based on all previously probed points and selects the next point to evaluate. The principles for selecting the next point are (a) finding new points that have lower objective function values and (b) improving the fit of the response surface model by sampling feasible regions where little information exists. Hence, the next point is selected by solving the minimization problem of the current response surface function subject to constraints that the next point should be more than a certain distance away from all previous points [25].
An algorithm following the CORS framework requires two components: (a) a scheme for selecting an initial set of points for objective function evaluation and (b) a procedure for globally approximating the objective function (i.e., a response surface model). In this study, the initial sampling is conducted using Latin hypercube methods, with the number of initial samples equal to 0.2 × the total number of function evaluations allowed.
The radial basis function (RBF) is used as the response surface model. For the subsequent sampling, a modified version of the CORS algorithm with space rescaling is used. Details about the algorithm can be found in [25, 33].
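The response surface component can be sketched as follows (a Gaussian RBF interpolant through the probed points; the CORS point-selection step with distance constraints is omitted, and the shape parameter is arbitrary):

```python
import numpy as np

def fit_rbf(X, y, eps=1.0):
    """Fit an interpolating RBF surface s(x) = sum_i w_i exp(-eps ||x - x_i||^2)
    through all previously probed points (X, y)."""
    sq_dist = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    weights = np.linalg.solve(np.exp(-eps * sq_dist), y)
    def surface(Xq):
        d2 = np.sum((Xq[:, None, :] - X[None, :, :]) ** 2, axis=-1)
        return np.exp(-eps * d2) @ weights
    return surface

# Probed points of a toy 1-D objective and the fitted surface.
X = np.array([[0.0], [0.5], [1.0], [1.5], [2.0]])
y = (X[:, 0] - 0.8) ** 2
surface = fit_rbf(X, y)
preds = surface(X)  # an interpolant reproduces the probed values
```

In CORS, minimizing this cheap surface (subject to the distance constraints) replaces direct minimization of the expensive simulation.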

Case Study
The proposed modeling framework is tested using data from the Hong Kong MTR network. MTR is the operator of the Hong Kong urban rail network, which provides services for the urbanized areas of Hong Kong Island, Kowloon, and the New Territories. The system currently consists of 11 lines with 218.2 km (135.6 miles) of rail, serving 159 stations, including 91 heavy rail stations and 68 light rail stops. It serves over 5 million trips on an average weekday. Most passengers use a smart card fare payment system named Octopus. For the urban heavy rail lines, trip transactions are recorded when passengers enter and exit the system, providing information about the tap-in and tap-out stations and the corresponding timestamps.

Experimental Design.
We use AFC data from a typical weekday afternoon peak period (18:00-19:00) in March 2017 for the model application. Li [34] conducted a revealed-preference (RP) path choice survey of more than 20,000 passengers in the MTR system and used it to estimate a path choice model. The estimation results are shown in Appendix A. The following attributes were used in the specification of the model: (a) total in-vehicle time, (b) the number of transfers, (c) relative walking time (total walking time divided by total path distance), and (d) the commonality factor (equation (2)). Future research may consider more path choice attributes such as perceived crowding levels and estimated waiting times.
As the real-world path choice information and train capacity are usually unavailable, we validate the models with synthetic data. To generate the synthetic data, we first extract the OD entry flow (q_{i_m,j}) from the real-world AFC records. We assume a synthetic Θ as the "true" path choice and train capacity parameters. The TNL model, with the true OD entry flow, train timetable, and synthetic Θ as inputs, is used to simulate the travel of passengers in the system and record their tap-in and tap-out times.
The input timetable is treated as the synthetic AVL data. The resulting passengers' tap-in and tap-out times are treated as the synthetic AFC data. The synthetic data, including the "true" passenger path choices and train capacity, are used to evaluate the performance of the model under the various solution algorithms. All OD pairs of the whole network are considered in the experiments.
To compare the different SBO solution algorithms, we design five test scenarios, summarized in Table 2. Each scenario has a different synthetic Θ. The selection of the synthetic Θ can represent different assumptions about passengers' choice behavior and sensitivity to crowding. For the reference scenario, we use the path choice parameters in Table 3 as the synthetic β and the estimated train capacity parameters in [13] as the synthetic θ.
Passengers' actual path choice behavior is assumed to be either random (each path is equally likely to be selected) or deterministic. For the random path choice scenario, we set all synthetic choice parameters to 0, which means all available paths are equally likely to be chosen. For the deterministic path choice scenario (the word "deterministic" here merely indicates a low degree of randomness; truly deterministic choice corresponds to all parameters going to −∞), we set all synthetic choice parameters to their lower bounds (i.e., the maximum absolute values possible). Under this scenario, a slight difference in attributes between two paths can lead to a large difference in choice probability (i.e., this is close to passengers following the shortest path). As for the train capacity, the synthetic θ for these two scenarios is the same as in the reference scenario.
Passengers' sensitivity to crowding may also vary. If passengers are not sensitive to crowding, train capacity can be modeled as a fixed value. However, if passengers become more sensitive to crowding, the actual train capacity may largely depend on the crowding levels in the train and on the platform. Therefore, passengers' sensitivity to crowding can be reflected by the scale of θ1 and θ2 [13]. For the crowding-sensitive scenario, we set the synthetic train capacity parameters as θ0 = 225, θ1 = 0.2, and θ2 = 0.2. Compared to the reference scenario, θ1 and θ2 are higher to represent higher sensitivity, and θ0 is decreased to offset the capacity increase caused by the increase of θ1 and θ2. For the crowding-insensitive scenario, we set the synthetic train capacity parameters as θ0 = 235, θ1 = 0, and θ2 = 0, which can be seen as a fixed-capacity model.
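A minimal sketch of a crowding-dependent boarding rule consistent with this parameterization, assuming effective capacity is linear in on-board and platform crowding (the exact functional form in [13] may differ):

```python
def effective_train_capacity(theta0, theta1, theta2, n_onboard, n_platform):
    """Effective train capacity as a function of crowding: a base capacity
    theta0 plus terms that grow with on-board and platform crowding.
    Setting theta1 = theta2 = 0 recovers a fixed-capacity model."""
    return theta0 + theta1 * n_onboard + theta2 * n_platform

def board(n_onboard, n_platform, theta):
    """Split waiting passengers into those who board and those denied
    boarding, given the current train load and platform queue."""
    capacity = effective_train_capacity(*theta, n_onboard, n_platform)
    space = max(0, int(capacity) - n_onboard)
    n_board = min(n_platform, space)
    return n_board, n_platform - n_board  # (boarded, left behind)
```

With the crowding-insensitive parameters (235, 0, 0), a train carrying 230 passengers admits only 5 more regardless of the platform queue; with the crowding-sensitive parameters (225, 0.2, 0.2), the same train can absorb a larger share of a crowded platform.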

Case Study Settings.
The lower and upper bounds of all parameters (L_β, U_β, L_θ, U_θ) are shown in Table 2. Θ_ini is set to (L_Θ + U_Θ)/2 for all scenarios. To compare the different algorithms, a fixed computational budget of 100 function evaluations is applied to all of them. All algorithms except NMSA (a deterministic algorithm) are replicated 5 times (with different random seeds) to decrease the impact of randomness.
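The evaluation protocol above (fixed budget, seeded replications for stochastic algorithms, a single run for deterministic ones) can be sketched as a small harness. The solver interface and the toy random-search solver are assumptions for illustration, not the paper's actual implementations.

```python
import random
from statistics import mean, stdev

def run_benchmark(algorithms, objective, theta_init, budget=100, n_reps=5):
    """Compare SBO solvers under a fixed budget of objective evaluations.
    `algorithms` maps name -> (solver, is_deterministic). Deterministic
    solvers (e.g. NMSA) run once; stochastic ones are replicated with
    different seeds. Returns name -> (mean best value, std over reps)."""
    summary = {}
    for name, (solver, deterministic) in algorithms.items():
        reps = 1 if deterministic else n_reps
        best = [solver(objective, theta_init, budget, seed=s) for s in range(reps)]
        summary[name] = (mean(best), stdev(best) if reps > 1 else 0.0)
    return summary

def random_search(objective, theta_init, budget, seed=0):
    """Toy stochastic solver matching the assumed interface."""
    rng = random.Random(seed)
    best = objective(theta_init)
    for _ in range(budget):
        candidate = [t + rng.uniform(-1.0, 1.0) for t in theta_init]
        best = min(best, objective(candidate))
    return best
```

For example, `run_benchmark({"RS": (random_search, False)}, obj, theta_init)` returns the mean and standard deviation of the best objective value over five seeded replications, mirroring the averaging used in the figures below.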

Reference Scenario Results.
The convergence results for the reference scenario are depicted in Figure 3. Each point represents the average value over all replications. We found that the performance of the different algorithms varied. Given the limited number of function evaluations, CORS, BYO, and SPSA converge to relatively small objective function values. GA, MADS, and SA have relatively large objective function values upon termination. In terms of convergence speed, the response surface methods (BYO and CORS) converge fastest.
They also reach the lowest objective function values. This is consistent with conclusions regarding the performance of SBO algorithms in the transportation domain [17, 35-37].
Figure 3 also summarizes the stability of the algorithms. The vertical line indicates 1/4× the standard deviation over the five replications. NMSA is a deterministic algorithm and is not affected by randomness. BYO and CORS show high randomness in the first half of the iterations. However, as the number of function evaluations increases, the standard deviation of the objective function decreases and the results become stable. GA, SA, and MADS are unstable compared to the other algorithms. This suggests that the heuristic algorithms (GA and SA) are not suitable for the calibration problem studied in this paper. The instability of MADS may be because it can converge to nonstationary points [38].

Table 4 compares the parameters estimated by the different algorithms with the synthetic ones. Although some algorithms reach similar objective function values, they result in different estimated parameters. For example, CORS and SPSA have similar objective function values; however, SPSA performs better in path choice estimation, while CORS performs better in train capacity estimation. We also observe that the train capacity parameters are relatively harder to estimate. This may be because most of the stations in the rail system are not congested and all passengers can board the trains. Thus, the objective function is not very sensitive to the train capacity parameters.

Impact of Randomness in Path Choice Behavior.
Figure 4 shows the estimation results for the two path choice-related scenarios: random and deterministic. The estimated parameters are shown in Tables 5 and 6. For the random scenario, all "true" (synthetic) path choice parameters are set to zero, which means all paths are equally likely to be chosen. We observe that, in this scenario (Figure 4(a)), the CORS and SA algorithms perform best, with the lowest objective function values. Compared to the reference scenario in Section 4.3, the decreased performance of BYO and SPSA may be due to the "true" β being close to the upper bound (U_β = 0); the Gaussian posterior distribution in BYO and the gradient estimation in SPSA can suffer from instability near the boundary. From Table 5, we observe that the parameters of in-vehicle time and number of transfers are better estimated than those of relative walking time and the commonality factor.

Figure 4(b) shows the results of the deterministic scenario. The initial objective function value is relatively small (1.5 × 10^5) compared to the reference scenario (1.5 × 10^6). All algorithms reduce the objective function by only around one-third, except for CORS. The good performance of CORS may come from its global search with the Latin hypercube method, which is better suited to exploring points near the boundaries. Although the objective function does not decrease much, the estimated parameters are still acceptable (see Table 6).

Impact of Crowding Sensitivity.
Figure 5 shows the estimation results for the two scenarios related to train capacity (i.e., crowding-sensitive and crowding-insensitive). In the crowding-insensitive scenario (Figure 5(a)), the conclusions are similar to the reference scenario: CORS, BYO, NMSA, and SPSA converge to low objective function values and outperform the other algorithms.
The performance of NMSA and MADS is improved compared to the reference scenario. In the crowding-sensitive scenario, we still observe good performance from the CORS, NMSA, and SPSA algorithms; the performance of BYO is slightly reduced. The results shown in Tables 7 and 8 indicate that θ0 (the base capacity) is hard to estimate. This may be because trains at most stations do not reach capacity. Therefore, for many OD pairs, the OD exit flows (directly related to the objective function) are not sensitive to the base capacity parameter.

Conclusion
In this paper, we propose an SBO framework to simultaneously calibrate train capacity and path choice model parameters for metro systems using AFC and AVL data. The advantage of the proposed framework lies in capturing the collective effect of both path choices and train capacity on passenger journey times. Seven representative algorithms from four main branches of SBO methods are applied and compared with respect to solution accuracy, convergence speed, and stability. We applied the proposed framework using data from the Hong Kong MTR network and compared the performance of the different algorithms. Overall, the results show that some algorithms produce reasonable estimates of the parameters of interest. These results also support the effectiveness of the proposed SBO framework for calibrating these key parameters using AFC and AVL data. In particular, the response surface methods (particularly CORS) exhibit consistently good performance.
The SBO framework is flexible enough to accommodate a wide range of path choice and train capacity models in transit simulation models.
This paper has some limitations. First, we validate the framework and evaluate algorithmic performance only using synthetic AFC and AVL data. Therefore, the complexities of noise and uncertainty in actual data do not play any role. This is due to the absence of real-world path choice and train capacity information. Future research can collect real-world path choice and train capacity data to conduct more realistic model validation. Second, we assumed that path choice behavior is similar across the whole network (same β values). Given that real-world path choice behavior is likely more diverse and heterogeneous, future research can explore clustering different OD pairs with different β values based on individual mobility characteristics [39].

Figure 1 :
Figure 1: Relationship among critical parameters in urban rail systems.

Figure 2 :
Figure 2: Main functions of the event-based transit network loading model. (a) Train arrival. (b) Train departure.

Figure 3 :
Figure 3: Convergence results for the reference scenario. The error bar indicates 1/4× the standard deviation. NMSA has no error bar because it is a deterministic algorithm.

Figure 4 :
Figure 4: Algorithm performance in the two path choice scenarios. (a) Random. (b) Deterministic.

Table 3 :
Path choice model estimation results.

Table 4 :
Estimation results of the reference scenario.

Table 5 :
Estimation results of the random path choice scenario.

Table 6 :
Estimation results of the deterministic path choice scenario.

Table 7 :
Estimation results of the crowding-insensitive train capacity scenario.

Table 8 :
Estimation results of the crowding-sensitive train capacity scenario.