Calibration of Microscopic Traffic Flow Simulation Models considering Subsets of Links and Parameters

This study proposes a methodology for the calibration of microscopic traffic flow simulation models by enabling simultaneous selection of traffic links and associated parameters. The analyst selects any number and combination of links and model parameters for calibration. Most calibration methods consider the entire network and use ad hoc approaches without enabling a specific selection of location and associated parameters. In practice, only a subset of links and parameters is used for calibration based on several factors such as expert knowledge of the system or constraints imposed by local governance. In this study, the calibration problem for the simultaneous selection of links and parameters was formulated using a mathematical programming approach. The proposed methodology is capable of calibrating model parameters considering multiple time periods and performance measures simultaneously. Traffic volume and speed are the performance measures used in this study, and the methodology is developed without considering the characteristics of a specific traffic flow model. A genetic algorithm was implemented to find a solution to the proposed mathematical program. In the experiments, two traffic models were calibrated. The first set of experiments included selection of links only, while all associated parameters were considered for calibration. The second set of experiments considered simultaneous selection of links and parameters. The results of these experiments indicate that the models were calibrated successfully provided that a minimum number of links was selected. As expected, the more links and parameters that are used for calibration, the more time it takes to find a solution, but the overall results are better. All parameter values were reasonable and within constraints after successful calibration.


Introduction
Microscopic traffic flow simulation is increasingly being used to analyze complex scenarios for a broad range of objectives. One of the most important and challenging aspects of obtaining meaningful results is calibration, which involves adjusting the model parameters to enhance the ability of the model to reproduce local traffic conditions [1-3]. Existing calibration approaches propose various optimization algorithms and varying sets of calibration parameters. Both sequential and simultaneous calibration of model parameters have been proposed in the literature. The calibration approach provided by the Federal Highway Administration (FHWA) in Traffic Analysis Toolbox Volume IV suggests a sequential process of calibrating the capacity at key bottlenecks, traffic volumes, and system performance [2]. Using this approach, model parameters are adjusted by modifying global parameters first, then link parameters, and finally route choice parameters. Ma et al. [4] used a sequential approach to calibrate global and local parameters separately. Jha et al. [5] calibrated driver-behavior parameters separately from other parameters, such as route choice factors and origin-destination (O-D) flows. Paz et al. [3,6] used an iterative approach where one group of parameters was calibrated, while others remained fixed. Issues associated with the use of a sequential calibration process include difficulty in achieving convergence and stable solutions [6].
Many mathematical programming formulations have been proposed to characterize and solve the problem of calibrating simulation-based traffic flow models. A simplex algorithm was proposed to calibrate microscopic traffic flow simulation models using intelligent transportation system data [7]. The proposed algorithm was very effective for congested conditions compared to simple manual calibration techniques. However, this effectiveness decreased as congestion decreased. The proposed approach considers only a single objective: minimizing the difference between observed and estimated volume. In practice, multiple objectives are likely to be required.
Various genetic algorithms (GAs) have been proposed to calibrate microscopic simulation models [4, 5, 8-14] with successful results and relatively fast convergence. Yang et al. [15] proposed an orthogonal genetic algorithm (OGA) that provided superior results when compared to a GA; however, only a few parameters were calibrated. In contrast, a GA was found to converge relatively quickly for simulation models with many parameters [10]. Simultaneous perturbation stochastic approximation (SPSA) algorithms have also been widely used to calibrate microscopic simulation models [4, 16-19]. SPSA was found to provide a similar level of accuracy while requiring fewer iterations and less computation time than GAs and the trial-and-error iterative adjustment (IA) algorithms [16]. A memetic algorithm (MA) was found to be superior to an SPSA algorithm because the fine-tuning process required was significantly quicker for the MA [3]. Cobos et al. [20] found that when an MA was adapted using Solis and Wets local search chains (MA-SW-Chains), the results provided better and faster convergence compared to both SPSA and MA. A multiobjective MA based on NSGA-II and simulated annealing (NSGA-II-SA) also offered better results for runtime and convergence compared to a single-objective MA [21]. Considering that the performance of the calibration process and the time invested in finding the correct set of hyperparameters are correlated and affected by the characteristics of each metaheuristic, 17 alternative algorithms including multi- and mono-objective approaches were evaluated [22]. An adaptation of the global-best harmony search provided the best results considering both stability and dominance.
Microscopic traffic flow simulation models use car-following and lane-changing theories to represent vehicle interactions and driver-behavior dynamics [2,23]. Typically, calibration parameters are related to driver characteristics, such as car-following behavior and gap acceptance. Balakrishna et al. [16] proposed the calibration of demand-and-supply parameters simultaneously. However, the calibration was performed only with link counts and used precalibrated values for the driver-behavior parameters. Cheu et al. [8] used parameters such as free-flow speeds, car-following distance, car-following sensitivity factors, lag-to-accelerate/decelerate factors, and lane-changing factors. Results showed that free-flow speeds, car-following distance, and car-following sensitivity factors had the most effect and are important for calibration; thus, calibration could be performed using only these three parameters. Ma and Kim [4,13] considered calibration parameters that were associated with acceleration/deceleration, car-following, and lane-changing behaviors. The lane-change probability and car-following distance were found to have relatively close calibrated and default values, suggesting that calibration could be performed without the inclusion of these parameters. Performance measures after calibration showed consistency with actual field values; however, no standard criteria for calibration were defined. Paz et al. [3,6] calibrated microscopic traffic flow models by taking into consideration the entire set of model parameters simultaneously. The simultaneous selection of all parameters was motivated by the need to seek convergence and stability of the solutions. All parameters were treated equally, and a subset of parameters that may significantly affect a traffic model was not identified. Kim [13] used a bilevel framework to calibrate driver-behavior parameters and O-D demand simultaneously. The calibration was performed only on a congested network.
State-of-the-art methods take into consideration sets of links and parameters for calibration without providing flexibility for selecting or constraining the search space in terms of where and what to use to fine-tune the traffic flow simulation model. In practice, only a subset of links and parameters can be used for calibration; for example, certain links of a network may be precalibrated, and/or default or prespecified values are required by local governance. That is, development and calibration may be restricted to adjusting only a subset of all the potentially available parameters in a traffic flow model. Based on local knowledge and experience, key parameters and specific traffic facilities are selected or allowed for calibration [13,24]. While a large number of parameters increases computational complexity, identifying a subset of important parameters mitigates this problem and increases the ability of an algorithm to find a global optimum [25]. When all the parameters are calibrated simultaneously, lesser-known parameters may yield values that are unexplainable or inconsistent with real-world traffic behavior.
Unlike traditional approaches that either involve a sequential process or consider all parameters simultaneously, this research proposes a methodology that enables the simultaneous selection of specific links/facilities and parameters for calibration. That is, any combination of traffic facilities and model parameters within each facility can be selected simultaneously for calibration. Local and global calibration parameters were taken into consideration. The capability of selecting where and what to calibrate was motivated by requirements to use local knowledge and governance in order to select parameters for calibration. This is of practical and theoretical importance, and these analyses and associated insights are missing in the literature. Our experiments illustrate the consequences of selecting only a subset rather than all parameters.

Methods
The calibration methodology used in this study was adapted from Paz et al. [3]. This modified approach has the capability to select links and model parameters. The calibration problem was formulated using a mathematical programming approach. The normalized root-mean-square (NRMS) error, which was the objective function for this study, measured the relative difference between actual and simulated traffic volumes and speeds. Normalization allowed multiple performance measures to be considered simultaneously [3].

Problem Formulation
Notation and Terms. In this study, any number and combination of local and global parameters could be selected for calibration. Indicator variables $\delta_k^p$ and $\delta^g$ were used to define which parameters were selected. The following notation and terms are used in this study:
$K$: set of links selected for calibration
$k$: subscript for a link selected for calibration, $k \in K$
$P$: set of local model parameters
$p$: superscript for a local model parameter, $p \in P$
$\alpha_k^p$: local parameter $p$ on link $k$ selected for calibration, $\forall k \in K$ and $p \in P$
$\alpha_k$: set of local parameters on link $k$ selected for calibration, $\forall k \in K$
$\delta_k^p$: indicator variable for local parameter $p$ on link $k$ selected for calibration, $\forall k \in K$, $p \in P$, with $\delta_k^p = 1 \Leftrightarrow \alpha_k^p \in \alpha_k$; otherwise, $\delta_k^p = 0$
$\alpha$: set of local parameters selected for calibration, $\alpha \subseteq P$
$G$: set of global model parameters
$g$: superscript for a global model parameter, $g \in G$
$\beta^g$: global parameter $g$ selected for calibration, $\forall g \in G$
$\beta$: set of global parameters selected for calibration, $\beta \subseteq G$
$\delta^g$: indicator variable for global parameter $g$ selected for calibration, $\forall g \in G$, with $\delta^g = 1 \Leftrightarrow \beta^g \in \beta$; otherwise, $\delta^g = 0$
$\theta$: set of all parameters selected for calibration, $\theta = \alpha \cup \beta$
$L$: set of links with actual field data
$l$: subscript for a link with actual field data, $l \in L$
$T$: total number of time periods
$t$: subscript for a time period, $t \in T$
$V_{l,t}$: actual volume for link $l$ at time period $t$, $\forall t \in T$ and $l \in L$
$S_{l,t}$: actual speed for link $l$ at time period $t$, $\forall t \in T$ and $l \in L$
$W_v$: weight factor for volumes
$V(\theta)_{l,t}$: simulated volume for link $l$ at time period $t$, $\forall t \in T$ and $l \in L$
$S(\theta)_{l,t}$: simulated speed for link $l$ at time period $t$, $\forall t \in T$ and $l \in L$
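The role of the indicator variables can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that parameter values and indicators are stored in dictionaries; the function name `build_theta`, the link names, the parameter names, and the numeric values are all hypothetical and are not taken from the study.

```python
from typing import Dict, List, Tuple


def build_theta(
    local_values: Dict[Tuple[str, str], float],
    delta_local: Dict[Tuple[str, str], int],
    global_values: Dict[str, float],
    delta_global: Dict[str, int],
) -> List[Tuple[str, float]]:
    """Collect the parameters flagged for calibration into one vector theta."""
    theta: List[Tuple[str, float]] = []
    # Local parameters are keyed by (link k, parameter p); keep those with delta = 1.
    for (k, p), value in sorted(local_values.items()):
        if delta_local.get((k, p), 0) == 1:
            theta.append((f"{p}@{k}", value))
    # Global parameters are keyed by g; keep those with delta = 1.
    for g, value in sorted(global_values.items()):
        if delta_global.get(g, 0) == 1:
            theta.append((g, value))
    return theta


# Hypothetical values, for illustration only.
local_values = {
    ("link1", "headway"): 1.8, ("link1", "lost_time"): 2.0,
    ("link2", "headway"): 1.9, ("link2", "lost_time"): 2.2,
}
delta_local = {
    ("link1", "headway"): 1, ("link1", "lost_time"): 0,
    ("link2", "headway"): 1, ("link2", "lost_time"): 1,
}
global_values = {"familiarity": 0.1, "lane_change": 0.5}
delta_global = {"familiarity": 1, "lane_change": 0}

theta = build_theta(local_values, delta_local, global_values, delta_global)
```

Parameters whose indicator is 0 keep their default or prespecified values and never enter the search space.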

Mathematical Program.
The objective function and the calibration criteria were evaluated using the links $L$ for which actual field data were available.
(1) Objective Function. The objective was to minimize the normalized weighted root-mean-square (NRMS) error over the number of time periods ($T$) and links ($L$), subject to constraints (2) through (7). The bound constraints are
lower bound $\le \alpha_k^p \le$ upper bound, $\forall k \in K$, $p \in P$, (6)
lower bound $\le \beta^g \le$ upper bound, $\forall g \in G$. (7)
This NRMS error function measures the relative difference between the estimated and the actual volume and speed values. The values under the square root are the relative differences in volume and speed for all links selected for calibration that contained actual field data. The relative differences are multiplied by $W_v$ and $1 - W_v$ to account for the reliability of the volume and speed data. This difference is also measured for all considered time periods. The total error is normalized by dividing it by the square root of the number of links and time periods considered for calibration. The NRMS is based on a previous study [3], where this error function was used successfully to calibrate traffic flow models.
Constraints (2) and (3) ensured that the local parameters selected for calibration were included in vector $\theta$. Similarly, constraint (4) ensured that the global parameters selected for calibration were included in vector $\theta$. Constraint (5) was a definitional constraint for the calibration vector $\theta$. Constraints (6) and (7) provided the lower and upper bounds for each parameter selected for calibration.
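The verbal description of the NRMS error can be turned into a small Python sketch. The exact algebraic form used here is an assumption inferred from that description: for each time period, one square root of summed squared relative volume differences and one of summed squared relative speed differences, weighted by $W_v$ and $1 - W_v$, summed over time periods, and divided by the square root of the number of links times the number of time periods. The function names and the default weight of 0.7 are hypothetical.

```python
import math
from typing import Sequence


def nrms(
    actual_vol: Sequence[Sequence[float]],
    sim_vol: Sequence[Sequence[float]],
    actual_spd: Sequence[Sequence[float]],
    sim_spd: Sequence[Sequence[float]],
    w_v: float = 0.7,  # hypothetical weight; the text states only that volume is weighted higher
) -> float:
    """Assumed NRMS form; inputs are indexed as [time period][link]."""
    n_periods = len(actual_vol)
    n_links = len(actual_vol[0])
    total = 0.0
    for t in range(n_periods):
        # Sum of squared relative differences over links with field data.
        vol_sq = sum(
            ((actual_vol[t][l] - sim_vol[t][l]) / actual_vol[t][l]) ** 2
            for l in range(n_links)
        )
        spd_sq = sum(
            ((actual_spd[t][l] - sim_spd[t][l]) / actual_spd[t][l]) ** 2
            for l in range(n_links)
        )
        total += w_v * math.sqrt(vol_sq) + (1.0 - w_v) * math.sqrt(spd_sq)
    # Normalize by the square root of (links x time periods).
    return total / math.sqrt(n_links * n_periods)


def within_bounds(value: float, lower: float, upper: float) -> bool:
    """Bound check corresponding to constraints (6) and (7)."""
    return lower <= value <= upper


# One time period, one link: 10% volume error, no speed error.
example = nrms([[100.0]], [[90.0]], [[50.0]], [[50.0]], w_v=0.7)
```

With these inputs the volume term is 0.1 and the speed term is 0, so the example evaluates to about 0.07 under the assumed form.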

Calibration Criteria.
The calibration criteria are based on guidelines provided by the FHWA [2]. For individual links, in more than 85% of cases, the difference between actual and simulated counts should be
(i) Within 100 vehicles/hour for link volumes less than 700 vehicles/hour
(ii) Within 15% of field flow for link volumes between 700 and 2700 vehicles/hour
(iii) Within 400 vehicles/hour for link volumes greater than 2700 vehicles/hour
The sum of all simulated link count errors should be within 5% of all actual link counts. The GEH statistic for individual link flows should be less than 5 for more than 85% of cases [1,2]. The GEH statistic is given by
$$\mathrm{GEH} = \sqrt{\frac{2\,\bigl(V_l - V(\theta)_l\bigr)^2}{V_l + V(\theta)_l}},$$
where $V_l$ is the actual traffic volume for link $l$ and $V(\theta)_l$ is the corresponding simulated traffic volume.
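As a rough illustration, the criteria above could be checked with the following Python sketch. The GEH function uses the standard definition of the statistic. The function names are hypothetical, and the 5% rule is interpreted here, as an assumption, as the absolute difference between total simulated and total actual counts relative to the total actual counts.

```python
import math
from typing import Sequence


def geh(actual: float, simulated: float) -> float:
    """Standard GEH statistic for one link (hourly volumes)."""
    if actual + simulated == 0:
        return 0.0
    return math.sqrt(2.0 * (actual - simulated) ** 2 / (actual + simulated))


def link_count_ok(actual: float, simulated: float) -> bool:
    """Per-link count criterion, by volume range (vehicles/hour)."""
    diff = abs(actual - simulated)
    if actual < 700:
        return diff <= 100
    if actual <= 2700:
        return diff <= 0.15 * actual
    return diff <= 400


def meets_criteria(actual: Sequence[float], simulated: Sequence[float]) -> bool:
    """True when all three criteria hold for the given links."""
    n = len(actual)
    pairs = list(zip(actual, simulated))
    share_counts = sum(link_count_ok(a, s) for a, s in pairs) / n
    share_geh = sum(geh(a, s) < 5 for a, s in pairs) / n
    # Assumed reading of the 5% rule: compare network-wide totals.
    total_error = abs(sum(simulated) - sum(actual)) / sum(actual)
    return share_counts > 0.85 and share_geh > 0.85 and total_error <= 0.05


calibrated = meets_criteria([500, 1000, 3000], [520, 950, 3100])
```

In the genetic algorithm, a check of this kind can serve as the convergence test, since convergence is declared when the calibration criteria are met.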

Solution Algorithm.
The proposed mathematical program, as expressed in equations (1) through (7), was solved using a GA, which searches for solutions while trying to avoid stopping at local optima and seeking to increase the probability of locating a global optimum [4, 5, 8-14, 26, 27]. In the context of the GA, a population is initially generated at random. An individual (chromosome) in a population is composed of a set of calibration parameter values (genes) that represent a viable solution. Table 1 provides an example of an individual or chromosome used in this study. The parameters to be calibrated are organized into an array where specific positions are associated with certain links. The quality of the resulting solution is evaluated by a fitness or objective function, as in equation (1). The GA creates successive generations of individuals, and the best individuals are stored to create a new population. The implemented GA expands the one proposed by Paz et al. [3] to address constraints (2)-(7) and thereby enable selection of links and calibration parameters. Figure 1 provides a flowchart of the GA solution algorithm.
Algorithmic steps are as follows:
Step 1 (initialization): an initial population of θ vectors is randomly generated but constrained to lower and upper bounds in order to maintain model realism.
Step 2 (parents' selection): the best 60% of θ vectors from the initial population are saved. Then, sets of θ are generated that represent parents in the population and are paired using a "roulette wheel selection."
Step 3 (crossover): a crossover is performed at 50%. This process combines parent θ vectors in order to generate new sets of calibration parameters (i.e., offspring).
Step 4 (mutation): approximately 30% of the parameters of each offspring are subjected to small perturbations (±1%) in order to search neighboring solutions.
Step 5 (population management strategy): the new offspring θ replaces the worst θ vectors when the new θ provides a better fitness than older θ vectors.
Step 6 (stopping criteria): if the stopping criterion is met, the best θ is returned, and the algorithm ends. Otherwise, it returns to Step 2. The stopping criterion is met by reaching convergence or a prespecified maximum number of generations/iterations. Convergence is reached when the calibration criteria listed above are met.
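The six steps can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the implementation used in the study: a toy quadratic objective stands in for a simulation run, "crossover at 50%" is interpreted as uniform crossover with probability 0.5 per gene, roulette weights are taken as the inverse of the objective value, one offspring is produced per generation, and a tolerance on the objective stands in for the calibration criteria. All names, bounds, and values are hypothetical.

```python
import random
from typing import Callable, List, Sequence, Tuple

Bounds = Sequence[Tuple[float, float]]


def roulette_pick(pop: List[List[float]], fitness: List[float], rng: random.Random) -> List[float]:
    # Minimization problem: lower objective value gets a larger wheel share (assumed weighting).
    weights = [1.0 / (1e-9 + f) for f in fitness]
    return rng.choices(pop, weights=weights, k=1)[0]


def calibrate(objective: Callable[[Sequence[float]], float], bounds: Bounds,
              pop_size: int = 20, max_gen: int = 200, tol: float = 1e-3,
              seed: int = 0) -> List[float]:
    rng = random.Random(seed)
    # Step 1: random initial population inside the bounds.
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    for _ in range(max_gen):
        pop.sort(key=objective)
        # Step 6: stop on convergence (tolerance is a stand-in for the calibration criteria).
        if objective(pop[0]) <= tol:
            break
        # Step 2: keep the best 60% and pick two parents by roulette wheel.
        elite = pop[: max(2, int(0.6 * pop_size))]
        fitness = [objective(x) for x in elite]
        p1 = roulette_pick(elite, fitness, rng)
        p2 = roulette_pick(elite, fitness, rng)
        # Step 3: uniform crossover, each gene from either parent with probability 0.5.
        child = [a if rng.random() < 0.5 else b for a, b in zip(p1, p2)]
        # Step 4: perturb about 30% of the genes by up to +/-1%, then clamp to bounds.
        for i, (lo, hi) in enumerate(bounds):
            if rng.random() < 0.3:
                child[i] *= 1.0 + rng.uniform(-0.01, 0.01)
            child[i] = min(hi, max(lo, child[i]))
        # Step 5: the offspring replaces the worst individual only if it is better.
        if objective(child) < objective(pop[-1]):
            pop[-1] = child
    return min(pop, key=objective)


# Toy stand-in for the simulation-based objective (hypothetical target values).
target = (1.8, 2.0)
bounds = [(1.0, 3.0), (0.5, 4.0)]


def toy_objective(theta: Sequence[float]) -> float:
    return sum((x - t) ** 2 for x, t in zip(theta, target))


best = calibrate(toy_objective, bounds)
```

Because only the worst individual is ever replaced, the best objective value in the population never gets worse from one generation to the next. In a real calibration, each objective evaluation would require a simulation run, so the number of evaluations per generation matters far more than it does in this toy setting.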

Experiments and Results
The proposed methodology and solution algorithm were tested using CORSIM models. CORSIM includes driver-behavior and vehicle performance parameters. Table 2 lists various calibration parameters in CORSIM [3]. Two CORSIM models were used in the experiments and are illustrated in Figure 2. Both models included arterial roads with signalized intersections. For signal-controlled intersections, one of the important parameters was the discharge headway of individual vehicles [2,28]. The Reno network (Figure 2(a)) represents the Pyramid Highway in Reno, Nevada, and consists of 126 arterial links. Calibration field data were available for 45 of these links. The local parameters included the mean queue discharge headway and the mean value of start-up lost time. The global parameters included lane change, acceptable gap in nearside cross-traffic for vehicles at a sign, additional time for farside cross-traffic in the acceptable gap for vehicles at a sign, and the driver's familiarity with path distributions. The McTrans model (Figure 2(b)), provided by McTrans, consisted of 20 arterial links. This is a well-known model of a synthetic network used only for demonstration and analysis purposes. The default parameters in the McTrans model were considered as calibrated conditions, and the outputs from this model were used as field data for the experiments. Model parameters were randomly modified to represent an uncalibrated model. The local parameters included mean queue discharge headway and mean start-up lost time. Global parameters included the driver's familiarity with path distributions, comprising the percentage of drivers who knew only one turn movement and the percentage of drivers who knew two turn movements.

Experimental Setup.
The proposed solution algorithm was implemented using Java, which is capable of handling complex data structures and mathematical functions. As noted by Paz et al. [3], the implementation used a basic layered architecture, with each layer handling a group of related functions. Volume and speed data were used for the calibration. The CORSIM models were run for a simulation time period of 15 min. The first set of experiments incorporated a selection of links in the network, and the second set of experiments incorporated the simultaneous selection of links and parameters.

First Set of Experiments: Selection of Links in the Network.
In the first set of experiments, links were selected for calibration randomly. All global and local parameters for the selected links were considered simultaneously for calibration.
When 70% of links were randomly selected, both models were calibrated successfully. Figure 3 shows how the objective function converged when 70% of the links were selected for calibration. The normalized root mean square (NRMS) showed improvement over the iterations of the calibration process. For the Reno network, the initial value of NRMS was 0.22; after 845 iterations, the NRMS decreased to 0.08. For the McTrans model, the initial value of NRMS was 0.29; after 370 iterations, the NRMS decreased to 0.06. Figure 4 compares vehicle counts before and after calibration. Before calibration, there was a significant difference between the actual and the simulated counts. After calibration, the gap between the actual and simulated counts was reduced, as illustrated by their alignment along the 45° line in Figure 4. Figure 5 shows the vehicle speeds before and after calibration. For the Reno network, the speed data are scattered away from the 45° line more than the volume data. This is a consequence of a higher weight assigned in the objective function to volume than to speed. Volume data correspond to vehicle counts, while speed is a spot mean measure that is not representative of the actual speed for the entire link. Figure 6 shows the GEH statistic for the models before and after calibration. For the Reno network, the initial GEH value was less than 5 for 46% of the selected links. After calibration, the GEH value was less than 5 for 93% of the selected links. For the McTrans model, the initial GEH value was less than 5 for 55% of the selected links. After calibration, the GEH value was less than 5 for 100% of the links. Table 3 outlines the percentage of selected links and the corresponding calibration results when all the parameters were selected simultaneously for calibration. Both models were calibrated successfully when at least 60% of the links were selected.
For illustration purposes, the Appendix provides the calibration parameters used in the first set of experiments including upper and lower bounds as well as values before and after calibration. All calibrated values are within the accepted range.

Table 1: Example of an individual (chromosome), with parameter values arranged by link.

Figure 1: Flowchart of the GA solution algorithm. Start; Step 1 (initialization): generate the initial population of θ vectors randomly; Step 2 (parents' selection): evaluate sets of θ, save the best 60%, and select parent θ vectors randomly; Step 3 (crossover): combine parent θ vectors at 50% to generate new offspring θ vectors; Step 5 (population management strategy): evaluate the new offspring θ and replace older θ; Step 6 (stopping criteria): convergence or maximum number of generations reached? (Yes: End; No: return to Step 2).

Second Set of Experiments: Simultaneous Selection of Links and Parameters.
In the second set of experiments, links and associated parameters were selected simultaneously. These experiments were conducted using different combinations of parameters.

First Combination.
The local parameters were selected for every link, and the global parameters were set to their default values. Table 4 shows the selected percentage of links and the corresponding results when only the local parameters were selected for calibration. The Reno network was calibrated successfully when at least 70% of links were selected for calibration. The McTrans model was successfully calibrated when at least 50% of links were selected for calibration.

Second Combination.
The mean queue discharge headway was selected as the only local parameter for calibration, and all the global parameters were considered. Table 5 provides the results when the mean queue discharge headway and all the global parameters were selected for calibration. Both models were calibrated successfully when at least 60% of the links were selected for calibration.

Third Combination.
Mean queue discharge headway and mean start-up lost time were selected in a mutually exclusive manner, that is, each link had only one of the two local parameters calibrated; meanwhile, all the global parameters were considered for calibration. Table 6 provides the results from using various percentages of the mean queue discharge headway and mean start-up lost time when all global parameters were considered for calibration of the CORSIM models. The Reno network was calibrated successfully when the mean queue discharge headway was selected for at least 80% of links and the mean start-up lost time was selected for the remaining 20% of the links. The McTrans model was calibrated successfully when the mean queue discharge headway was selected for at least 90% of links and the mean start-up lost time was selected for the remaining 10% of links.
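A mutually exclusive assignment of the two local parameters to links might be generated as in the Python sketch below. It assumes that the links receiving each parameter are drawn at random, which the text does not state for this combination; the function name, the link labels, and the parameter labels are hypothetical.

```python
import random
from typing import Dict, Sequence


def assign_local_parameters(links: Sequence[str], headway_share: float,
                            seed: int = 0) -> Dict[str, str]:
    """Give each link exactly one local parameter to calibrate."""
    rng = random.Random(seed)
    shuffled = list(links)
    rng.shuffle(shuffled)  # assumed random draw of links
    n_headway = round(headway_share * len(shuffled))
    headway_links = set(shuffled[:n_headway])
    # Links not chosen for headway calibrate start-up lost time instead.
    return {k: ("headway" if k in headway_links else "lost_time") for k in links}


# 20 links, as in the McTrans model; 90% headway and 10% start-up lost time.
links = [f"link{i}" for i in range(1, 21)]
assignment = assign_local_parameters(links, 0.9)
```

In terms of the notation, this sets the indicator for exactly one of the two local parameters to 1 on every selected link.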

Sensitivity Analysis.
Sensitivity analyses were conducted to observe the effects on NRMS of various percentages of links and several combinations of parameters selected for calibration. The results are illustrated in Figures 7 and 8. Figure 7 shows the effects on NRMS due to various percentages of links selected for calibration. In Figure 8,