Cross-Comparison and Calibration of Two Microscopic Traffic Simulation Models for Complex Freeway Corridors with Dedicated Lanes

Realistic microscopic traffic simulation is essential for prospective evaluation of the potential impacts of new traffic control strategies. Freeway corridors with interacting bottlenecks and dedicated lanes generate complex traffic flow phenomena and congestion patterns, which are difficult to reproduce with existing microscopic simulation models. This paper discusses two alternative driving behavior models that are capable of modeling freeways with multiple bottlenecks and dedicated lanes over an extended period with varying demand levels. The models have been calibrated using archived data from a complicated 13-mile long section of the northbound SR99 freeway near Sacramento, California, for an 8-hour time period in which the traffic fluctuated from free-flow to congested conditions. The corridor includes multiple bottlenecks, multiple entry and exit ramps, and an HOV lane. Calibration results show extremely good agreement between field data and model predictions. The models have been cross-validated and produced similar macroscopic traffic performance. The main behavior that should be captured for successful modeling of such a complex corridor includes the anticipative and cooperative driver behavior near merges, lane preference in presence of dedicated lanes, and variations in desired headway along the corridor.


Introduction
Microscopic traffic simulation is a viable and cost-effective approach for prospectively evaluating the potential impacts of traffic control strategies and shifts in demand patterns. To achieve this, the microscopic traffic simulation must realistically represent the microscopic level driving behavior [1] and generate realistic macroscopic performance.
The widely accepted commercial microsimulation packages such as Aimsun, VISSIM, and PARAMICS have their unique proprietary car following and lane changing models. Under these simulation frameworks, users can adjust and calibrate the values of behavioral parameters such as reaction time and maximum acceleration to best reproduce realistic driving behavior. Recent works in model calibration have suggested parameter values and proposed optimization techniques to calibrate the parameters [2][3][4][5][6][7][8][9][10][11][12][13][14]. While model calibration seeks the parameter values that best reproduce real-world conditions, the performance of simulation packages is largely constrained by the capability of the embedded core driver behavior models. Many simulation packages do not capture the adaptive, anticipative, and cooperative driver behavior at bottlenecks [15]. As a result, these works have not been very successful in accurately modeling the 2 Journal of Advanced Transportation onset of congestion, capacity drop, queue propagation, and queue dissipation, when interacting bottlenecks are present in freeway corridors, as demonstrated by a simulation study on a 13-mile corridor in Sacramento, California [16,17].
In addition, dedicated lanes generate more complex traffic dynamics on freeway corridors. Friction effect has been observed on freeways with HOV lanes, characterized by the speed reduction on the HOV lane when the adjacent general purpose lane is congested [18][19][20]. Modeling the differences in lane usage and lane change between HOV and single occupancy vehicles is important to reproduce the traffic conditions around HOV lanes. Although different lane change models have been proposed to capture this, large scale simulation and validation of the lane change model have rarely been reported.
The aforementioned problems motivate the development of alternative driving behavior models for car following, merging, lane changing, etc. to better represent the complex traffic dynamics of large scale freeway corridors with dedicated lanes. Such models will allow users to implement their external behavioral models using software development kits to simulate driving behavior with commercial simulation platforms such as Aimsun and VISSIM, or to develop complete open simulation tools. In addition, it is nearly impossible to develop a model that perfectly reproduces real-world observations. Cross-comparison between models has been shown to be beneficial [21], since it makes the analysis more reliable and the results more defensible, giving users more confidence in the simulation results. Unfortunately, there have not been any well established and cross-validated alternative car following and lane changing models that are applicable for complex freeway corridors.
In response, this paper introduces two such models and demonstrates the results of their validation against real-world data from a 13-mile freeway corridor during an 8-hour period. Although they originated from different base models and were developed separately by two teams, the two proposed alternatives focus on refined description of merging behavior, lane preference, and cooperative car following behavior in the vicinity of merge and diverge sections. They are validated against empirical loop detector data and against each other. The two models show extremely good agreement between field data and model predictions. Comparison of the two models and validation results gives consistent insights into the traffic flow models for complex corridors that can be generalized to other simulation tools. Furthermore, both models performed better than the proprietary driver behavior model. The rest of the paper is organized as follows: the next section presents an overview of the two proposed alternative driving behavior models. The following section documents the calibration and cross validation of both models using archived data from a complex freeway corridor. The final section summarizes and highlights the contribution of this paper.

Proposed Driver Behavior Models
Microscopic vehicle behavior and interaction with the nearby vehicles determine overall traffic behavior at the macroscopic level based upon the following factors: maximum acceleration/deceleration and driver behaviors such as preferred headway and response time, gap acceptance threshold for lane changing, and perception advance time period or distance for lane changing. Those parameters directly affect density and delays in the simulation, and thus the overall traffic pattern. Below, we discuss the two alternative models and highlight the main features.
Alternative 1: Driving Behavior Model Based on NGSIM Model. The first proposed driving behavior model (PATH model) is built upon the basic framework of the NGSIM oversaturated flow model proposed by Yeo et al. [16]. Some important extensions and modifications were made in order to depict detailed car following and lane changing behaviors that were not represented in the original model.
To determine the trajectory of a vehicle at a microscopic level, it is necessary and sufficient to iteratively determine its location at each time step, which can be realized through a discrete kinematic model if the desired acceleration and current speed are known [22]. The latter is known from the last step calculation. The former is determined by the dynamic interactions with the adjacent vehicles, geometric constraints, and the overall traffic conditions. The dynamic interactions include time/clearance gaps for safety and mobility, and possible scenarios associated with lane changes [22]. Those scenarios are further partitioned into fundamental scenarios (or movement phases) and transitions between them for continuous/smooth speed trajectories. The discretized kinematic model is detailed in [22]. As discussed in [22], Newell's simplified car following model with constraints for safety and acceleration [23] is applied. The Gipps deceleration component [1,24] is used here to place a safety margin on Newell's simplified equation. Further details such as permissible speed and cross-lane friction on multilane freeways are given in [22].
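As a rough illustration of the update logic described above, the following minimal sketch advances one follower by one time step using a Newell-type target speed, an acceleration bound, and a Gipps-style safe speed. It is not the implementation described in [22]; the function names, parameter names, and default values are all assumptions chosen only for illustration.

```python
import math


def next_speed(v, gap, v_lead, dt=0.1, tau=1.0, jam_gap=2.0,
               v_free=30.0, a_max=2.0, b_max=4.0, b_lead=4.0):
    """Speed of the follower after one time step (SI units, hypothetical defaults).

    v       current speed of the follower (m/s)
    gap     net distance to the leader (m)
    v_lead  current speed of the leader (m/s)
    """
    usable_gap = gap - jam_gap
    # Newell-type target: keep a time gap tau on top of the jam gap.
    v_newell = max(0.0, usable_gap / tau)
    # Bounded acceleration.
    v_acc = v + a_max * dt
    # Gipps-style safe speed: the follower must be able to stop behind the
    # leader if the leader brakes at b_lead and the follower reacts after tau.
    disc = (b_max * tau) ** 2 + b_max * (
        2.0 * usable_gap - v * tau + v_lead ** 2 / b_lead)
    v_safe = -b_max * tau + math.sqrt(max(disc, 0.0))
    v_new = min(v_free, v_newell, v_acc, v_safe)
    # Bounded deceleration and no reversing.
    return max(v_new, v - b_max * dt, 0.0)


def advance_position(x, v, v_new, dt=0.1):
    """Discrete kinematic update using the average speed over the step."""
    return x + 0.5 * (v + v_new) * dt
```

With a very large gap the acceleration bound is the binding constraint, while at the jam gap with both vehicles stopped the follower stays at rest.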
The fundamental scenarios associated with lane changing (LC) are outlined in Figure 1 and discussed in detail by Lu et al. [22].
As detailed in [22], lane changes are either mandatory or discretionary; discretionary lane changes are motivated by increasing the vehicle's speed or accessing the high occupancy vehicle (HOV) lane. Detailed mathematical expressions of the gap acceptance models of both types of lane change can be found in Lu et al. [22]. Once a decision for lane change has been made, the subject vehicle will adopt BCF mode prior to changing lanes. This involves accelerating or decelerating in order to align with the gaps available in the adjacent lanes. In addition, the subject vehicle applies YCF mode to cooperate with a leading vehicle in an adjacent lane that intends to change into the subject vehicle's current lane. The subject vehicle also adopts RCF mode (reduced headway, jam gap, and reaction time) after the lane change maneuver is complete. Similarly, the leading vehicle adopts ACF mode after the lane change maneuver, which is similar to RCF and involves reduced headway, jam gap, and reaction time. Details of these car following criteria are described in Lu et al. [22].
Alternative 2: Driving Behavior Model Based on LMRS and IDM+. Alternative 2 is an extension of the Lane Change Model with Relaxation and Synchronization (LMRS) [15]. The LMRS is formulated based on lane change desires and provides a flexible structure to incorporate additional desires/incentives due to changes in infrastructure or traffic regulation.
Lane change decisions are made based on comparing lane utility formulated by a combination of desires/incentives. The overall lane change desire is calculated from three incentives, namely following the route, gaining speed, and keeping right:
$$ d^{ij} = d_r^{ij} + \theta_v^{ij}\left(d_s^{ij} + d_b^{ij}\right) \qquad (1) $$
where $d^{ij}$ is the overall lane change desire from lane $i$ to lane $j$.
$d_r^{ij}$, $d_s^{ij}$, and $d_b^{ij}$ represent the incentives for the route, speed, and a bias to the right lane, respectively; the last of these can be set to zero for US traffic. The route incentive is based on parameters $x_0$ and $t_0$ that scope the time-space region before the merge/diverge. $\theta_v^{ij}$ is a weighting factor that reflects the relative importance of the discretionary incentives and is a function of $|d_r^{ij}|$; it reduces the voluntary incentives as the mandatory incentive becomes more urgent.
Meaningful lane change desires range from -1 to 1, where negative values suggest that a lane change is not desired. Based on the total lane change desire, different types of lane change behavior can be distinguished by three thresholds, $d_{\text{free}}$, $d_{\text{sync}}$, and $d_{\text{coop}}$, with $0 < d_{\text{free}} < d_{\text{sync}} < d_{\text{coop}} < 1$, as shown in Figure 2.
When modeling HOV lane operations, the corresponding incentive for the HOV lane is set to a positive constant for HOVs and to negative infinity for single occupant vehicles when the HOV lane is active. A similar approach is used to model the right lane bias of truck traffic. In addition, a lane change bias correlated to the desired speed is added to reproduce the fact that drivers with higher desired speeds tend to travel on the left lanes of freeways and vice versa [27].
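The desire-based decision logic can be sketched in a few lines. The sketch below combines a route incentive with speed and bias incentives through a weighting factor and then classifies the result against the three thresholds. The piecewise-linear form of the weighting factor, the threshold values, and all function names are assumptions made for illustration and should be checked against [15] before use.

```python
def voluntary_weight(d_r, d_v, d_sync=0.577, d_coop=0.788):
    """Weight on the voluntary incentives (assumed piecewise-linear form)."""
    # Voluntary and mandatory desires agree, or the route desire is weak.
    if d_r * d_v >= 0 or abs(d_r) <= d_sync:
        return 1.0
    # The route desire is urgent and conflicts with the voluntary desire.
    if abs(d_r) >= d_coop:
        return 0.0
    # Linear transition between the two regimes.
    return (d_coop - abs(d_r)) / (d_coop - d_sync)


def lane_change_desire(d_r, d_s, d_b):
    """Total desire: route incentive plus weighted speed and bias incentives."""
    d_v = d_s + d_b
    return d_r + voluntary_weight(d_r, d_v) * d_v


def classify(d, d_free=0.365, d_sync=0.577, d_coop=0.788):
    """Map a total desire to a lane change regime (illustrative thresholds)."""
    if d < d_free:
        return "none"
    if d < d_sync:
        return "free"
    if d < d_coop:
        return "synchronized"
    return "cooperative"
```

For example, `classify(lane_change_desire(0.0, 0.3, 0.1))` falls in the free lane change regime, while an urgent route desire of 0.9 is classified as cooperative regardless of a conflicting speed incentive.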

Model Calibration and Cross Validation
The driving behavior model for manually driven vehicles was calibrated for a relatively complex freeway corridor in Sacramento, California.
Selected Site. State Route (SR) 99 northbound was selected for model calibration. This section of freeway spans from the Elk Grove Blvd. interchange to the US-50 freeway interchange south of downtown Sacramento, CA. As indicated by the arrows in Figure 3(a), there are 9 interchanges with local arterial streets: 4 partial cloverleaf interchanges, 3 full cloverleaf interchanges, and 2 diamond interchanges. Detailed lane configurations are shown in Figure 3(b). The on-ramp merging and weaving sections located at the Sheldon Rd. interchange and the Florin Rd. interchange, as well as the off-ramp at the US-50 freeway interchange, contribute to the morning peak recurrent delay observed in this corridor. This peak period typically begins at 6:30 AM and ends around 10:00 AM, reflecting the high demand for suburb-to-downtown trips during the morning hours. As also shown in Figure 3(a), there is wide coverage of detectors throughout the corridor. Detectors with good data quality are shown in blue; those with less acceptable data quality, shown in red, were not used to collect field data for calibration and validation. Currently, the on-ramps are metered using the local traffic responsive demand-capacity approach in order to control the flow of on-ramp traffic and mitigate the peak hour congestion. Lastly, the traffic demand consists almost exclusively of passenger cars, with trucks accounting for only 2% of the overall demand.
Calibration Criteria and Procedures. Microscopic simulation models were built in the AIMSUN [28] and MOTUS [26] platforms using the most up-to-date road geometry, lane configurations, and speed limits for the selected site. Since adopting new driver behavior models in any simulation package requires significant effort to ensure error-free simulation, the AIMSUN and MOTUS simulation packages were selected based on relative ease of implementation. Freeway mainline and on-ramp data obtained from an 8-hour period (4:00 AM to 12:00 PM) on October 6, 2015 were used as the inputs for demand and turning percentages. This 8-hour period encompasses periods prior to, during, and after the peak. For this corridor, 5-minute interval loop detector data for flow were obtained from PeMS [29] and used as the demand input at the most upstream location of the simulation network and the entry points of the on-ramps, and as the turning percentages at any applicable mainline-off-ramp split. Each truck was represented in the simulation by a passenger car equivalent (PCE) of two (HCM, 2010). Ramp metering rates and algorithms were obtained from Caltrans District 3 and modeled in the AIMSUN simulation via the AIMSUN API (Application Programming Interface) [28]. However, ramp metering operation was not explicitly modeled in MOTUS. As an alternative, flow measured immediately downstream of the ramp meter was used as the on-ramp demand input.
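As a small worked example of the truck treatment above, the sketch below converts a mixed flow into passenger car equivalents using a PCE of two per truck. The function name and the 1000 veh/hour input are hypothetical; the 2% truck share is the value reported for this corridor.

```python
def demand_in_pce(total_flow_vph, truck_share, truck_pce=2.0):
    """Convert a mixed flow (veh/hour) into passenger car equivalents per hour."""
    trucks = total_flow_vph * truck_share
    cars = total_flow_vph - trucks
    # Each truck counts as truck_pce passenger cars.
    return cars + truck_pce * trucks


# Hypothetical detector flow of 1000 veh/hour with 2% trucks.
equivalent_flow = demand_in_pce(1000.0, 0.02)
```

Under these assumptions, 1000 veh/hour with 2% trucks corresponds to about 1020 passenger car equivalents per hour, which shows how small the truck adjustment is for this site.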
The first proposed driving behavior model (PATH model) was simulated in AIMSUN via the MicroSDK (Micro-Software Development Kit). The second model was simulated in MOTUS [15,27].
Ten replications of the PATH model and five replications of the MOTUS model with different random number seeds were run in order to calibrate the models to the conditions of October 6, 2015. The predicted flows and speeds at selected locations on the freeway were compared with real traffic measurements at every 5-minute interval to assess the accuracy of the simulation models in representing the observed conditions. The predicted flow is acceptable if, on average over all detectors, the condition $GEH(k) < 5$ is satisfied for at least 85% of all 5-minute time intervals [30].
The GEH statistic is computed as
$$ GEH(k) = \sqrt{\frac{2\,\left[M(k) - C(k)\right]^2}{M(k) + C(k)}} \qquad (2) $$
where $M(k)$ is the simulated flow during the k-th time interval (veh/hour) and $C(k)$ is the flow measured in the field during the k-th time interval (veh/hour).
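The GEH criterion can be checked with a short script. The sketch below computes the standard GEH statistic for hourly flows and the share of intervals with GEH below 5; the function names and the sample flows are illustrative assumptions, not data from this study.

```python
import math


def geh(simulated, observed):
    """Standard GEH statistic for a pair of hourly flows (veh/hour)."""
    total = simulated + observed
    if total == 0:
        # Both flows are zero, so there is no discrepancy to report.
        return 0.0
    return math.sqrt(2.0 * (simulated - observed) ** 2 / total)


def share_within_geh(simulated_flows, observed_flows, limit=5.0):
    """Fraction of time intervals whose GEH is below the limit."""
    pairs = list(zip(simulated_flows, observed_flows))
    passed = sum(1 for m, c in pairs if geh(m, c) < limit)
    return passed / len(pairs)


# Hypothetical flows for three 5-minute intervals at one detector.
share = share_within_geh([1100.0, 1500.0, 900.0], [1000.0, 1000.0, 900.0])
```

A detector would pass the acceptance rule when `share_within_geh(...)` is at least 0.85; in this toy input only two of the three intervals satisfy GEH < 5.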
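In addition, we required that the Mean Absolute Percentage Error (MAPE) of flows, defined by equation (3), must be less than 10%:
$$ MAPE = \frac{1}{NT}\sum_{i=1}^{N}\sum_{t=1}^{T}\frac{\left|q^{\text{obs}}_{i,t} - q^{\text{sim}}_{i,t}\right|}{q^{\text{obs}}_{i,t}} \qquad (3) $$
where $N$ is the number of detectors and $T$ is the number of time intervals.
$q^{\text{obs}}_{i,t}$ and $q^{\text{sim}}_{i,t}$ are the field observed and simulated data (i.e., flow) of detector $i$ obtained during time interval $t$, respectively.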
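Furthermore, we required that the Root Mean Square Error (RMSE) of flows, defined by equation (4), must be less than 15% [13]:
$$ RMSE = \sqrt{\frac{1}{NT}\sum_{i=1}^{N}\sum_{t=1}^{T}\left(\frac{q^{\text{obs}}_{i,t} - q^{\text{sim}}_{i,t}}{q^{\text{obs}}_{i,t}}\right)^2} \qquad (4) $$
where $N$ is the number of detectors and $T$ is the number of time intervals.
$q^{\text{obs}}_{i,t}$ and $q^{\text{sim}}_{i,t}$ are the field observed and simulated data (i.e., flow) of detector $i$ obtained during time interval $t$, respectively.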
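The two aggregate error measures can be computed as follows. This is a minimal sketch that assumes the RMSE is normalized by the observed flow, so that it can be compared against a percentage threshold; that normalization, the function names, and the sample data are assumptions.

```python
import math


def relative_errors(observed, simulated):
    """Relative error for every detector (row) and time interval (column)."""
    # Observed flows must be nonzero, otherwise the ratio is undefined.
    return [(o - s) / o
            for obs_row, sim_row in zip(observed, simulated)
            for o, s in zip(obs_row, sim_row)]


def mape(observed, simulated):
    """Mean absolute percentage error, returned as a fraction."""
    errs = relative_errors(observed, simulated)
    return sum(abs(e) for e in errs) / len(errs)


def rmse_relative(observed, simulated):
    """Root mean square of the relative errors, returned as a fraction."""
    errs = relative_errors(observed, simulated)
    return math.sqrt(sum(e * e for e in errs) / len(errs))


# Hypothetical flows (veh/hour): two detectors, two time intervals each.
obs = [[1000.0, 2000.0], [1500.0, 500.0]]
sim = [[1100.0, 1900.0], [1500.0, 550.0]]
passes = mape(obs, sim) < 0.10 and rmse_relative(obs, sim) < 0.15
```

For this toy input the MAPE is 6.25% and the relative RMSE is 7.5%, so both thresholds stated above would be met.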
Lastly, the simulated flow-density relationship and queue propagation must be visually acceptable [30]. This indicates that the fundamental diagrams of field observed and simulated flow versus density should resemble similar patterns for key bottlenecks along the corridor, and the contour plots of the field observed and simulated speeds at all 5-minute intervals should exhibit similar trends over the length of the corridor as well as the duration of the study period.
Results from the calibrated PATH and MOTUS models were later compared with results from the calibrated proprietary driver behavior model found in AIMSUN. The study by Wu et al. (2014) calibrated the identical SR99 corridor using the proprietary driver behavior model found in AIMSUN; that study used the same calibration criteria but was limited to calibrating the flow.
For the PATH model, all combinations of parameter values were attempted, and for each set of parameters, 10 replications were simulated to determine whether that combination yields the macroscopic traffic pattern most similar to the one observed in the field. The parameter values that were realistic and provided the best fit (based on the calibration criteria) were chosen. More sophisticated parameter search algorithms were avoided in order to ensure reasonable simulation and computation time for this complex corridor. As shown in Table 1, simulation experiments suggest that the following parameters (normally distributed) provide a good fit.
However, poor visibility near on-ramp merging areas in the upstream 2-mile section of the corridor required increasing the reaction time to 1.0 s to better reproduce the field observations. Furthermore, frequent aggressive and last-minute lane changes observed in the field required reducing the reaction time to 0.4 s for the short weaving section between the upstream 12th Ave. on-ramp and the downstream US-50 off-ramp in order to replicate the high capacity and relatively uncongested off-ramp bottleneck near the US-50 freeway interchange. More than half of the SR-99 traffic makes lane changes to access the US-50 off-ramp during the morning peak.
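The exhaustive search over parameter combinations with repeated replications can be sketched as below. The simulator is replaced by a toy stand-in, `toy_replication`, whose error is smallest at arbitrary placeholder values; the function names, the parameter grid, and the error definition are all hypothetical and do not reproduce the AIMSUN setup used in this study.

```python
import itertools
import random
import statistics


def grid_search(param_grid, run_replication, n_replications=10):
    """Try every parameter combination and average the error over seeds."""
    names = list(param_grid)
    best_params, best_score = None, None
    for values in itertools.product(*(param_grid[n] for n in names)):
        params = dict(zip(names, values))
        # One replication per random seed, as in a stochastic simulation.
        errors = [run_replication(params, seed) for seed in range(n_replications)]
        score = statistics.mean(errors)
        if best_score is None or score < best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_replication(params, seed):
    """Stand-in for one simulation run; returns a made-up error value."""
    rng = random.Random(seed)
    noise = rng.uniform(0.0, 0.01)
    # Placeholder optimum at reaction_time = 0.8 s and max_accel = 2.0 m/s^2.
    return (abs(params["reaction_time"] - 0.8)
            + abs(params["max_accel"] - 2.0) + noise)


grid = {"reaction_time": [0.4, 0.8, 1.0], "max_accel": [1.5, 2.0, 2.5]}
best_params, best_score = grid_search(grid, toy_replication)
```

The cost of this approach grows with the product of the grid sizes times the number of replications, which is why the grid has to stay coarse for a corridor of this size.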
MOTUS Model Calibration Procedures. In the MOTUS platform, the calibrated parameters involve both the lane change model LMRS and the car following model IDM+ [15,27].
A systematic search algorithm developed by Schakel et al. [15] was used. This algorithm iteratively searches for the optimal parameter set that minimizes the 5-min flow and speed errors between the simulated data and field data. The error definition also accounts for the number of deleted vehicles in the simulation. The algorithm first calibrates the parameters related to free-flow traffic conditions and then the parameters corresponding to oversaturated/congested conditions. It first searches a wide range of possible parameter values before converging to a smaller range. For each iteration, five replications with different random seeds were conducted. Lastly, the simulated results obtained using the optimal parameter values must satisfy the calibration criteria.
As shown in Table 2, the calibrated parameter values differ from the default values. A smaller speed gain in our results implies that simulated US drivers are more sensitive to the speed change in the target lane, and a low value of $d_{\text{free}}$ suggests that drivers are more likely to change lanes compared to the Dutch traffic represented by the original values. Changes of $T_{\text{min}}$ and $T_{\text{max}}$ indicate a less centralized distribution of vehicle headway in the simulated corridor, and an increased anticipation parameter for the route incentive indicates drivers' early preparation prior to exiting at an off-ramp.
Furthermore, a local headway ratio was applied at different segments of the corridor. The local headway ratio was used to increase or reduce vehicle headways at certain locations in order to reproduce characteristics that are unique to certain sections of the corridor. For the segment upstream of the Calvine Rd. interchange, the local headway ratio ranges from 1.28 to 1.35, and it increases to 1.45-1.58 at the four-lane segments between the Calvine Rd. and the Fruitridge Rd. interchanges. For the remaining downstream section (up to US-50), a smaller value of 0.6 was used for the local headway ratio to maintain the high flow observed in the field. The local headway ratios were fine-tuned individually by comparing the section-based fundamental diagrams.
The MOTUS model met the GEH and RMSE criteria at all the detectors along the whole corridor. Although the overall MAPE was less than 10%, the MAPEs at 4 locations (out of 16 detector stations) were slightly higher than 10%. These correspond to the Elk Grove bottleneck, EB Sheldon bottleneck, and the EB Mack bottleneck, where the simulated flow was lower than the flow observed in the field.
In addition, Figure 4 shows a scatter plot of all simulated and field observed flows obtained throughout the corridor and the entire analysis period. Both the PATH model and the MOTUS model simulated flows that strongly correlate with the field observed flows; the MOTUS model performed slightly better, with an $R^2$ value of 0.9266 compared with 0.8939 for the PATH model.
Nevertheless, both the PATH model and the MOTUS model performed better than the proprietary driver behavior model in AIMSUN. With the proprietary model, the portion of the corridor upstream of the Calvine Rd. on-ramps (postmile 290.7) was well calibrated; both the GEH criterion and RMSE criterion used in this study were satisfied. However, the portion of the corridor downstream of the Calvine Rd. on-ramps (accounting for the majority of the corridor length) could not be well calibrated; the best results yielded less than satisfactory RMSE values of 15% or higher and failed the criterion of GEH < 5 for at least 85% of the time steps.
Flow Characteristics Comparison. Figure 5 summarizes the simulated versus observed speed contour plots. The contour plots show the 5-minute average speeds at the detectors throughout the selected peak period. The first and last hour (low demand and free-flowing) were omitted from the figures. Both models reproduced the field observed peak duration and the length of queue fairly accurately, with the exception of the most upstream bottleneck at Sheldon Rd., for which the PATH model simulated a slightly shorter congestion duration and slightly less queue propagation. Additionally, the PATH model simulated faster queue dissipation, as evident in the shorter peak duration shown in Figure 5(b). This could be attributed to the higher acceleration and deceleration rates applied in the PATH model. The lane changing and gap acceptance criteria required larger acceleration and deceleration to prevent simulating low bottleneck flows and severe queue propagation. Such an approach ensured accurate representation of the relatively aggressive driver behavior at the beginning of the peak. However, the same criteria could not replicate the slower queue dissipation and longer peak period, due to the temporal variation of driver behavior from the beginning to the end of the peak period. This compromised the accuracy of the peak duration but can be adjusted by applying different parameter values and lane changing and gap acceptance criteria for different time periods of the day.
The PATH model was able to replicate the speed profiles of the most downstream bottleneck, which is a complex weaving section with more than 50% off-ramp traffic. In MOTUS, a special local headway was applied here to meet the traffic throughputs, but unfortunately the model could not simulate the slight speed reduction at this bottleneck.
Figure 6 shows the field observed and simulated flow-density relationships of a four-lane mainline section at the most important bottleneck, the Florin Rd. on-ramps at mile 294, obtained from two sample replications. In both replications, the simulated data provided a near-perfect match in the uncongested state, as well as a good representation of the congested state. Both models simulated the free-flow speed of 67 miles/hour. However, the PATH model simulated a slightly lower maximum capacity than observed. This could possibly be explained by the different methods of generating discretionary lane changes; the PATH model generates more discretionary lane changes in free-flow conditions, which ultimately affects the bottleneck discharge rate. Additionally, the MOTUS model has a right lane bias and prioritizes the freeway mainline, which leads to less efficient merging and more queuing on the on-ramps. This delayed queue dissipation at the freeway bottleneck and resulted in more data points corresponding to traffic conditions with higher densities, as illustrated by the red data points in Figure 6.

Discussion
The simulation results show that the two models are capable of simulating traffic flow characteristics of the complex corridor with varying demand, interacting bottlenecks, and an HOV lane. Although they originated from different base models and were developed separately by two teams, the two model approaches focus on refined description of merging/diverging behavior, lane preference, and cooperative car following behavior in the vicinity of merge and diverge sections.
Comparison of the two models and validation results gives consistent insights into the traffic flow models for complex corridors that can be generalized to other simulation tools.
The anticipative behavior of the merging vehicles has been captured by many models, where the merger aligns its speed with that of the potential leader in the target lane on the mainline. However, this may not suffice to replicate merging behavior in congested traffic conditions, where the merging vehicle needs to enter the mainline at close spacing. This requires cooperative (car following) behavior of the vehicles in the mainline to yield a gap for the merging vehicle, and adaptive gap acceptance behavior in which the merger accepts short gaps to change lanes and gradually relaxes to comfortable gaps [15,22].
In weaving bottlenecks, the diverging behavior also plays an important role in the resulting flow features. Existing models often result in short-sighted driver behavior near off-ramps, which deteriorates traffic operations at weaving bottlenecks and leads to more severe congestion than empirical observations. The two models capture the anticipative behavior of drivers by adding an anticipation time and anticipation distance before the off-ramp, implying the early preparation of the exiting maneuvers [15,22]. This appears to be an essential feature in weaving bottlenecks according to our experience.
Last but not least, both models adopt local behavior parameters (local desired headway in MOTUS and local reaction time in the PATH model) for different bottlenecks to reflect the capacity differences due to road geometrics, lane markings, etc. Note that for most car following models, those parameters affect the resulting capacity and traffic flow stability. They thus influence both the onset and dissipation of congestion.
Overall, both the PATH model and the MOTUS model outperformed the proprietary driver behavior model in AIMSUN. The calibration study by Wu et al. (2014) showed that the proprietary AIMSUN model can only accurately reproduce the observed flows in less than half of the identical corridor.
Despite the successful calibration of the complex freeway, both models present some limitations. As shown in Figure 5(b), the PATH model has difficulties in reproducing the duration of queue dissipation with the current parameter values, which are suitable for modeling queue formation and propagation. This implies that the model cannot depict the different behavior patterns drivers adopt at the beginning and at the end of a congestion period. To address this problem, we can describe key behavior parameters of a subject driver (e.g., reaction time, maximum acceleration and deceleration) as a function of the time spent in the congested state. The driver would act aggressively at the beginning of the congestion, but become sluggish after spending some time in congestion. This method should improve the capability of the PATH model in reproducing the duration of congestion.
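One possible way to express such a time-dependent parameter is sketched below. The exponential form, the function name, and all numerical values are hypothetical and have not been calibrated; the sketch only shows how a reaction time could grow from an aggressive value toward a sluggish value as the time spent in congestion increases.

```python
import math


def reaction_time(time_in_congestion_s, rt_fresh=0.6, rt_fatigued=1.2,
                  time_constant_s=600.0):
    """Reaction time (s) as a function of time spent in congestion (s).

    All default values are placeholders, not calibrated parameters.
    """
    elapsed = max(time_in_congestion_s, 0.0)
    # The gap to the sluggish value decays exponentially with elapsed time.
    decay = math.exp(-elapsed / time_constant_s)
    return rt_fatigued - (rt_fatigued - rt_fresh) * decay
```

The same pattern could be applied to maximum acceleration and deceleration, with values decreasing rather than increasing over time.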
MOTUS was originally developed based on Dutch traffic, where a keep-right directive and mainline traffic priority apply. The absence of a keep-right directive and mainline priority in the US results in more efficient merging traffic observed in the field data than in the MOTUS simulation. Although effort was devoted to mitigating the problem, we still found vehicles queuing on the on-ramp, which leads to the late dissipation of the queue at the Florin Rd. bottleneck compared to field observations, as seen in Figure 5(c).
Another point of attention is the most downstream part of the network, where nearly half of the traffic performed lane changes to access the off-ramp at the US-50 interchange. MOTUS produced more congested traffic than field observations unless a special treatment of a small local headway was predefined. This problem might come from short-sighted synchronizing vehicles that do not anticipate acceptable gaps near the downstream section. An alternative is to set local lane change parameters or to disable synchronization at low speeds.
Overall, although it is worth noting the differences between the simulated data (obtained from both models) and the field data, the goal of this study is not to precisely calibrate the model for this specific data set, but to cross-validate and recommend two generalizable driving behavior models that can reasonably reproduce the onset of congestion, capacity drop, queue propagation, and queue dissipation at a complex corridor with multiple interacting bottlenecks and managed lanes.

Concluding Remarks and Future Work
This paper presented two driving behavior model alternatives to the driving behavior models in widely accepted microscopic simulation packages such as AIMSUN, VISSIM, and PARAMICS. These models are intended to reproduce the traffic dynamics of large scale and complex freeway corridors for longer durations, which can be difficult when relying only on the default models in those packages.
A case study has been conducted using real-world data from an 8-hour period at a complex freeway corridor near Sacramento, California, where the two proposed driver behavior models accurately replicated the locations and throughput of the freeway bottlenecks, as well as the spatial and temporal distributions of speeds. The models have been cross-validated and performed well with similar accuracy. Most importantly, both models outperformed the proprietary driver behavior model commercially available in AIMSUN.
Comparison of the two modeling approaches shows that both models capture the anticipative diverging behavior, lane preference in presence of dedicated lane, cooperative behavior of mainline vehicles to facilitate merging at short spacing at merge and diverge sections, and adaptive desired time headway settings at different road sections along the corridor. The consistent insights into the traffic flow models for complex corridors can be generalized to other simulation tools.
In the future, the validated models will be improved and enhanced to simulate and assess the potential benefit of connected and automated vehicles for real-world freeways.

Data Availability
The data used to support the findings of this study can be found in the Caltrans Performance Measurement System (PeMS) at http://pems.dot.ca.gov/.