Simulation Study of Rear-End Crash Evaluation considering Driver Experience Heterogeneity in the Framework of Three-Phase Traffic Theory

Driving safety is considered to have a strong relationship with traffic flow characteristics. However, very few studies have addressed the safety impacts in the three-phase traffic theory that has been demonstrated to be an advancement in explaining the empirical features of traffic flow. Another important issue affecting safety is driver experience heterogeneity, especially in developing countries experiencing a dramatic growth in the number of novice drivers. Thus, the primary objective of the current study is to develop a microsimulation environment for evaluating safety performance considering the presence of novice drivers in the framework of three-phase theory. First, a car-following model is developed by incorporating human physiological factors into the classical Intelligent Driver Model (IDM). Moreover, a surrogate safety measure based on the integration concept is modified to evaluate rear-end crashes in terms of probability and severity simultaneously. Based on a vehicle-mounted experiment, the field data of car-following behavior are collected by dividing the subjects into a novice group and an experienced group. These data are used to calibrate the proposed car-following model to explain driver experience heterogeneity. The results indicate that our simulation environment is capable of reproducing the three-phase theory, and the changes in the modified surrogate safety measure are highly correlated with traffic phases. We also discover that the presence of novice drivers leads to different safety performance outcomes across various traffic phases. The effect of driver experience heterogeneity is found to increase the probability of the rear-end crashes as well as the corresponding severity. The results of this study are expected to provide a scientific understanding of the mechanisms of crash occurrences and to provide application suggestions for improving traffic safety performance.


Introduction
The understanding of traffic flow characteristics contributes to the investigation of various applications of transportation engineering [1][2][3]. It has been demonstrated that driving safety has a strong relationship with traffic states [4,5] and can be identified by the significant variance of dynamic features in terms of aggregated indicators, such as occupancy, speed, and speed difference [6][7][8]. Previous studies generally evaluated safety performance in the framework of the fundamental diagram, which divides traffic into free flow and congestion [9][10][11]. However, this classical method has drawbacks in capturing empirical findings, especially for complicated congestion patterns [12]. Recently, some safety analyses were carried out based on finer classifications of traffic states. For example, Xu [13] evaluated the safety performance under different levels of service (LOS). Furthermore, a crash likelihood indicator was proposed by dividing traffic into five states in accordance with measured occupancies [14]. As expected, these refined considerations of traffic flow states have a positive impact on crash evaluations.
Note that the aforementioned works do not have a universal and concrete theoretical framework to explain spatiotemporal traffic flow features, which may lead to problems in the transferability of the results throughout the world. In recent decades, Kerner proposed the three-phase traffic theory based on systematic empirical investigations [15][16][17] and theoretical studies [18,19]. According to Kerner's concept, a macroscopic traffic flow can be classified into three phases, including free flow (F), synchronized flow (S), and wide moving jam (J) phases. The typical phase transition F ⟶ S indicates the occurrence of a traffic breakdown, and wide moving jams emerge only within the synchronized flow. Both the synchronized flow and wide moving jams correspond to traffic congestion operating with significantly lower speeds. This theory has been tested and reproduced by a notable number of studies, ranging from empirical investigations [20] to theoretical models [21], and provides guidance on traffic management [22]. Combined with a proper crash evaluation method, it has the potential to explore safety performance that is associated with complicated spatiotemporal patterns. For example, a driver's overdeceleration under a congested phase usually implies a dangerous situation with a higher probability of rear-end crash occurrence. Recently, Xu developed an approach to predict crash likelihoods by introducing the three-phase theory [23]. The results suggested that this theory helps to better understand the occurrence of crashes. Kerner's work on traffic flow is expected to have a more widespread application in safety analysis, and it could undergo rapid development by expanding the study with modern methodologies.
The simulation-based approach was originally developed for traffic efficiency studies, and it is now regarded as a powerful tool for safety evaluation. The simulation-based approach is able to proactively provide deep insight into the details of the interactions among vehicles and has the power to collect and store the data in real time. These advantages bring about ongoing progress in the field of simulation-based safety evaluation. A large number of studies [24,25] were carried out by integrating simulations with surrogate safety measures, which present the sequence of events with the causative factors of crashes. However, most of the previous simulations focused on a homogeneous driver experience environment, which assumes that all drivers traveling on roads have similar driving experience with respect to safety. It is worth noting that novice drivers do not have adequate driving skills and have problems in recognizing traffic environments [26]. Their significant presence in traffic may have a noticeable impact on safety, especially in developing countries, as a rapidly growing number of people obtain their own vehicles and licenses. Based on this consideration, research is needed to incorporate driver experience heterogeneity into simulation-based safety evaluations.
Rear-end crashes have been found to be a major collision type, as frequent decelerations and accelerations occur, especially in congested traffic conditions [27]. Therefore, the primary objective of the current study is to develop a microsimulation environment for rear-end crash evaluation in the framework of three-phase traffic theory. The study focuses on car-following scenarios and takes into account the impacts of driver experience heterogeneity. The remainder of the paper proceeds as follows. In the next section, previous studies related to safety analysis across traffic flow states, simulation-based safety evaluation, and driver experience heterogeneity are reviewed. In Section 3, a microsimulation environment consisting of a car-following model and a modified surrogate safety measure is developed. In Section 4, a vehicle-mounted experiment is carried out to provide data for model calibration. Section 5 reports the simulation results with a detailed discussion. Finally, the conclusions of the study are discussed in Section 6.

Safety and Traffic Flow Characteristics.
Traffic characteristics have been found to have a significant correlation with crash occurrences. The core concept of previous studies was to evaluate the probability of crash occurrences by relating traffic flow factors to safety performance measures. Based on this concept, a notable number of statistical models have been developed. For example, matched case-control logistic regression is a widely used statistical method to identify factors associated with crashes [28]. Moreover, some studies have suggested log-linear models and Bayesian statistics to evaluate the likelihood of crashes in response to various traffic flow parameters [29,30]. In recent decades, with vigorous progress in the field of data science, artificial intelligence (AI) methods have been proposed to improve the accuracy of crash risk analysis. AI methods mainly include neural networks [8], random forests [31], support vector machines [32], and deep learning [33]. In comparison, these models aimed to improve prediction accuracy, while the statistical models offered better interpretability. The models developed in previous studies generally focused on identifying the relationship between crashes and traffic flow parameters. Note that a certain value of a parameter can correspond to various traffic flow states in which drivers have significantly different psychological characteristics and driving behaviors. Such factors have been demonstrated to have great impacts on crash occurrence [14]. Based on this consideration, some studies have investigated the crash mechanism according to the fundamental diagram method, which divides traffic states into free flow and congested flow [9][10][11]. However, this traditional methodology ignores the fact that different dynamic features can exist within the same traffic flow states. Some recent studies have carried out safety evaluations by considering the finer classification of traffic states [13,14,23,34,35].
However, most of the classifications do not have a generally accepted theoretical framework. Recently, the results of several studies showed that the three-phase theory has positive impacts on fitting safety performance to empirical traffic states. For example, Xu et al. developed a crash prediction model by using statistical methods, which link safety to the phases and the phase transitions categorized by three-phase theory [23].
Hu et al. evaluated the safety performance when bikes and buses are present on a motorway based on a cellular automata model that can reproduce the empirical findings of the three-phase theory [34,35]. Drawing from the conclusions of these studies, it can be expected that such a predominant theory has great potential to explore safety performance in a manner considering more details with respect to driving safety.

Safety Evaluation with Microscopic Simulations.
In recent decades, traffic simulation technology has achieved significant advancements in behavior modeling, data processing, and system optimization. Compared with statistical models and AI models, the simulation-based method for safety research has the ability to scan and repeat the process of a crash occurrence and provide scientifically based suggestions regarding how to reduce the likelihood of a crash. To date, commercial software packages have been proposed to evaluate various types of crashes. One of the most frequently used packages was VISSIM. Other packages, such as PARAMICS, AIMSUN, and CORSIM, also have notable applications [36]. By a systematic comparison, a review study concluded that none of the current commercial packages have noticeable superiority to the others in terms of safety evaluations [37]. Moreover, these tools have several weaknesses due to specific issues. First, complicated software operations cannot be started up quickly due to the numerous input settings. Second, there are insufficient options for modeling crash-related factors, as the simulation packages are not specially designed for traffic safety.
Third, much coding time is needed for integration with surrogate safety measures and for processing output data.
Given that commercial simulation packages are still not targeted for safety evaluations, some recent studies began to develop original simulation frameworks regarding the specific factors of crash occurrences. Some of these simulation models were concerned with the "less-than-perfect" driving strategy associated with human drivers [38,39]. More specifically, factors regarding unsafe driving behavior, such as the inattention interval, variable reaction times, and perception limitations, were incorporated into the behavioral modeling of the simulations. On the other hand, some simulation models have addressed reproducing various crash-prone situations and have provided detailed insight into the mechanism of crash occurrences [34,35,40]. For example, Hou et al. [40] proposed a holistic framework for safety evaluations based on cellular automata. The simulation model took into account several safety-related factors, including lane configurations and road surfaces in the work zone, adverse weather, and the effect of speed limits. The results suggested that the proposed framework was particularly suitable for a comprehensive evaluation of safety performance under specific driving situations.

Driver Experience Heterogeneity and Safety.
Another key issue with respect to traffic safety is driver experience heterogeneity, which is especially prevalent in developing countries due to the notable growth in the number of novice drivers [41]. There has been increased interest in the study of driver experience with respect to safety. It was found that the major causes of crashes for novice drivers included inadequate speed control, overreaction delay, and inappropriate attention allocation, all of which could be attributed to the lack of driving experience [42]. Some previous studies carried out experiments to record crash-relevant driving parameter data and compared the difference in safety performance between novice drivers and experienced drivers with statistical methods [43][44][45]. However, these results were drawn by relating crash occurrences to several driving parameters without considering their differences across various compositions of driver experience. In other words, the studies ignored the fact that drivers behave differently when interacting with various driver populations, and this difference could have unequal impacts under different traffic flow states.

Summary of the Review.
According to our review of previous works, very few studies have focused on applying the simulation-based method to evaluate safety performance in the three-phase traffic theory, and even fewer have also focused on the impacts of driver experience. To this end, it is necessary to develop a systematic simulation framework for evaluating crashes considering driver experience heterogeneity under various traffic flow phases and that is able to benefit safety improvements in terms of both theoretical research and practical applications.

Car-Following Model.
A notable number of studies have developed car-following models to explain empirical findings in the three-phase theory. The modeling methodology generally includes spatially continuous models and cellular automata models. However, these models either did not consider human driving characteristics or relied on an overly rigid modeling framework to avoid crash risk. For example, a classic car-following model, the Kerner-Klenov model [18], has complicated update rules of vehicle motion that reproduce the theory from the viewpoint of traffic physics rather than human factors. Most cellular automata models [46,47] avoid crashes by incorporating rigid safety conditions into the models, which deviates from real driving behavior, especially for safety evaluation. Therefore, this study proposes a new model by introducing typical human factors that make it possible to explore the relationship between rear-end crashes and driver behavior. Moreover, the parameters in the model are expected to have significant meaning in explaining driver experience heterogeneity, and they can be estimated from our experimental data.
Previous studies have demonstrated that when an "indifference zone" is incorporated into a modeling framework for car-following behavior, the resulting model is capable of reproducing empirical findings in the three-phase theory [48]. More specifically, this zone reflects a human driver's psychological characteristics, suggesting that in car-following scenarios, a driver tends to maintain spacing or time headway within a satisfactory range instead of achieving an optimal value. The nonoptimal driving strategy has the potential to capture near-accident scenarios, as it reproduces unsafe driving associated with misjudgment and overreaction delay, especially in complicated traffic conditions. Based on this consideration, we propose an improved car-following model by incorporating humans' unconscious reaction feature into the classical Intelligent Driver Model (IDM), which can lead to the occurrence of the "indifference zone." The proposed unconscious reaction originates from Wiedemann's action point paradigm [49]. Wiedemann suggested that car-following has four reaction regimes identified by action point thresholds. As shown in Figure 1, AX is the desired distance between two successive stationary vehicles, which consists of the length of the leading vehicle and the front-to-rear distance. ABX defines the threshold of the minimum desired spacing with respect to traveling speed. This indicates that the driver realizes that the following distance is too small and consequently reacts with deceleration. SDX is the threshold of the maximum accepted spacing, which is 1.25-3.0 times ABX. This indicates that a driver perceives themself to be too far away from the leading vehicle and hence decides to accelerate. CLDV or SDV is the threshold for recognizing the speed difference when approaching the leading vehicle, whereas OPDV is the threshold for recognizing the speed difference during an opening process. These thresholds indicate the action points where drivers react to changes in driving conditions and make the corresponding changes in acceleration.
As seen in Figure 1, within the unconscious regime enclosed by ABX, SDX, CLDV, and OPDV, drivers tend to unconsciously maintain a speed difference and spacing within a satisfactory range rather than maintaining the optimum value. Car-following spirals can be identified due to these nonoptimal expectations [39], resulting in the speed difference and spacing oscillating around the "optimal" state represented by a desired spacing and null speed difference. In this study, we propose an acceleration model to reproduce the car-following spirals based on the dynamic features of vehicle motion. Figure 2 shows a typical car-following spiral drawn from our empirical data. A complete car-following spiral consists of two subprocesses: a deceleration subprocess (A-B-C) and an acceleration subprocess (C-D-A). During subprocess A-B-C, for example, follower driver n starts to decelerate at point A with random deceleration a_n as the driver perceives the slower leader n-1. Then, the distance between them gradually decreases and reaches an identified ABX value at point B associated with a null speed difference. However, the follower continues to decelerate until the speed difference reaches an OPDV value at point C. According to the basic principle of dynamics, the acceleration rates can be determined as follows.
For the subprocess A-B, where Δv_n = v_{n−1} − v_n is the speed difference between vehicle n and its leading vehicle and d_n is the net distance (gap) between the two vehicles. s_n is the distance traveled by vehicle n during subprocess A-B, and A_{n−1} is the initial acceleration rate of the leader at A. ABX is the identified action point at point A. A previous study found that the acceleration rate within the unconscious regime did not exceed the range of −0.6 m/s² to 0.6 m/s² and varied according to a certain probability [50,51]. Then, the acceleration rate is modified as follows: where NRND is a normally distributed random parameter [51]. The acceleration rates during other subprocesses can be determined as above. They are presented in (5)-(7). For the subprocess B-C, For the subprocess C-D, For the subprocess D-A, where OPDV, SDX, and CLDV are the identified action points at points B, C, and D. The other variables have the same meaning as in (1)-(4). When a vehicle passes any of the action point thresholds enclosing the unconscious regime, drivers consciously make changes in acceleration in response to variations in the driving environment. In this study, the IDM is used to capture the dynamic features within conscious reaction regimes. The model is calculated by the following equations: where a is the maximum acceleration, τ is the driver's reaction time, and S_des is the desired safety gap, which is written as follows: where S_0 is the jam distance when the vehicle stops, T is the desired safety time gap, and b is the desired deceleration.
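As a rough illustration of the conscious-regime dynamics, the sketch below implements the widely used textbook form of the IDM with the sign convention Δv_n = v_{n−1} − v_n stated above. It is not the authors' exact formulation: how the reaction time τ enters their equations is not recoverable from the text, so τ is omitted here. The exponent `delta = 4`, the clamp on the desired gap, and the helper `bound_unconscious` (which only applies the ±0.6 m/s² bound reported for the unconscious regime, not the NRND-based expression) are assumptions made for illustration.

```python
import math

UNCONSCIOUS_LIMIT = 0.6  # m/s^2, bound reported in the text for the unconscious regime


def idm_acceleration(v, gap, dv, v0, a_max, b, s0, T, delta=4.0):
    """Textbook IDM acceleration of a follower (reaction time not modeled).

    v     : follower speed (m/s)
    gap   : net distance d_n to the leader (m), must be > 0
    dv    : v_leader - v_follower (m/s), sign convention of the text
    v0    : desired speed (m/s)
    a_max : maximum acceleration a (m/s^2)
    b     : desired deceleration (m/s^2)
    s0    : jam distance S_0 (m)
    T     : desired safety time gap (s)
    """
    # Desired gap S_des; the minus sign follows from dv = v_leader - v_follower.
    dynamic_part = v * T - v * dv / (2.0 * math.sqrt(a_max * b))
    s_des = s0 + max(0.0, dynamic_part)  # common safeguard, an assumption here
    return a_max * (1.0 - (v / v0) ** delta - (s_des / gap) ** 2)


def bound_unconscious(acc, limit=UNCONSCIOUS_LIMIT):
    """Clamp an acceleration to the range observed inside the unconscious regime."""
    return max(-limit, min(limit, acc))
```

With a standing vehicle and an effectively empty road, the function returns approximately `a_max`. When the follower closes quickly on a nearby leader, the squared gap term dominates and the result is strongly negative.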

Surrogate Safety Measure.
Surrogate safety measures can be divided into two categories: time-related indicators and energy-related indicators. For example, the time to collision (TTC) has been considered a major surrogate safety measure in crash evaluations. Previous studies have demonstrated that the TTC is able to indicate the probability of a crash but cannot describe the severity of a crash, as speed is not involved in the measure [52]. In comparison, energy-related indicators take into account the kinetics (usually represented by speed or speed difference) to describe the severity of a crash. The estimation of such indicators usually requires accurate trajectories of the conflicting vehicles. However, trajectory data that account for driver experience heterogeneity across various traffic phases are lacking. The simulation-based approach has advantages for studies using surrogate safety measures, as it is much less dependent on actual crash data, and high-resolution intervehicle interactions can be reproduced and collected. Therefore, this study developed a modified surrogate safety measure to evaluate rear-end crashes by integrating it with the proposed simulation model. More specifically, the measure comprehensively considers the probability of a rear-end crash and the associated severity based on the deceleration rate to avoid crash (DRAC). The DRAC is calculated as follows: where DRAC_n(t) and TTC_n(t) are the DRAC and the time to collision (TTC) of a following vehicle n at time t. v_n(t) and v_{n−1}(t) correspond to the speeds of vehicle n and leader n-1, respectively. x_n(t) and x_{n−1}(t) denote the positions of vehicle n and leader n-1, respectively. It is worth mentioning that, as shown in (11), the denominator (TTC) implies the crash likelihood, while the numerator (speed difference) indicates the severity of the rear-end crash.
According to previous studies, the smaller the TTC value, the higher the probability of a collision occurrence. Moreover, the greater the speed difference, the greater the energy generated by a vehicle collision, and the higher the severity of the crash. Thus, it is encouraging that DRAC has the ability to simultaneously evaluate traffic safety performance in terms of probability and severity.
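The description above (speed difference in the numerator, TTC in the denominator) can be sketched as follows. This is a minimal illustration rather than the paper's equation (11): the parameter `lead_length`, which subtracts the leader's length from the position difference, is a hypothetical addition, and returning 0 when the follower is not closing in and infinity when the gap has already vanished are conventions assumed here.

```python
def ttc(x_lead, x_follow, v_lead, v_follow, lead_length=0.0):
    """Time to collision (s); infinite when the follower is not closing in."""
    closing_speed = v_follow - v_lead
    if closing_speed <= 0.0:
        return float("inf")
    gap = x_lead - x_follow - lead_length  # lead_length is a hypothetical parameter
    return gap / closing_speed


def drac(x_lead, x_follow, v_lead, v_follow, lead_length=0.0):
    """Deceleration rate to avoid crash (m/s^2): closing speed divided by TTC."""
    closing_speed = v_follow - v_lead
    if closing_speed <= 0.0:
        return 0.0  # no deceleration needed when the leader is not slower
    t = ttc(x_lead, x_follow, v_lead, v_follow, lead_length)
    if t <= 0.0:
        return float("inf")  # vehicles already overlap
    return closing_speed / t
```

For instance, a follower at 20 m/s that is 50 m behind a leader at 10 m/s has a TTC of 5 s and a DRAC of 2 m/s², which would exceed a critical threshold of 1.5 m/s².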

Journal of Advanced Transportation 5
A representative previous study [53] suggested that the integration of a surrogate safety measure over space and time is able to fully utilize the advantages of the simulation-based method. Driven by this consideration, we propose an integrated DRAC (IDRAC) through a generalization of the concept of time integrated time to collision (TIT) [53]. This surrogate safety measure focuses on general performance, which is calculated by the sum of the DRAC values that are higher than the critical safety threshold DRAC* within the range of the investigated road section over time. The indicator can be determined as follows: where N is the sample size of drivers and T is the time duration of observation or simulation. Furthermore, considering the fact that simulation time advances in discrete steps, the time-discrete version of IDRAC is given by the following equation: To make the simulation results comparable with different sample sizes and time durations, IDRAC needs to be standardized to obtain the average IDRAC per vehicle per unit time: The combination of IDRAC and traffic simulation models holds promise for safety evaluation, as a DRAC value is standardized over a specific time horizon T for a certain roadway segment on which N vehicles run. Moreover, the probability of crashes as well as the associated severity can be evaluated simultaneously by the modified indicator. Note that the critical safety threshold DRAC* is related to factors such as driver characteristics, vehicle performance, traffic flow conditions, and road conditions. DRAC* usually ranges from 1 m/s² to 3.5 m/s². Because the proposed car-following model does not consider some factors that may cause traffic accidents, such as driver distractions, operation errors, and illegal driving, a smaller value of 1.5 m/s² for the critical safety threshold is selected in the current study as a criterion for judging whether a traffic conflict incident is occurring.
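A time-discrete aggregation of this kind might look like the sketch below. Because the defining equations are not legible in the text, two readings are offered through the hypothetical flag `use_excess`: by analogy with TIT, the default accumulates the excess DRAC − DRAC* over the threshold, while `use_excess=False` sums the exceeding DRAC values themselves, as the prose literally states. The function names and the input layout (one list of DRAC samples per vehicle, all of equal length) are assumptions.

```python
def idrac(drac_series, drac_crit=1.5, dt=0.1, use_excess=True):
    """Time-discrete integrated DRAC over all vehicles and time steps.

    drac_series : list of per-vehicle lists of DRAC samples (m/s^2)
    drac_crit   : critical threshold DRAC* (1.5 m/s^2 in the study)
    dt          : simulation step (s)
    use_excess  : hypothetical switch between the two readings described above
    """
    total = 0.0
    for vehicle_samples in drac_series:
        for value in vehicle_samples:
            if value > drac_crit:  # only unsafe samples contribute
                contribution = value - drac_crit if use_excess else value
                total += contribution * dt
    return total


def idrac_per_vehicle_time(drac_series, drac_crit=1.5, dt=0.1, use_excess=True):
    """Standardized IDRAC: average per vehicle per unit time."""
    if not drac_series or not drac_series[0]:
        raise ValueError("need at least one vehicle with one sample")
    n_vehicles = len(drac_series)
    duration = len(drac_series[0]) * dt  # assumes equal-length series
    return idrac(drac_series, drac_crit, dt, use_excess) / (n_vehicles * duration)
```

Standardizing by N and T is what makes runs with different vehicle counts or durations comparable, as the paragraph above notes.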

Data Collection Environment.
In this study, car-following data were measured on an eight-lane highway section over a length of 3.8 km in the city of Nanjing, China. The highway plays an important role in serving major traffic demand in the eastern area of Nanjing. More specifically, an average of more than 35,000 vehicles traveled through the experimental section every day in 2018. The speed limit is set at 80 km/h. Furthermore, the experimental section has the following characteristics with respect to geometric design, traffic states, and environmental conditions: (1) the traffic in both directions is separated by a physical central median, (2) the road pavement of the section is in good condition, (3) the traffic is mainly composed of passenger vehicles, and (4) the intensity of land-use development on either side of the section is limited.

Device System.
To capture drivers' car-following characteristics in a natural traffic environment, a dedicated data collection system was developed that uses an instrumented vehicle mounted with a GPS device, a laser rangefinder, and a microcomputer. Figure 3 provides an overview of the data collection system and the instrumented vehicle. More specifically, GPS is used to track the instrumented vehicle's latitude and longitude coordinates, which can be converted to trajectory data. The laser rangefinder measures the distance from the rear of the leading vehicle to the front of the instrumented vehicle. The laser rangefinder covers a range of approximately 100 m. The microcomputer synchronizes the GPS data and spacing data and stores them in a format that can be read by the dedicated software. For the data validity test, the experimental scenario is recorded with a digital camera.
To ensure that the laser rangefinder's emission beam can reach the rear plate of the leading vehicle, it is mounted on the platform behind the windshield. The GPS device is fixed inside the vehicle. The microcomputer lies on the passenger seat and always maintains wired connections to the laser rangefinder and the GPS device. The digital camera is placed on the backrest of the rear seat to monitor the driver's behavior and the forward roadway. The instrumented vehicle (Figure 3(b)) equipped with the data collection system is a Hyundai Sonata, which has an engine displacement of 2.0 liters. This vehicle type has a dynamic performance similar to that of most vehicles traveling on the experimental section.

Participants.
A total of 41 participants, including 20 experienced drivers and 21 novice drivers, were recruited in this study. They mainly differed in, first, driving experience (years) and, second, cumulative mileage (km). Drivers with less than 2 years of driving experience and no more than 12,000 km of cumulative mileage were categorized as novice drivers; all others were categorized as experienced drivers. The results of the t-test indicated that there was no significant difference in gender between the two driver population groups. Male participants accounted for 71.1% and 69.1% of the drivers in the two groups. The average ages of the experienced and novice participants were 38.7 years old and 28.5 years old, respectively.

Procedure.
The car-following data used in this study were collected under good weather conditions to avoid the impact of weather factors on the driving environment. The experiment spanned typical time periods, including morning rush hours, off-rush hours, and evening rush hours on various weekdays. During an experiment tour, the participants drove the instrumented vehicle in a natural driving environment without any interference from our research team. More specifically, each participant completed 2 tours, which covered rush and off-rush periods. Each tour lasted 1 hour, consisting of 15 minutes for the participant to get familiar with the instrumented vehicle and 45 minutes of normal driving for data collection. Both the laser rangefinder and the GPS worked at a frequency of 1 Hz.
It should be noted that the measured following distance was sometimes missing, for example, when the leading vehicle steered or when random factors interrupted the emission beam. Locally weighted regression (LWR) is a statistical learning method that has been used for filling in missing data [54]. In this study, LWR is used to yield the leading vehicles' trajectory profiles. The following distance is then calculated as the distance between the trajectory of the instrumented vehicle and that of the leading vehicle.
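To make the gap-filling step concrete, the sketch below fills missing samples in a 1 Hz series with a locally weighted linear fit. It is only one possible form of LWR: the Gaussian kernel, the `bandwidth` value, and the function names are assumptions, and the paper does not state which kernel or bandwidth was used.

```python
import math


def lwr_predict(t_query, times, values, bandwidth=3.0):
    """Locally weighted linear regression estimate at t_query.

    A weighted least-squares line is fitted around t_query, with Gaussian
    weights (an assumed kernel) that decay with distance in time.
    """
    sw = swx = swy = swxx = swxy = 0.0
    for t, y in zip(times, values):
        x = t - t_query  # center on the query point for numerical stability
        w = math.exp(-0.5 * (x / bandwidth) ** 2)
        sw += w
        swx += w * x
        swy += w * y
        swxx += w * x * x
        swxy += w * x * y
    if sw == 0.0:
        raise ValueError("no observations within reach of the kernel")
    denom = sw * swxx - swx * swx
    if abs(denom) < 1e-12:
        return swy / sw  # degenerate case: fall back to the weighted mean
    slope = (sw * swxy - swx * swy) / denom
    # With centered x, the fitted intercept is the prediction at t_query.
    return (swy - slope * swx) / sw


def fill_missing(times, values, bandwidth=3.0):
    """Replace None entries in values with LWR estimates from the known samples."""
    known_t = [t for t, y in zip(times, values) if y is not None]
    known_y = [y for y in values if y is not None]
    return [
        y if y is not None else lwr_predict(t, known_t, known_y, bandwidth)
        for t, y in zip(times, values)
    ]
```

A larger bandwidth gives smoother trajectories but can blur short braking events, so in practice it would need tuning against the video record.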

Model Calibration.
Empirical car-following spirals are taken from the collected data, and candidate action points are identified according to Brackstone's concept [55]. Furthermore, the perceptual thresholds including ABX, SDX, CLDV, and OPDV are fitted to the identified 95% values of the candidate action points by relating them to the speed or spacing. In this study, a linear relationship was used, which has been demonstrated by previous works to show good consistency with the data [56]. More specifically, the fitted thresholds for experienced drivers and novice drivers are shown as follows. For the experienced drivers, ln(OPDV) = −1.029 + 0.043 × v, ln(CLDV) = −0.725 + 0.055 × v, For the novice drivers, where v is the travel speed. The R-square values for all of the fitted functions above are over 0.8, indicating good performance for determining the relationship between action points and driving parameters. It is worth noting that the fitted results imply that the "indifference zone" for experienced drivers has a smaller area, which may lead to improvements in safety performance, especially in the traffic oscillation under congested phases. The detailed discussion about the safety impacts of driver experience heterogeneity will be presented in Section 5.4.
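The two fitted relations that remain legible for experienced drivers can be evaluated directly, as in the short sketch below. The function names are hypothetical, and the unit of the travel speed `v` is not stated in the text, so the caller has to use the unit of the original calibration. The ABX and SDX fits and all novice-driver fits are not reproduced because they are not recoverable here.

```python
import math


def opdv_experienced(v):
    """OPDV threshold for experienced drivers from ln(OPDV) = -1.029 + 0.043 v."""
    return math.exp(-1.029 + 0.043 * v)


def cldv_experienced(v):
    """CLDV threshold for experienced drivers from ln(CLDV) = -0.725 + 0.055 v."""
    return math.exp(-0.725 + 0.055 * v)
```

Both thresholds grow exponentially with speed, and with these coefficients the CLDV threshold exceeds the OPDV threshold at every non-negative speed.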
By means of the genetic algorithm, the best parameter settings of the proposed car-following model for the collected data are found, as shown in Table 1. As seen, experienced drivers prefer a larger desired speed and maximum acceleration and keep a smaller desired time gap and stop distance. In general, they tend to drive in a relatively aggressive manner compared with novice drivers. This may originate from their skill in perceiving and reacting to variations in the traffic environment. Theoretically, these differences in driving parameters indicate that driver experience heterogeneity could have significant impacts on traffic characteristics and safety performance.
In the next sections, periodic boundary conditions are selected for numerical simulations. The length of the road section is 2000 m, the vehicle length is 5 m, and the simulation step is 0.1 s. The default initial traffic consists of all novice drivers. A simulation first runs with a warm-up duration of 3600 s to eliminate the transient effects, and then the relevant traffic data are collected after the warm-up period.
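Under periodic boundary conditions the road behaves like a ring, so the gap of the last vehicle wraps around to the first one. The sketch below shows this bookkeeping with the stated road length, vehicle length, and time step. The function names and the ordering convention (vehicle `i` follows vehicle `i + 1`, positions measured at the front bumper) are assumptions made for illustration.

```python
ROAD_LENGTH = 2000.0    # m, from the simulation setup
VEHICLE_LENGTH = 5.0    # m
TIME_STEP = 0.1         # s
WARMUP_STEPS = round(3600 / TIME_STEP)  # steps discarded before data collection


def homogeneous_positions(n_vehicles, road_length=ROAD_LENGTH):
    """Evenly spaced front-bumper positions on the ring."""
    return [i * road_length / n_vehicles for i in range(n_vehicles)]


def ring_gaps(positions, road_length=ROAD_LENGTH, vehicle_length=VEHICLE_LENGTH):
    """Net gap of each vehicle to its leader on a ring road.

    Vehicle i is assumed to follow vehicle (i + 1) % n; the modulo wraps the
    last vehicle's gap around the end of the road section.
    """
    n = len(positions)
    if n == 1:
        return [road_length - vehicle_length]  # a lone vehicle follows itself
    gaps = []
    for i in range(n):
        leader = positions[(i + 1) % n]
        headway = (leader - positions[i]) % road_length
        gaps.append(headway - vehicle_length)
    return gaps
```

With 4 evenly spaced vehicles on the 2000 m ring, every net gap is 495 m, including that of the vehicle whose leader sits just past the wrap-around point.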

Basic Traffic Flow Characteristics.
The following two initial conditions are implemented to explore the basic traffic characteristics of the proposed model: (1) vehicles are homogeneously distributed on the test road and (2) vehicles are distributed in a mega-jam form. Moreover, the simulated traffic consisted of 100% novice drivers, as their safety performance constitutes a major concern of this study. The fundamental diagram is presented in Figure 4, where the data points are 5 min aggregations of the flow rate data collected from the virtual detectors.
Three distinct traffic phases can be identified, which is consistent with the three-phase traffic theory. When the occupancy rate is low, the traffic operates in the free flow phase, in which drivers drive steadily at their own desired speed. The flow rate increases approximately linearly with the density. When the density lies in the range k_1 < k < k_c, the free flow is in a metastable state, indicating that the phase transition F ⟶ S may occur with some probability once the internal disturbance of the traffic exceeds a certain extent. The critical occupancy k_c corresponds to the maximum flow rate that the free flow can reach.
When the density lies in the range k_c < k < k_2, synchronized flow emerges. Due to the decrease in the distance between vehicles, vehicle motion is restricted by the associated leading vehicle, and the traveling speed is significantly lower than the drivers' desired speed. The flow rate decreases with increasing occupancy. In this phase, car-following spirals can be identified within the corresponding "indifference zone." Such unconscious driving stabilizes traffic because drivers do not overreact to small speed differences between successive vehicles. Then, local fluctuations are not amplified, and no traffic jams occur within the traffic flow. However, if the vehicles are initially distributed in a megajam form, the jam does not dissipate over time.
When the density further exceeds k_2, jams spontaneously occur within the synchronized flow. Within this density range, fluctuations caused by a vehicle can trigger overdeceleration by the followers, which causes the amplitude of the fluctuation to intensify while propagating upstream. The internal density of a jam is extremely high, and the speed decreases nearly to zero. The downstream boundary of a jam propagates upstream with a roughly constant speed, and free flow and synchronized flow coexist between jams.
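The density ranges described above can be summarized in a small lookup. The Python sketch below is only an illustration: the function name is hypothetical, the thresholds k1, kc, and k2 must be supplied by the user because their numerical values are not stated here, and the treatment of values exactly on a boundary is an arbitrary choice.

```python
def classify_phase(k, k1, kc, k2):
    """Map a density (or occupancy) k to the regime described in the text.

    k1, kc, k2 are the thresholds named in the text; their values are not
    given here and must be read from the fundamental diagram.
    """
    if not (k1 < kc < k2):
        raise ValueError("thresholds must satisfy k1 < kc < k2")
    if k < k1:
        return "free flow"
    if k < kc:
        return "metastable free flow"
    if k < k2:
        return "synchronized flow"
    return "wide moving jams"
```

Note that the lookup describes the outcome for homogeneous initial conditions; as stated above, a megajam initial condition can persist in the synchronized-flow range.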

Evaluating Safety Performance in the Three-Phase Theory.
The simulation environment developed in the current study is used to evaluate the safety performance with respect to each traffic phase in the framework of three-phase theory. The occupancy is taken as the measure to indicate the traffic phase, and it varies between 0.0 and 0.6, which covers the entire phase spectrum. Fifteen independent simulations are carried out, and the corresponding variation trends of IDRAC are shown in Figure 5.
As seen in the simulation results, when the occupancy of the traffic flow stays in the low range (< 0.16), the IDRAC value is close to 0, suggesting that the crash probability and severity are very small. As the occupancy continues to increase, the IDRAC increases sharply and reaches a peak near the critical occupancy (the occupancy at which capacity is reached). This can be explained by the fact that small spacing combined with relatively high speed can lead to crashes with high uncertainty. When the occupancy further increases, IDRAC significantly decreases and fluctuates within a certain range, indicating an improvement in the safety performance. As the occupancy increases beyond a certain threshold (approximately 0.36), IDRAC increases again and then gradually decreases. Figure 6 is a scatter plot of the average IDRAC values associated with different occupancies. According to the three-phase traffic flow theory, we propose the following statements:
(i) When the average occupancy is low, the traffic state corresponds to the free flow phase, in which vehicles can run at their own desired speed and the distances between pairs of successive vehicles are large. The relaxed driving environment leads to small IDRAC values. The result indicates that the probability of a rear-end crash as well as the potential severity is low.
(ii) As the occupancy gradually increases to the vicinity of the critical occupancy, the traffic flow is in the metastable state, in which the transition from free flow to synchronized flow occurs with some probability. Aggressive behaviors such as abrupt acceleration and overdeceleration cause fluctuations, which can be intensified when propagating along a vehicle platoon. Compared with the free flow phase, the probability of a rear-end crash as well as the potential severity increases sharply.
(iii) Once the traffic has evolved into synchronized flow, the probability of a rear-end crash as well as the potential severity are substantially reduced.
(iv) As the formation of wide moving jams is usually accompanied by stop-and-go traffic, the speed difference between pairs of successive vehicles sharply increases, which contributes to an increase in IDRAC, suggesting a higher crash probability and severity.
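To make the link between speed difference, spacing, and the safety measure concrete, the following Python sketch computes the deceleration rate to avoid a crash (DRAC) in its constant-deceleration kinematic form and accumulates it over time. This is a simplified stand-in, not the IDRAC defined in this study: the function names are hypothetical, some DRAC definitions omit the factor 2, and only the time integration is shown.

```python
def drac(v_follower, v_leader, gap):
    """Deceleration needed to match the leader's speed within the gap.

    Kinematic form (dv ** 2) / (2 * gap); zero when the follower is not
    closing in. The gap must be positive (no overlap between vehicles).
    """
    if gap <= 0.0:
        raise ValueError("gap must be positive")
    dv = v_follower - v_leader
    if dv <= 0.0:
        return 0.0
    return dv * dv / (2.0 * gap)


def time_integrated_drac(drac_series, dt):
    """Rectangle-rule integral of one vehicle's DRAC over time.

    Assumed simplification of an integration-based measure; the measure
    used in the study also involves spacing.
    """
    return sum(drac_series) * dt
```

For example, a follower at 20 m/s approaching a leader at 10 m/s with a 25 m gap needs 2 m/s² of deceleration, whereas any opening pair contributes zero. This is why large speed differences in stop-and-go traffic raise an integrated measure of this kind.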

Evaluating the Impacts of Driver Experience Heterogeneity.
The proposed car-following model is calibrated to fit the measured data associated with novice drivers and experienced drivers. Thus, our simulation is capable of capturing driver experience heterogeneity for rear-end crash evaluation. The results in the above sections are obtained by assuming that traffic consists of 100% novice drivers. However, in real situations, both novice and experienced drivers account for significant proportions of traffic. In this section, the scenario of 100% novice drivers is set as the base condition, and different proportions of experienced drivers are considered to study the impacts of driver experience heterogeneity on traffic safety performance. More specifically, the simulations are performed by increasing the proportion of experienced drivers from 0% to 100% with a step size of 10%, and the same range of traffic occupancies between 0.0 and 0.6 is applied. The results are shown in Figure 7.
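The scenario sweep described above can be sketched in Python as follows. The function name and the random placement of experienced drivers along the platoon are assumptions for illustration; the text does not specify how the two driver types are ordered on the road.

```python
import random

# Proportions of experienced drivers: 0%, 10%, ..., 100% (from the text).
SHARES = [i / 10 for i in range(11)]


def assign_driver_types(n_vehicles, share_experienced, seed=0):
    """Return a shuffled list of driver types with the requested share.

    Random placement is an assumption; a fixed seed keeps runs repeatable.
    """
    n_exp = round(n_vehicles * share_experienced)
    types = ["experienced"] * n_exp + ["novice"] * (n_vehicles - n_exp)
    random.Random(seed).shuffle(types)
    return types
```

Each entry of the returned list would then select the corresponding calibrated parameter set (novice or experienced) for that vehicle, and the sweep repeats the simulation for every share in `SHARES` and every occupancy level.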
With the increase in the proportion of experienced drivers in the traffic flow, the values of IDRAC in the free flow state do not change significantly. In the metastable state, the difference in IDRAC values from the base condition gradually increases until the experienced drivers account for 60% of the drivers on the road, and then it decreases. When the occupancy reaches the levels of the congested phases (i.e., the synchronized flow and wide moving jam phases), this difference shows a further downward trend. According to these simulated results, the following can be concluded.
(i) Due to the large spacing between successive vehicles in free flow, there is ample driving space, and crash risk is not sensitive to changes in the composition of driver experience.
(ii) In the metastable free flow, traffic breakdown can be triggered by a small disturbance such as overdeceleration. The ever-increasing traffic occupancy leads to a significant shrinkage of the spacing between vehicles, whereas the traveling speed is still close to the free flow speed. According to the calibration results of the simulation model parameters, experienced drivers tend to drive in a more aggressive style in terms of smaller spacings and faster speeds, consequently resulting in rear-end crashes with higher probability and severity. The result is also consistent with some previous studies [57,58], which argued that aggressive driving had negative impacts on driving safety. It is worth mentioning that the maximum value of IDRAC occurs when the percentage of experienced drivers reaches 60% (see Table 2). This implies that the higher the heterogeneity of the traffic composition, the worse the safety performance.
(iii) In the congested phases, the traveling speed drops dramatically, indicating that a driver's sensitivity to changes in the surrounding environment plays a major role in safety performance. This sensitivity is represented by the area of the "indifference zone" enclosed by the perceptual thresholds, as a follower in the zone is not sensitive to the speed difference from the leader. Then, the larger the area is, the less sensitive the follower is to the stimulus in terms of speed difference. Therefore, according to the fitted parameters, the presence of novice drivers brings about negative impacts on driving safety, as they have a larger "indifference zone" area and higher response delays.
(iv) As can be drawn from the results above, the occurrence of rear-end crashes is explained by a competition between headway and sensitivity to speed difference. Smaller headway is more likely to bring about a tendency towards a rear-end crash in the metastable flow, whereas lower sensitivity to speed variation plays a more important role in such a tendency when driving under the congested phases.

Conclusion
The current study evaluates rear-end crash risk by developing a microsimulation in the framework of three-phase theory. The safety impact of driver experience heterogeneity is also considered as a major factor in the crash evaluation. A car-following model is developed by combining the classical action point paradigm and the IDM. This modeling methodology can account for human drivers' psychological factors in safety assessments. Field data collection is carried out to collect car-following data for novice drivers and experienced drivers, respectively. The data are used to calibrate the proposed car-following model. A surrogate safety measure is proposed based on the concept of integrating an individual's DRAC over spacing and time, which is suitable for simulation analysis. Moreover, the measure can simultaneously quantify the probability of a rear-end crash as well as the potential severity. The simulation results show that the proposed car-following model is capable of simulating typical traffic dynamics in the framework of three-phase theory. More specifically, the model reproduces the three phases, including free flow (F), synchronized flow (S), and wide moving jams (J). These complicated empirical findings are captured by introducing the "indifference zone" into the modeling of car-following behavior. It is worth mentioning that the "indifference zone" is correlated with human drivers' nonoptimal driving strategies, which can provide better performance in capturing near-accident scenarios.
In the safety assessment, traffic phase transitions lead to changes in driving behavior and interactions among vehicles, which cause the proposed IDRAC value to change with the corresponding traffic phase. This implies that the probability of a crash and the potential severity are related to the traffic flow phase. More specifically, safety performance achieves the highest level in free flow. Nevertheless, the probability and severity of crashes sharply increase in the metastable state, where the transition from free flow to synchronized flow occurs with some probability, and decrease when traffic completely evolves into synchronized flow. However, stop-and-go traffic in wide moving jams again brings about negative impacts on safety performance.
Driver experience heterogeneity has a profound impact on rear-end crashes. The presence of novice drivers affects the safety performance in two ways. On the one hand, novice drivers are at a disadvantage in responding to risky situations because of their larger "indifference zone" and longer reaction time, and this brings about a higher value of IDRAC under the synchronized phase and the jam phase. On the other hand, novice drivers tend to maintain a larger gap and slower speeds. In metastable flow, such drivers can stabilize traffic due to larger safety margins, reducing the crash probability and the potential severity. Moreover, the simulation results indicate that as the composition of driver experience becomes more heterogeneous, the safety performance deteriorates, which is reflected in a higher value of IDRAC.
The current study makes efforts to advance a comprehensive methodology for traffic safety evaluation through simulation. However, some problems need to be considered in future works. First, lane-changing behavior could be introduced into the simulation to analyze broader and more complicated situations, such as merging and diverging bottlenecks. Second, it is expected that the simulation performance would be improved by expanding the amount of driving behavior data with a naturalistic collection method. Finally, traffic control measures such as variable speed limits are correlated with crash risk, so their safety impacts in the three-phase theory could be evaluated by integrating them with the proposed simulation environment. We recommend that future studies address these issues.

Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest
The authors declare that they have no conflicts of interest.