Evaluation of Roundabout Safety Performance through Surrogate Safety Measures from Microsimulation

The paper presents a microsimulation-based approach for roundabout safety performance evaluation. Based on a sample of Slovenian roundabouts, the vehicle trajectories exported from AIMSUN and VISSIM were used to estimate traffic conflicts using the Surrogate Safety Assessment Model (SSAM). AIMSUN and VISSIM were calibrated for single-lane, double-lane, and turbo roundabouts using the corresponding empirical capacity function, which included critical and follow-up headways estimated through meta-analysis. Based on the calibrated microsimulation models, a crash prediction model from simulated peak hour conflicts was developed for a sample of Slovenian roundabouts. A generalized linear model framework was used to estimate the prediction model based on field-collected crash data for 26 existing roundabouts across the country. Peak hour traffic distribution was simulated with AIMSUN, and peak hour conflicts were then estimated with the SSAM applying the filters identified by calibrating AIMSUN and VISSIM. The crash prediction model was based on the assumption that the crashes per year are a function of peak hour conflicts, the ratio of peak hour traffic volume to average daily traffic volume, and the roundabout outer diameter. Goodness-of-fit criteria showed that the model fitted the set of observations well, and better than the SSAM predictive model. The results highlighted that the safety assessment of any road unit may rely on surrogate safety measures, but it strongly depends on the microscopic traffic simulation model used.


Introduction
The concept of road safety refers to a property of some elements of the real world which are called units: a road segment, an intersection, a vehicle, or a person. According to Hauer [1], a key characteristic of a unit is that it may be involved in crashes and crashes may occur on it. Many research efforts have been devoted to the study of the relationship between crash history and road design/traffic variables using statistical models. Since regression analysis is used to develop crash prediction models, complete and updated crash databases must be available. Unlike statistical approaches to road crash data analysis, the traffic conflict technique allows road situations to be studied and traffic conflicts to be observed [2]. In recent years, traffic conflict techniques have been incorporated into traffic simulation models, thus providing considerable potential for proactive safety analysis [3]. Simulation-based surrogate safety measures have also been the subject of recent research [4]; they have been applied to evaluate the safety performance of any road unit using simulated vehicle trajectories exported from microscopic traffic simulation models. In this regard, the Surrogate Safety Assessment Model (SSAM) software processes trajectory outputs provided by traffic microsimulation models, identifies traffic conflict events by analysing vehicle-to-vehicle interactions, and categorizes the conflict events by type; the SSAM evaluates the surrogate safety measures for pairs of vehicles involved in a traffic conflict [5]. A simulation-based approach to assess road safety performance through the surrogate measures of safety will depend largely on the microscopic traffic simulation model which is applied. The trajectory files provided by microsimulation also depend on how the road unit is modelled and simulated. In view of the well-known potential of microsimulation software packages and the growing attention of transportation engineers to their use, calibration of these models should be carefully considered so as not to compromise their ability to reproduce real-world traffic conflicts.
Starting from these considerations, this paper describes a microsimulation-based approach for roundabout safety performance evaluation. The specific objective of the research is to show the methodological path used to develop a crash prediction model based on simulated conflicts. For these purposes, traffic conflicts are estimated with the SSAM software for each roundabout of the Slovenian sample using trajectory files generated by AIMSUN [6] and VISSIM [7], after calibration of the two software packages. Calibration is done for each type of roundabout (i.e., the single-lane roundabout, double-lane roundabout, and turbo roundabout) using the corresponding empirical capacity function, which incorporated the critical and follow-up headways estimated by meta-analysis [8]. The simulated vehicle trajectories of the roundabouts of the Slovenian sample were exported from AIMSUN and VISSIM and were used to develop a conflict analysis through the SSAM software. The idea behind the proposed approach for roundabout safety performance evaluation was to estimate the surrogate measures of safety based on a suitable setting of the SSAM filters so that the simulated outputs from AIMSUN and VISSIM were at a comparable level [9]. Then, a generalized linear model framework was used to estimate a prediction model based on crash data collected at Slovenian roundabouts. Since the technical literature still presents few studies which focus on the relationship between crashes and simulated traffic conflicts, especially at roundabouts, there is a gap in the current literature that this paper aims to address.
The paper is organized as follows. After a literature review on road safety evaluation based on traffic conflicts, including through microscopic traffic simulation models, the next sections present the crash dataset for the Slovenian sample of roundabouts, the method proposed to calibrate the microscopic traffic simulation models used, and the calculation of surrogate safety measures from microsimulation. Then, the development of a crash prediction model from simulated peak hour conflicts is described for the sample of 26 Slovenian roundabouts, and the results of validation of the proposed model are presented. Conclusions of the research and future developments of the work are explored in the concluding section.

Literature Review
Many safety studies using microscopic traffic simulation models rely on surrogate safety measures, which have been introduced to assess the safety performance of roads and intersections without waiting for a statistically significant number of real crashes to occur [10]. Different measures have been proposed; the most popular for simulation include time-to-collision, stopping distance index, modified time-to-collision, vehicle speeds, and headways [5]. The surrogate safety measures are based on the identification, classification, and evaluation of conflict events that occur during microsimulation. As introduced above, the Surrogate Safety Assessment Model (SSAM) reads trajectory files exported from microscopic simulation models and calculates the surrogate safety measures. This approach eliminates the subjectivity associated with the conventional conflict analysis technique and makes it possible to assess the safety performance of a road infrastructure under a controlled environment, before a crash occurs. Since a comprehensive review of the state-of-the-art in the area of road safety simulation models is beyond the research objectives, without being exhaustive we recall a recent study that analysed the geometric design of passing lanes and evaluated their optimal length using VISSIM and the SSAM software [11]. The results highlighted not only the fundamental role of geometric design in the safety performance of the 2+1 short passing lane, but also that the use of simulated traffic conflicts is a promising approach for road safety performance analysis. Wang et al. [12] used AIMSUN to simulate driver violating behaviours through user-defined add-ons, proposed a method for analysing collision risk of various driver violating behaviours, and examined the impact on motorway safety. The authors also highlighted the lack of violating behaviours in existing software that has made time-to-collision of stopping-sight-distance difficult to evaluate in current simulation environments. Kuang et al. [13] also verified whether the incorporation of the driver's perception-reaction time could improve the performance of a surrogate safety measure. To this end, they proposed modified surrogate indicators that consider the driver's perception-reaction time. Based on data collected on motorways, VISSIM was calibrated through error tests and trajectory comparison; the performance of the modified surrogate indicators was then evaluated using crash data. Huang et al. [14] classified traffic conflicts generated by the SSAM using vehicle trajectories from simulation; they derived reasonable estimates for field-measured traffic conflicts at signalized intersections. Essa and Sayed [15] also used the SSAM to estimate surrogate safety measures at signalized intersections in urban areas; they investigated the transferability of VISSIM calibrated parameters for safety analysis between different sites. The results confirmed that the use of simulation models to evaluate road safety without proper calibration should be avoided, and more work is needed to confirm that simulated conflicts represent safety measures beyond what can be expected from exposure. Vasconcelos et al. [16] evaluated the potential of the SSAM approach to assess the safety performance at urban intersections and roundabouts. Model validation was accomplished by comparing the number of conflicts obtained with the SSAM both with the number of crashes predicted by analytic models and with conflicts observed at existing intersections. Recently, Pratelli et al. [17] presented a procedure for analysing safety and operational improvements from conversion of traffic circles to modern roundabouts using AIMSUN and the SSAM software. However, despite the encouraging results, further case studies were needed to validate the proposed method. Despite some limitations related to the nature of the traffic microsimulation models used in the aforementioned studies, the SSAM analysis proved to be a promising approach to assess the safety of new intersection layouts.
A microsimulation-based approach could also be used to estimate the safety impact of autonomous vehicles (AVs) on road traffic, since AV technology has advanced in recent years with some automated features already available in vehicles on the market. Deluka Tibljaš et al. [18] have already analysed safety performance at roundabouts where different numbers of Conventional Vehicles (CVs) and AVs coexist in traffic. The simulations done with VISSIM and the SSAM gave some insights into how the introduction of AVs could change the operational and safety parameters at roundabouts. Other recent research focuses on the relationship between crashes and conflicts predicted by simulation models. Saleem et al. [3] developed crash prediction models from simulated peak hour conflicts for a group of urban signalized intersections and evaluated their predictive capabilities. Some case studies simulated with VISSIM and Paramics demonstrated the capability of microsimulation for estimating safety performance. Saulino et al. [19] investigated the use of simulated conflicts as possible surrogate safety measures for roundabouts, for which it has proven difficult to relate crashes to geometric characteristics. They applied microsimulation to estimate the number of peak hour conflicts for roundabout entries using a database of US roundabouts. Their results suggested that simulated conflicts can be considered as a surrogate measure for crashes at roundabouts after a proper calibration. Nevertheless, it should be noted that alternative methods have been developed and applied for safety evaluation at roundabouts. See Pilko et al. [20] for a new analytical approach that used multicriteria and simultaneous multiobjective optimization of geometric design, efficiency, and safety for a sample of Croatian single-lane roundabouts, while Hatami and Aghayan [21] investigated different types of roundabout layouts and analysed the effects of radius and speed variations on the roundabout performance through several scenarios defined in AIMSUN. However, it should be noted that only a few studies on the use of surrogate safety measures from microsimulation were based on field data or have calibrated conflicts for a specific road or intersection. Although a large number of practitioners and transportation engineers have been using traffic microsimulation in many practical applications during the last decade, the technical literature still presents few studies which focus on the relationship between crashes and simulated traffic conflicts, especially at roundabouts. Thus, there is a real knowledge gap in the current literature on estimation of surrogate safety measures at roundabouts that needs to be filled.

Materials and Methods
3.1. Crash Dataset. Keeping in mind the purpose of the study, a sample of roundabouts in operation in several municipalities and rural locations in Slovenia was first examined. Crash data were obtained from the Police database for a period of eight years (2009-2016). The dataset included information on the date and the time of day when crashes occurred, condition of signs and markings, environmental conditions including pavement and presence of work zones, type and number of involved users, manoeuvres and the road the users came from, and values of Annual Average Daily Traffic (AADT) entering each roundabout. Only total crashes happening at each site were considered, for a total number of 162 crashes. The crashes occurring within 30 meters of the roundabout centre were also included. Twenty-six roundabouts were selected as a representative sample for the later analysis. Table 1 summarizes basic information on the selected roundabouts from Police reports, in some cases supplemented by Google Maps. The sample included 13 four-legged single-lane roundabouts, 5 double-lane roundabouts (one five-legged, one six-legged, and three four-legged), and 8 turbo roundabouts (five four-legged and three three-legged).
The roundabout features directly related to safety and operational performance were supplemented with field surveys. The Annual Average Daily Traffic (over the whole observed period) at turbo roundabouts was between 7,000 and 63,400 vehicles per day; it was between 15,812 and 26,050 vehicles per day for the single-lane roundabouts, while it was between 21,307 and 44,318 vehicles per day for the double-lane roundabouts. The analysis encompassed the turbo roundabouts built since 2009, some of which resulted from the reconstruction of existing intersections into turbo roundabouts; for this reason, just a few crashes were recorded. Table 2 summarizes the main statistics of crash, traffic, and geometric data of the roundabout data sample.

Calibration of Microscopic Traffic Simulation Models.
Before starting the calibration of AIMSUN and VISSIM, a sensitivity analysis was done to determine the model parameters having the greatest effect on simulated values of steady-state capacity as produced by the two software packages. Although the literature proposes a wide range of methodologies for the calibration of simulation models, there have been no attempts to find general calibration principles based on collective knowledge and experience [26]. Thus, the entry capacity simulated for every category of roundabout was compared to the most well-known empirical capacity function based on the model proposed by [27]; each category (single-lane roundabout, double-lane roundabout, and turbo roundabout) was assumed to be representative, in terms of geometric design and behavioural parameters, of the corresponding roundabouts of the dataset. Each capacity function included behavioural headways that were collected in the field and then combined in a meta-analysis by [8]. For each entry lane, the empirical capacity functions based on a meta-analytic estimation of the critical and follow-up headways represented the target values of empirical capacity to which the simulated capacities were compared; see [28] for the potential that a single (quantitative) meta-analytic estimate provides compared to the results of individual studies on the parameters of interest. Table 3 shows the geometric design and behavioural parameters of every roundabout category used to calibrate AIMSUN and VISSIM.
It should be noted that the geometric design of the single-lane roundabout and the double-lane roundabout is consistent with the classification of roundabouts worldwide [29,30]. The geometric design of the single-lane roundabout and the double-lane roundabout considered here also complies with the Italian standards [31] for the compact roundabout and conventional roundabout, respectively. The design features of the double-lane roundabout also correspond to the layout of the typical double-lane roundabout as proposed by [32], Appendix A, Exhibit A-7. The turbo roundabout design met the turbo geometry presented by [25]. Each roundabout type was then modelled in AIMSUN and VISSIM (see Figures 1-3) in accordance with the geometric parameters in Table 3.
In order to assess each roundabout with the SSAM, the roundabouts were then simulated with the desired traffic conditions. Saturated conditions were achieved at entry lanes, so that the maximum number of vehicles entering the roundabout corresponded to the capacity value of each entry lane.
Note: (a) the acceptance range for AIMSUN model parameters is given by the upper and lower bounds used for GA calibration [23,24]; (b) the same values of the model parameters were used for each entry lane [25]; (c) the model parameter ranges from a minimum of 0.90 to a maximum of 1.30, as AIMSUN proposes; (d) the same GEH indexes were obtained for each entry lane.
A genetic algorithm-based calibration procedure had been developed by [23,24] to determine the parameters of AIMSUN for the single-lane and the double-lane roundabouts.
In order to calibrate AIMSUN and reproduce realistic traffic on roundabouts, the reaction time, the minimum headway, and the speed acceptance were used as the model parameters.
For the turbo roundabout layout under examination, the AIMSUN calibration was made in a previous work [25]; the reaction time and the minimum headway were used as the model parameters. Table 4 exhibits the default and calibrated parameters of the roundabout models built in AIMSUN.
Based on [26], the GEH index was used to accept (or reject) the model; $\mathrm{GEH}_i$ is expressed as follows:

$$\mathrm{GEH}_i = \sqrt{\frac{2\,(x_i - y_i)^2}{x_i + y_i}} \tag{1}$$

A model is taken to reproduce the empirical capacity data if the deviation between the simulated ($x_i$) and empirical ($y_i$) capacities, as measured by $\mathrm{GEH}_i$, is smaller than 5 in at least 85% of the cases. Thus, a GEH index equal to 100% means that the deviation between the simulated and empirical capacities of the entry lanes is smaller than 5 in 100% of the cases. Note that the acceptance range for the AIMSUN model parameters is given by the upper and lower bounds used for GA calibration [23,24], while in other cases the acceptance ranges for each parameter are the default ones of the microsimulation model used.
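The acceptance rule can be sketched in a few lines of Python. This is only an illustration, assuming the standard form of the GEH statistic; the function names and the capacity values are hypothetical and are not taken from the study.

```python
import math


def geh(simulated: float, empirical: float) -> float:
    """GEH statistic for one simulated/empirical capacity pair (veh/h)."""
    total = simulated + empirical
    if total == 0:
        return 0.0  # both capacities are zero: treat as a perfect match
    return math.sqrt(2.0 * (simulated - empirical) ** 2 / total)


def geh_index(simulated, empirical, threshold=5.0):
    """Share of capacity pairs whose GEH statistic is below the threshold."""
    pairs = list(zip(simulated, empirical))
    if not pairs:
        raise ValueError("at least one capacity pair is required")
    below = sum(1 for x, y in pairs if geh(x, y) < threshold)
    return below / len(pairs)


def model_accepted(simulated, empirical, min_share=0.85):
    """Accept the model if GEH < 5 in at least 85% of the cases."""
    return geh_index(simulated, empirical) >= min_share


# Hypothetical entry-lane capacities (veh/h), for illustration only.
simulated_capacity = [1000.0, 1100.0, 700.0, 905.0]
empirical_capacity = [1000.0, 1000.0, 1000.0, 900.0]
share = geh_index(simulated_capacity, empirical_capacity)
print(f"GEH index: {share:.0%}, accepted: {model_accepted(simulated_capacity, empirical_capacity)}")
```

With these made-up values, one of the four pairs exceeds the threshold, so the share falls below 85% and the model would be rejected.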
In order to calibrate the roundabouts in VISSIM, the Wiedemann 74 model integrated in the PTV VISSIM software (version 10) was selected. The average desired distance between stopped cars, ranging from -1.0 m to +1.0 m (with a standard deviation of 0.3 m), the additive part of desired safety distance, and the multiplicative part of desired safety distance were used as model parameters; for these last two parameters VISSIM proposes no range of variation. Calibration in VISSIM was done manually by simulating several replications and adjusting the model parameters between successive simulation runs. The optimal setting of the calibration parameters in VISSIM for each roundabout category was as follows.
(i) The Single-Lane Roundabout: average standstill distance, default 2.00 m, calibrated 5.10 m; additive part of desired safety distance, default 2.00 m, calibrated 3.60 m; multiplicative part of desired safety distance, default 3.00 m, calibrated 1.80 m
(ii) The Double-Lane Roundabout (Right Lane): average standstill distance, default 2.00 m, calibrated 1.80 m; additive part of desired safety distance, default 2.00 m, calibrated 3.05 m; multiplicative part of desired safety distance, default 3.00 m, calibrated 4.75 m
(iii) The Double-Lane Roundabout (Left Lane): average standstill distance, default 2.00 m, calibrated 4.50 m; additive part of desired safety distance, default 2.00 m, calibrated 5.00 m; multiplicative part of desired safety distance, default 3.00 m, calibrated 5.00 m
(iv) The Turbo Roundabout (Right Lane and Left Lane): average standstill distance, default 2.00 m, calibrated 5.00 m; additive part of desired safety distance, default 2.00 m, calibrated 3.10 m; multiplicative part of desired safety distance, default 3.00 m, calibrated 1.50 m
Note that the GEH index was below 50% for each roundabout entry lane when the default values of the model parameters were used; it was greater than 87% when the calibrated values of the model parameters were used. Only for the left entry lane of the turbo roundabout was the GEH index below 85%, but only a small number of GEH values were just over 5; thus, the model was accepted. Finally, the entry lane capacities simulated with AIMSUN and VISSIM were compared to the empirical capacity functions introduced above; this was done to verify that the calibrated models in VISSIM were actually comparable to the calibrated models in AIMSUN. Three origin-destination matrices of traffic flow percentages were simulated for the calibrated models of the roundabouts, as they were representative of the most critical operating conditions observed in the field (Table 5). In order to guarantee a basis for a homogeneous comparison, an iterative procedure based on [29] was implemented to ensure a desired (pre-fixed) saturation ratio at each roundabout entry and to calculate the total entering flows for each matrix of traffic flow percentages (Table 5). For these purposes, we used the capacity formula proposed by [33]; thus, the entering flows with a saturation ratio of 0.60 were calculated. For the roundabouts under examination, the corresponding origin-destination matrices were obtained from the matrices in Table 5. For each roundabout of the sample the trajectory files were obtained. In order to produce the trajectory data for each roundabout in Table 1, more than 15 replications of simulation were done in both AIMSUN and VISSIM for the calibrated models; the duration of each replication did not exceed an hour. The 5 simulations that best replicated the origin-destination matrices were then selected.

Calculation of Surrogate Safety Measures from Microsimulation.
The SSAM software analysed vehicle-to-vehicle interactions to identify conflict events and recorded all events happening during the simulation [34]. For each conflict event, the SSAM software calculated the surrogate safety measures from the TRJ files, separately generated by AIMSUN and VISSIM, including the following [5]: the minimum time-to-collision, the minimum post-encroachment time, the initial deceleration rate, the maximum deceleration rate, the maximum speed, and the maximum speed differential. The default filters of the SSAM were not changed during the initial phase of analysis; they were then changed in order to better compare the results obtained by processing the TRJ files from AIMSUN and VISSIM. Table 6 shows the mean values of normalised total conflicts given by AIMSUN and VISSIM for the roundabouts under examination and the origin-destination matrices of traffic flow percentages in Table 5. More specifically, the values in Table 6 are the total conflicts for each roundabout and each origin-destination matrix divided by the total simulated entering flow. Table 6 shows that the normalised total conflicts were smaller for the single-lane roundabouts than for the double-lane and turbo roundabouts (in case a and case b) with TRJ files generated by AIMSUN and the default filters of the SSAM. Again, the normalised total conflicts were higher at the turbo roundabouts than at the double-lane roundabouts (in case a and case b) with TRJ files generated by AIMSUN and the default filters of the SSAM.
However, Table 6 shows differences in the mean values of the normalised total conflicts between the SSAM filter-based total conflicts, calculated when the appropriate filter values were used, and the total conflicts calculated with the default filters of the SSAM.
Note: (b) the mean values of the normalised total conflicts calculated using the TRJ files generated by AIMSUN, both when the default filters of the SSAM were not changed and when the appropriate filters were applied; (c) the mean values of the normalised total conflicts calculated using the TRJ files generated by VISSIM, both when the default filters of the SSAM were not changed and when the appropriate filters were applied.
In order to identify which settings influenced the results of the SSAM software, a sensitivity analysis was then carried out. After several trials, the parameters with the greatest effect on the SSAM results were the time-to-collision (TTC) [3,35], the post-encroachment time (PET) [3,35], and the maximum speed (MaxS) [3]. It should be noted that smaller values of TTC and PET during a traffic conflict correspond to a greater probability of a collision. Moreover, a TTC equal to 0 is, by definition, a collision; in turn, the value of PET, by definition, should be greater than the TTC [5]. The optimal setting, obtained for the aforementioned parameters and the examined cases, was as follows:
(i) TTC: the default value of the maximum TTC is 1.50 s; since 1.50 s can be considered the maximum threshold of TTC [35], the maximum threshold of TTC was kept equal to 1.50 s
(ii) PET: the default value of the maximum PET is 5.00 s, while the maximum threshold of PET was set equal to 2.50 s, except for double-lane roundabouts, where a maximum PET of 1.90 s was set for the conflicts produced with TRJ files generated by VISSIM; this last value was based on what the SSAM recorded with the TRJ files generated by AIMSUN
(iii) the minimum thresholds of TTC and PET were set equal to 0.10 s; TTC and PET values equal to zero are mere processing errors and were deleted [3]
(iv) MaxS: the minimum threshold values are equal to 1.00 m/s for the single-lane roundabouts and 1.18 m/s for the turbo roundabouts; the MaxS filter was not changed for the double-lane roundabouts
(v) a filter around the intersection area was applied to retain only the conflicts falling within 30 meters before each roundabout entry, since VISSIM identified several conflicts very far from the intersection area that had to be excluded
The results of SSAM filter-based total conflicts in Table 6 show a good fit for the frequency of conflicts derived from the two microsimulation models. Indeed, for the traffic cases in Table 5, the percentage difference of total conflicts calculated with AIMSUN and VISSIM was below 40 per cent. Student's t-test was also carried out to compare the filter-based total conflicts obtained with the SSAM. Figure 4 shows the t-test results for AIMSUN versus VISSIM at the roundabouts under examination; see [36] for more in-depth details. The t-test gave nonsignificant results for the single-lane and turbo roundabouts; statistical significance was found, in particular at the 0.05 level, for the double-lane roundabouts. Based on the above results, traffic conditions and roundabout schemes can have an important effect on roundabout safety: the single-lane roundabout seems less safe than the turbo roundabout in case b (Table 5); unlike cases a and b, double-lane roundabouts are less safe than the single-lane and the turbo roundabouts in case c (Table 5), where, unlike case b, the percentage of right turns is higher than that of left turns.
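As an illustration of how the filter settings listed above act on a set of conflict records, the following Python sketch applies the TTC, PET, MaxS, and distance thresholds to a list of conflicts. It does not reproduce the SSAM software or its file format: the record fields, the class names, and the zero lower bound used where the MaxS filter was left unchanged are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Conflict:
    """Hypothetical conflict record; field names are illustrative only."""
    ttc: float                # minimum time-to-collision (s)
    pet: float                # minimum post-encroachment time (s)
    max_speed: float          # maximum speed, MaxS (m/s)
    distance_to_entry: float  # distance upstream of the nearest entry (m)


@dataclass(frozen=True)
class Filters:
    ttc_max: float = 1.50        # maximum TTC threshold (s)
    pet_max: float = 2.50        # maximum PET threshold (s)
    time_min: float = 0.10       # minimum TTC and PET threshold (s)
    max_speed_min: float = 0.0   # assumed placeholder for an unchanged MaxS filter
    area_limit: float = 30.0     # retained distance before each entry (m)


# Settings reported in the text for each case.
SINGLE_LANE = Filters(max_speed_min=1.00)
TURBO = Filters(max_speed_min=1.18)
DOUBLE_LANE_VISSIM = Filters(pet_max=1.90)


def keep(conflict: Conflict, f: Filters) -> bool:
    """True if the conflict passes every threshold in the filter set."""
    return (
        f.time_min <= conflict.ttc <= f.ttc_max
        and f.time_min <= conflict.pet <= f.pet_max
        and conflict.max_speed >= f.max_speed_min
        and conflict.distance_to_entry <= f.area_limit
    )


def apply_filters(conflicts, f: Filters):
    return [c for c in conflicts if keep(c, f)]


# Made-up records, for illustration only.
records = [
    Conflict(ttc=1.2, pet=2.0, max_speed=5.0, distance_to_entry=10.0),
    Conflict(ttc=0.0, pet=0.0, max_speed=4.0, distance_to_entry=5.0),   # processing error
    Conflict(ttc=1.0, pet=1.5, max_speed=1.1, distance_to_entry=10.0),
    Conflict(ttc=1.0, pet=1.5, max_speed=6.0, distance_to_entry=45.0),  # too far away
]
print(len(apply_filters(records, SINGLE_LANE)))
```

Keeping each setting in a small immutable object makes it easy to compare runs for different roundabout types without changing the filtering logic.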

Fitting a Crash Prediction Model Based on Simulated Conflicts
Once the frequency of conflicts obtained by AIMSUN and VISSIM had been made comparable by setting some filters of the SSAM as introduced above, and the conditions under which a safety analysis could be independent of the software being used had been examined, a conflict-based crash prediction model was developed using AIMSUN. Unlike conventional crash prediction models, where crashes per year are the dependent variable and the average daily traffic is the main independent variable, simulation is typically done at the peak hour level. Thus, peak hour traffic was simulated with AIMSUN and peak hour conflicts were then estimated. Ten replications were performed for each roundabout, and the resulting TRJ files generated from AIMSUN were processed with the SSAM software to identify conflicts based on the procedure described in the previous sections. Table 7 summarizes the main statistics by type of conflict and for total conflicts of all the roundabouts of the sample in Table 1. However, only total conflicts were considered to fit the model, since conflict counts by type were low except for the rear-end type.
In order to develop a prediction model for total crashes versus total conflicts, peak hour conflicts were modelled against crashes per year (occurring during all hours) by incorporating an extra variable to capture the effect of the ratio of peak hour traffic volume to average daily traffic volume [3]; only the outer diameter was introduced as a further covariate of the model, while other covariates were not significant.
It should be noted that a sensitivity analysis was done to test several geometric and traffic features (i.e., entry width, ring width); however, only the variables that were significant were selected as the explanatory variables of the model. Based on the state of the art in safety modelling [37], a generalized linear model framework, as available in the statistical package GenStat [38], was used to fit the model. Since the data had a variance slightly larger than expected under the assumption of a Poisson distribution (i.e., the variance is equal to the mean), the equidispersion assumption was relaxed to avoid model specification errors. The most common approaches are a quasi-likelihood with Poisson-like assumptions (the quasi-Poisson from now on) and a Negative Binomial model; these models are derived from the Poisson model and allow the mean to differ from the variance when data exhibit overdispersion [39,40]. However, the statistical literature, especially for the regression case, offers little guidance on when a quasi-Poisson or a Negative Binomial error structure should be specified [41]. Since, for any given dataset, one can find cases where each model produces a good fit to the data, goodness-of-fit criteria helped us to choose between the two models. First, in order to employ the regression technique to relate the actual crash frequency to the AIMSUN-simulated conflict frequency predicted by the SSAM, the functional form of the model was selected. Real-life crashes and conflicts were assumed to be discrete random events with a non-normal error structure [5]. Consistent with the model forms introduced for the conflict prediction models [3], the power function was assumed here and used to develop the total crash model as follows:
$$E[Y] = \alpha \, X_1^{\beta_1} X_2^{\beta_2} X_3^{\beta_3}$$
where $E[Y]$ is the expected number of total crashes per year (i.e., the dependent variable), $X_i$ ($i = 1, 2, 3$) are the explanatory variables, and $\alpha$ and $\beta_i$ ($i = 1, 2, 3$) are the regression parameters to be estimated using the maximum-likelihood procedure. The peak hour conflicts ($X_1$) generated from the AIMSUN simulation and the SSAM analysis, the peak hour traffic ratio ($X_2$), that is, the ratio of peak hour traffic volume to average daily traffic volume, and the outer diameter ($X_3$) of the selected roundabouts were used as the explanatory variables of the model. The peak hour ratio was considered an explanatory variable since it could vary from roundabout to roundabout and depended on the road classification, location, day, date, and time of the peak hour counts. Table 8 shows the parameter estimates with two different distributions in the GLM framework. The constant ($\alpha$) was not statistically significant in either model, while the estimates of $\beta_1$, $\beta_2$, and $\beta_3$ were statistically significant (at the 5% or 10% level) in both cases. The table also shows the measures of goodness-of-fit discussed in [42]: (1) the mean prediction bias (MPB), where a positive (or negative) MPB denotes that a model overpredicts (or underpredicts) crashes; (2) the mean absolute deviation (MAD), which measures the average dispersion of the model; (3) the mean square prediction error (MSPE), which is used in conjunction with the mean squared error (MSE): an MSPE higher than the MSE indicates that the models are overfitting the data and that some of the observed relationships may be spurious rather than real. Other measures of goodness-of-fit were the mean error (ME) and the mean normalized error (MNE), which are useful when applied separately to measurements at each location instead of to all measurements jointly [26]. Table 8 also shows the GEH index (see (1)) and the Pearson product-moment correlation coefficient ($r_{Pearson}$) between observed and predicted crashes. As further information about the goodness-of-fit, the method of cumulative residuals (CURE) was applied, as discussed in the next section.
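As a rough illustration of the power-function model form and of some goodness-of-fit measures named above, the following Python sketch may help. The names `alpha`, `betas`, `power_model`, `goodness_of_fit`, and `geh` are illustrative assumptions, not part of the original analysis (which was carried out in GenStat). The formulas follow common textbook definitions (MPB as the mean of predicted minus observed, MSE divided by the sample size minus the number of parameters, GEH in its usual form, assumed to match (1)), and the numbers are made up rather than taken from Table 8.

```python
import math

def power_model(alpha, betas, xs):
    """Expected crashes per year under a power form: alpha * prod(x_i ** beta_i)."""
    mu = alpha
    for beta, x in zip(betas, xs):
        mu *= x ** beta
    return mu

def goodness_of_fit(observed, fitted, n_params):
    """MPB, MAD, MSE and Pearson r for paired observed and fitted crash counts."""
    n = len(observed)
    errors = [f - o for o, f in zip(observed, fitted)]  # fitted minus observed
    mpb = sum(errors) / n                       # > 0 means overprediction
    mad = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / (n - n_params)
    mean_o = sum(observed) / n
    mean_f = sum(fitted) / n
    cov = sum((o - mean_o) * (f - mean_f) for o, f in zip(observed, fitted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_f = sum((f - mean_f) ** 2 for f in fitted)
    r = cov / math.sqrt(var_o * var_f)
    return {"MPB": mpb, "MAD": mad, "MSE": mse, "r": r}

def geh(modelled, observed):
    """GEH statistic in its commonly cited form (assumed here)."""
    return math.sqrt(2.0 * (modelled - observed) ** 2 / (modelled + observed))

# Hypothetical data, for illustration only.
observed = [1, 2, 3, 4]
fitted = [2.0, 2.0, 2.0, 6.0]
stats = goodness_of_fit(observed, fitted, n_params=2)
print(stats)
```

The MSPE would be computed like the MSE, but on a validation sample and divided by the validation sample size; comparing the two is what flags possible overfitting.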

Results and Discussion
The results in Table 8 show a reasonably good fit to the data; however, the quasi-Poisson model fits the data better than the Negative Binomial (NB) model and gives slightly better prediction accuracy: the mean prediction bias (MPB) of the quasi-Poisson model was lower than that of the NB model, as were the mean absolute deviation (MAD) and the mean error (ME). For the quasi-Poisson model, the MSPE was also lower relative to the MSE than for the other model; neither model, however, showed signs of overfitting, since both had an MSPE lower than the MSE, which suggests that no important variables were omitted and that the models were not misspecified.
Note (Table 8): $N$ is the data sample size; $\hat{y}_i$ is the fitted value of $y_i$, which is the actual measurement; $\bar{\hat{y}}$ is the mean value of the fitted values, while $\bar{y}$ is the mean value of the actual measurements; dof stands for degrees of freedom; $r_{Pearson}$ stands for the Pearson product-moment correlation coefficient. (*) Note that in GenStat the dispersion parameter (fixed or estimated) is used when calculating standard errors and standardized residuals. In models with the Poisson and negative binomial, as well as geometric and exponential distributions, the dispersion should be fixed at 1 unless a heterogeneity parameter is to be estimated.
Comparisons between models, however, are not always easy; the differences in goodness-of-fit can suggest cases in which models could be improved, but improvements might be difficult to obtain. The GEH index and the Pearson coefficient also highlighted how well the models fit the set of observations; however, the Pearson coefficients for both models showed marginal differences in goodness-of-fit that could be explained by random fluctuations in the observed data, however negligible. As further information about the goodness-of-fit, the method of cumulative residuals (CURE) was applied and CURE plots were developed [1]. The residuals, defined as the difference between the actual and the fitted values for each observation unit, were arranged in increasing order of the fitted value and accumulated over the observation units. Figure 5 shows how well the model under the quasi-Poisson assumption fits the data as a function of a specific variable of interest; for this comparison, the total conflicts were selected as the variable of interest. The cumulative residuals on the vertical axis were plotted against the total conflicts on the horizontal axis. The indication is that the fit is fairly good, especially for the quasi-Poisson model, since the cumulative residuals, oscillating around the value of 0, lie between the confidence limits of the standard deviation ($\pm 2\sigma^*$). Although a horizontal stretch of the CURE plot corresponds to a region of the variable where the estimates can be unbiased, the CURE plot for the quasi-Poisson model (see Figure 5(a)) is inside the confidence limits; thus, one can observe that the calibrated model fits the data very well, while for the Negative Binomial model a portion of the CURE plot was outside the confidence limits (see Figure 5(b)). In order to assess the overall quality of the model fit [1], the fitted value-based CURE plots were prepared both for the quasi-Poisson model (Figure 6(a)) and for the SSAM model (Figure 6(b)), which is a nonlinear regression model for crashes as a function of total conflicts [5].
In Figure 6, each plot shows how well (or poorly) the model predicts, not for a specific variable but overall, as a function of the number of crashes expected on each unit. The CURE plot in Figure 6(a) for the quasi-Poisson model is closer to a random walk around the horizontal axis than the plot in Figure 6(b), and it is inside the confidence limits. The CURE plot of the SSAM model for total crashes versus total conflicts shows long increasing and decreasing runs corresponding to regions of consistent over- and underestimation [1]. In this case, the capability of the SSAM crash-conflict model to predict real-world crashes, when compared with actual crash experience at Slovenian roundabouts, is reduced. The occurrence of traffic conflicts was also sensitive to the site configuration, priority rules, and other parameters in the microsimulation. This confirms again that the safety assessment of a road entity based on surrogate measures of safety is influenced by the microscopic traffic simulation model used.
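The CURE construction described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure actually used for Figures 5 and 6: the function name `cure_plot` is hypothetical, the residual is taken as observed minus fitted, the units are sorted by fitted value, and the $\pm 2\sigma^*$ limits use the commonly cited form $\sigma^{*}(n) = \sqrt{\sigma^2(n)\,[1 - \sigma^2(n)/\sigma^2(N)]}$, where $\sigma^2(n)$ is the running sum of squared residuals.

```python
import math

def cure_plot(observed, fitted):
    """Return sorted fitted values, cumulative residuals and 2*sigma* limits."""
    # Arrange the observation units in increasing order of the fitted value.
    order = sorted(range(len(fitted)), key=lambda i: fitted[i])
    cum_res, cum_sq = [], []
    running, running_sq = 0.0, 0.0
    for i in order:
        residual = observed[i] - fitted[i]  # actual minus fitted
        running += residual
        running_sq += residual * residual
        cum_res.append(running)
        cum_sq.append(running_sq)
    total_sq = cum_sq[-1]
    limits = []
    for s in cum_sq:
        if total_sq > 0:
            var_star = max(s * (1.0 - s / total_sq), 0.0)
        else:
            var_star = 0.0
        limits.append(2.0 * math.sqrt(var_star))
    xs = [fitted[i] for i in order]
    return xs, cum_res, limits

# Hypothetical data, for illustration only.
xs, cum_res, limits = cure_plot([1, 3, 2], [2.0, 1.0, 3.0])
inside = all(abs(c) <= l + 1e-12 for c, l in zip(cum_res, limits))
print(xs, cum_res, limits, inside)
```

To plot against another variable of interest, such as total conflicts, the same accumulation would be done after sorting by that variable instead of by the fitted value.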

Conclusions
This paper addresses the evaluation of roundabout safety performance through surrogate safety measures from microsimulation. Roundabouts were selected since they are becoming increasingly attractive to transportation engineers, and the effectiveness of proper measures and assessment tools for road safety management is still being studied. Based on a sample of Slovenian roundabouts, surrogate safety measures were obtained through microscopic traffic simulation models; then a crash prediction model from simulated peak hour conflicts was developed.
For these purposes, the vehicle trajectory records exported from AIMSUN and VISSIM were used to estimate traffic conflicts through the SSAM. AIMSUN and VISSIM were calibrated for single-lane, double-lane, and turbo roundabouts using the corresponding empirical capacity function, which included critical and follow-up headways estimated through meta-analysis. In order to bring the simulated traffic conflicts from VISSIM and AIMSUN to a comparable level, some SSAM filters were set iteratively (i.e., setting lower values of the TTC and PET than the default values, and eliminating the conflicts corresponding to a zero value of TTC and PET). The effect of different traffic scenarios on roundabout safety performance was also tested. It was noted that a different flow distribution provided a different number of conflicts at roundabouts; there was a traffic scenario that provided more (potential) crashes than other scenarios for the same roundabout category.
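The filtering step described above could look roughly like the following Python sketch. Everything here is hypothetical: the record layout (dictionaries with `ttc` and `pet` keys, in seconds), the function name `filter_conflicts`, and the threshold values, which are placeholders rather than the calibrated values of this study. The zero-value rule is read here as dropping a record when either measure is zero, which is one possible interpretation.

```python
def filter_conflicts(conflicts, max_ttc, max_pet):
    """Keep conflicts with 0 < TTC <= max_ttc and 0 < PET <= max_pet."""
    kept = []
    for c in conflicts:
        # Drop records with a zero TTC or PET (assumed reading of the rule).
        if c["ttc"] <= 0.0 or c["pet"] <= 0.0:
            continue
        # Drop records above the (lowered) thresholds.
        if c["ttc"] > max_ttc or c["pet"] > max_pet:
            continue
        kept.append(c)
    return kept

# Hypothetical conflict records and thresholds, for illustration only.
records = [
    {"ttc": 0.0, "pet": 0.0},
    {"ttc": 1.0, "pet": 2.0},
    {"ttc": 1.4, "pet": 4.5},
    {"ttc": 0.5, "pet": 0.0},
]
kept = filter_conflicts(records, max_ttc=1.2, max_pet=3.0)
print(len(kept), kept)
```

In an iterative setting, the thresholds would be lowered step by step and the filtered conflict counts from the two simulators compared after each step.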
Once the outputs from the two microsimulation software packages had reached a comparable level, a crash prediction model for the sample of Slovenian roundabouts was developed. Although a large number of practitioners and transportation engineers have been using traffic microsimulation in many practical applications during the last decade, the technical literature still presents few studies that focus on the relationship between crashes and simulated traffic conflicts, especially at roundabouts. This is the gap in the current literature that the paper aimed to address. A generalized linear model framework was used to estimate the prediction model based on traffic and crash data collected in the field at 26 existing roundabouts. Peak hour traffic distribution was simulated with AIMSUN, and peak hour conflicts were then estimated with the SSAM. The model was developed with crashes per year as the dependent variable and with peak hour conflicts, the ratio of peak hour traffic volume to average daily traffic volume, and the outer diameter as independent variables. The CURE plots also showed a good quality of fit.
Two main conclusions, which are also useful for professional practice, may be derived from the research results. The comparison between the surrogate measures of safety based on the simulated trajectories derived from AIMSUN and VISSIM provided insights on how to configure the SSAM settings so that the outputs from AIMSUN and VISSIM reach a comparable level. The outcome of this first activity represented the starting point to address issues associated with the development of safety prediction models for roundabouts based on surrogate measures of safety. Although the paper does not address a model selection problem (to be solved by a data-driven method), it shows how intersection safety can be estimated by using simulated conflicts instead of real crash data and other covariates. The coefficient estimates of the crash-conflict model based on real data were statistically significant; however, the model was quite different from the model recommended by the SSAM to identify conflicts from traffic simulation. Nevertheless, it should be noted that the results are based only on a sample of 26 roundabouts within the same country. Thus, future research efforts could be directed at acquiring further roundabout data from other sources in order to improve the statistical link between observed crashes and simulated measures of safety. Further roundabout data, together with other traffic scenarios to be tested, could also improve the reproducibility and accuracy of the simulated output and provide a better explanation of the actual crashes.
Since the results, within the limits of this study, confirm that surrogate measures of safety strongly depend on the microscopic traffic simulation model used, they are sufficiently encouraging to continue this line of research.
The results confirmed that the safety assessment of any road entity may rely on surrogate measures of safety, and that simulated conflicts are a promising approach for roundabout safety evaluation. Fundamental design considerations should also be evaluated at the planning level to better understand potential impacts for each roundabout alternative. Designing a roundabout, indeed, requires the optimal balance between safety, operational performance, impacts, and so on, given the constraints of the site under evaluation. Future developments may concern the use of surrogate measures as a sound basis for comparing the performance of alternative intersection types. Traffic microsimulation could be a valuable approach to investigate how safety and operational conditions will change when conventional vehicles (CVs) and autonomous vehicles (AVs) coexist in traffic, since the introduction of on-road AVs will inevitably transform the criteria for road network design, traffic modelling, and road safety management. In this view, automated road safety analysis based on reliable safety evaluation tools using surrogate safety measures can be useful to provide prompt safety estimates and to address innovative vehicle and infrastructure developments.

Figure 1: The single-lane roundabout model in simulation environment.

Figure 2: The double-lane roundabout model in simulation environment.

Figure 3: The turbo roundabout model in simulation environment.

Figure 4: t-test results for VISSIM versus AIMSUN at (a) single-lane roundabouts, (b) double-lane roundabouts, and (c) turbo roundabouts. Note: $T_{critical}$ ($\alpha = 0.05$) = 2.31; $T_{critical}$ ($\alpha = 0.01$) = 3.36; average means the mean value of total conflicts in simulation replications; the t-test was not significant for single-lane and turbo roundabouts in cases a, b, and c, while the t-test was significant for the double-lane roundabout in cases a and b (at the 0.05 level) and case b (at the 0.01 level).

Table 2: The main statistics of the roundabout data sample.

Table 3: Geometric design and behavioural parameters of every roundabout category.

Table 4: Default and calibrated values of the model parameters in AIMSUN.

Table 7: Summary of main statistics for type of conflict and total conflict.

Table 8: Parameter estimates for crash models based on AIMSUN simulated conflicts and goodness-of-fit.