Random Parameter Negative Binomial Model of Signalized Intersections

Factors affecting accident frequencies at 72 signalized intersections in Gyeonggi-Do (province) over a four-year period (2007-2010) were explored using the random parameters negative binomial model. The empirical comparison with the fixed parameters negative binomial model shows that the random parameters model outperforms its fixed parameters counterpart and provides a fuller understanding of the factors that determine accident frequencies at signalized intersections. In addition, elasticities and marginal effects were estimated to gain more insight into how one-percent and one-unit changes in the independent variables affect the dependent variable.


Introduction
Improvement of road safety has become an increasingly important issue throughout the road-construction process: planning, design, construction, and operation and maintenance. In many cases, road safety issues are analyzed by examining the relationships between accident frequencies and factors including traffic volume, weather conditions, characteristics of drivers and vehicles, and geometric conditions. Intersections in particular are important places where diverse treatments are needed for accident reduction because of the high incidence of vehicle-vehicle and vehicle-pedestrian conflicts [1][2][3][4][5][6]. Moreover, Korea is ranked first among OECD nations in number of traffic accidents, and more accidents have occurred in and around intersections than on other road segments in Korea. Studies have shown that 80.4 percent of places where more than 100 accidents have occurred in the past five years are in or around intersections [7]. For those reasons, improving safety at signalized intersections is one of the most important issues in safety improvement projects.
Accordingly, the purpose of this study was to investigate the relationships between accident occurrences and the intersection conditions that affect intersection safety. The data used in this analysis were from major intersections in Gyeonggi province, where about 25 percent of the entire Korean population (13 million people) live. Although numerous studies have attempted to understand the factors that influence accident frequencies at intersections in Korea using various statistical modeling methods (linear regression [8], Poisson model [9,10], negative binomial model [11][12][13], and logistic regression [14]), few researchers have used random parameters count models as a methodological alternative in accident frequency analysis [15,16].
From the perspective of statistical modeling, the Poisson and negative binomial count models have typically been used for traffic accident analyses. Negative binomial models in particular are commonly used because they account for the overdispersion problem that generally occurs in traffic accident frequency data [17][18][19][20][21][22][23]. Even when the more accurate negative binomial model is used instead of the Poisson model, the parameters of these traditional count models are assumed to be fixed when they can actually vary across observations; thus the heterogeneity problem remains unsolved. Because of unobserved heterogeneity, the effect of an independent variable on accident frequencies may vary across observations; one intersection with high traffic volume may have a higher frequency of accidents compared to a similar intersection with a lower traffic volume. Moreover, weather conditions such as heavy snow and rain could contribute to the frequency of vehicle collisions by limiting visibility and steering control; these are obviously factors in accident frequencies but are not revealed in aggregated accident data. Therefore, constraining the parameters to be constant (fixed) could lead to inconsistent and biased parameter estimates when parameters actually vary across observations [24].
This study investigated how heterogeneity effects vary across intersections and found specific heterogeneity effects arising from unobserved factors, including aspects of intersection geometrics, traffic characteristics, driver behavior, and other variables that are not accounted for in the data. These findings are useful for establishing effective and proactive safety guides and policies to improve safety at intersections. To achieve these objectives, a fixed parameters negative binomial model and a random parameters negative binomial model were developed, and the modeling results were compared to explore which approach may be appropriate (that is, the traditional fixed parameters negative binomial model may be suitable in some cases and the random parameters negative binomial model in others, so all candidate models need to be considered in accident frequency analysis).

Methodology
Since the number of accidents is a nonnegative integer, count data modeling is commonly used in accident frequency analysis. As mentioned above, Poisson and negative binomial models are the main methods of count data analysis, and the basic Poisson model is presented in (1):

$$P(n_i) = \frac{\exp(-\lambda_i)\,\lambda_i^{n_i}}{n_i!}, \qquad (1)$$

where $P(n_i)$ refers to the probability of intersection $i$ having $n_i$ accidents and $\lambda_i$ is the Poisson parameter for intersection $i$ (equal to the expected number of accidents at intersection $i$, $E[n_i]$). The Poisson model specifies the expected number of accidents $\lambda_i$ by using a log-linear function and is used when the data are not significantly overdispersed (overdispersion meaning $E[n_i] < \mathrm{VAR}[n_i]$):

$$\lambda_i = \exp(\beta \mathbf{X}_i), \qquad (2)$$

where $\mathbf{X}_i$ is a vector of independent variables and $\beta$ is a vector of estimable parameters. Conversely, a negative binomial distribution (based on a Gamma-distributed error term) is commonly used when the data are overdispersed by assuming

$$\lambda_i = \exp(\beta \mathbf{X}_i + \varepsilon_i), \qquad (3)$$

where $\exp(\varepsilon_i)$ is a Gamma-distributed error term with mean 1 and variance $\alpha$, and $\mathbf{X}_i$ and $\beta$ are the same as in the Poisson model. Although these count models have been successfully used to relate accident frequencies to variables associated with accident occurrences, the traditional Poisson and negative binomial models cannot account for heterogeneity across observations $i$. As a result, the standard errors of the regression coefficients can be underestimated ($t$-ratios inflated), which reduces the reliability of the entire model.
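The relationship between the Poisson and negative binomial specifications can be illustrated with a small simulation. The following is a minimal sketch, not the estimation code used in the study: the function names, the single covariate, and the numerical values are all hypothetical. It draws counts with mean $\exp(\beta x + \varepsilon)$, where $\exp(\varepsilon)$ is Gamma distributed with mean 1 and variance $\alpha$, so that $\alpha = 0$ reduces to the Poisson case.

```python
import math
import random
import statistics


def poisson_draw(lam, rng):
    """Draw one Poisson variate with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def simulate_counts(beta, x, alpha, n, seed=0):
    """Simulate n counts with mean exp(beta * x + eps).

    exp(eps) is Gamma distributed with mean 1 and variance alpha;
    alpha = 0 switches the error term off and gives a plain Poisson model.
    All argument values are illustrative, not estimates from the paper.
    """
    rng = random.Random(seed)
    base = math.exp(beta * x)
    counts = []
    for _ in range(n):
        # Gamma(shape=1/alpha, scale=alpha) has mean 1 and variance alpha.
        multiplier = rng.gammavariate(1.0 / alpha, alpha) if alpha > 0 else 1.0
        counts.append(poisson_draw(base * multiplier, rng))
    return counts


if __name__ == "__main__":
    for alpha in (0.0, 0.5):
        sample = simulate_counts(beta=1.0, x=math.log(10.0), alpha=alpha, n=20000)
        print(alpha, statistics.mean(sample), statistics.variance(sample))
```

With the Gamma term switched on, the sample variance should clearly exceed the sample mean, which is the overdispersion the negative binomial model is meant to absorb.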
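To account for problems of unobserved heterogeneity, Greene [25] developed simulated maximum likelihood estimation procedures to incorporate random parameters that could vary across observations into Poisson and negative binomial models. The estimable parameters are expressed by the following:

$$\beta_i = \beta + \varphi_i, \qquad (4)$$

where $\varphi_i$ is a randomly distributed term. The log-likelihood function can then be written as

$$LL = \sum_{\forall i} \ln \int_{\varphi_i} g(\varphi_i)\, P(n_i \mid \varphi_i)\, d\varphi_i, \qquad (5)$$

where $g(\cdot)$ refers to the probability density function of $\varphi_i$.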
Because the numerical integration of the Poisson and negative binomial models with random parameter distributions is computationally cumbersome, a simulation-based maximum likelihood method is used to maximize the simulated log-likelihood function. Halton draws were used for this purpose. Previous studies [15,16,[24][25][26][27] found that Halton draws provide a more efficient distribution of draws for numerical integration than random draws.
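The role of Halton draws in simulation-based estimation can be sketched as follows. This is an illustrative sketch under simplifying assumptions, not Greene's procedure: it uses a single normally distributed random parameter, a single covariate, and the Poisson form of the conditional probability, and all function names are hypothetical. The integral over the random parameter is approximated by averaging the conditional probability over Halton draws mapped through the inverse normal distribution function.

```python
import math
from statistics import NormalDist


def halton(index, base):
    """Return the index-th element (index >= 1) of the Halton sequence."""
    result, fraction = 0.0, 1.0 / base
    while index > 0:
        index, digit = divmod(index, base)
        result += digit * fraction
        fraction /= base
    return result


def normal_halton_draws(n_draws, base=2, mean=0.0, std=1.0):
    """Map Halton points in (0, 1) to normal draws via the inverse CDF."""
    dist = NormalDist(mean, std)
    return [dist.inv_cdf(halton(i, base)) for i in range(1, n_draws + 1)]


def simulated_poisson_prob(n_acc, x, beta_mean, beta_std, n_draws=200):
    """Approximate P(n_acc) by averaging P(n_acc | beta) over Halton draws.

    Assumes one covariate x and beta ~ Normal(beta_mean, beta_std).
    """
    total = 0.0
    for b in normal_halton_draws(n_draws, mean=beta_mean, std=beta_std):
        lam = math.exp(b * x)
        total += math.exp(-lam) * lam ** n_acc / math.factorial(n_acc)
    return total / n_draws


if __name__ == "__main__":
    print([halton(i, 2) for i in range(1, 5)])
    print(simulated_poisson_prob(n_acc=2, x=1.0, beta_mean=0.5, beta_std=0.3))
```

In a full estimation, the logarithm of this simulated probability would be summed over all observations and maximized with respect to the parameter means and standard deviations; with several random parameters, a different prime base is typically used for each one.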
Once the parameters are estimated, elasticities can be computed to measure the effect of the independent variables on accident frequency. Shankar et al. [20] recommended the use of elasticity, which can be roughly interpreted as the percentage change in the average frequency of accidents caused by a one-percent change in the independent variable and can be defined as

$$E^{\lambda_i}_{x_{ik}} = \frac{\partial \lambda_i}{\lambda_i} \times \frac{x_{ik}}{\partial x_{ik}}, \qquad (6)$$

which is the elasticity of accident frequency $\lambda_i$ with respect to the $k$th independent variable $x_{ik}$ for intersection $i$. With the negative binomial model (3), (6) gives

$$E^{\lambda_i}_{x_{ik}} = \beta_k x_{ik}. \qquad (7)$$

The elasticity in (6) is only valid for continuous variables such as traffic volume, lane width, and length and not for noncontinuous variables such as dummy variables that take a value of zero or one.
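The step from the definition of elasticity to its closed form under the negative binomial model can be written out explicitly. The derivation below is a short sketch that assumes the log-linear mean in (3); the symbol $v_{ik}$ for an untransformed traffic volume is introduced here only for illustration.

```latex
\begin{align}
\lambda_i &= \exp(\beta \mathbf{X}_i + \varepsilon_i),
  \qquad \beta \mathbf{X}_i = \sum_j \beta_j x_{ij} \\
\frac{\partial \lambda_i}{\partial x_{ik}} &= \beta_k \lambda_i \\
E^{\lambda_i}_{x_{ik}}
  &= \frac{\partial \lambda_i}{\partial x_{ik}} \cdot \frac{x_{ik}}{\lambda_i}
   = \beta_k x_{ik} \\
\text{if } x_{ik} = \ln v_{ik}: \quad
E^{\lambda_i}_{v_{ik}}
  &= \frac{\partial \lambda_i}{\partial v_{ik}} \cdot \frac{v_{ik}}{\lambda_i}
   = \frac{\beta_k \lambda_i}{v_{ik}} \cdot \frac{v_{ik}}{\lambda_i}
   = \beta_k
\end{align}
```

The last line shows that, for a variable entered in logarithmic form, the elasticity with respect to the untransformed variable is simply the parameter itself.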
Marginal effect [24] is another way to interpret the effect of an independent variable (typically an indicator or integer variable) on the dependent variable. It reflects the effect of a "one-unit" change in an independent variable on the dependent variable and is calculated as the partial derivative $\partial \lambda_i / \partial x_{ik}$. Although it sounds similar to elasticity, the marginal effect measures the absolute change in the dependent variable from a one-unit change in the independent variable, whereas elasticity measures the percentage change from a one-percent change.
It is common practice to report one or the other, but not both, since both elasticities and marginal effects describe the impact of specific variables [24].
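The difference between the two measures can be made concrete with a small numerical sketch. The coefficient and covariate values below are hypothetical and are not estimates from this study; the functions assume the log-linear mean $\lambda_i = \exp(\beta \mathbf{X}_i)$ with the error term omitted.

```python
import math


def expected_accidents(beta, x):
    """Expected accident frequency lambda = exp(sum_j beta_j * x_j)."""
    return math.exp(sum(b * v for b, v in zip(beta, x)))


def marginal_effect(beta, x, k):
    """Partial derivative of lambda with respect to x_k: beta_k * lambda."""
    return beta[k] * expected_accidents(beta, x)


def elasticity(beta, x, k):
    """Elasticity of lambda with respect to x_k: beta_k * x_k."""
    return beta[k] * x[k]


if __name__ == "__main__":
    # Hypothetical values: constant, number of lanes, lane width.
    beta = [1.0, 0.2, -0.05]
    x = [1.0, 4.0, 10.0]
    print(marginal_effect(beta, x, 1), elasticity(beta, x, 1))
```

The marginal effect depends on the level of $\lambda_i$ and is therefore usually averaged over observations, while the elasticity depends only on the parameter and the value of the variable.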

Model Estimation
3.1. Data. The data set used in this study is composed of accident records, traffic flow information, and geometric features for 76 signalized intersections over 4 years (2007-2010) in Gyeonggi province, Korea. Accident data were compiled from police reports, and data on traffic volumes and geometric conditions were compiled from field data collection and design drawings. Each intersection's geometric variables and traffic volumes for all movements on major and minor roads were used in establishing the models to analyze their relationships with accident frequencies at signalized intersections. The descriptive statistics for the primary variables used in the modeling are shown in Table 1.
As shown in Table 1, the mean and maximum annual crash frequencies are 10.6 and 57, respectively. The mean numbers of lanes on major and minor streets are 15.82 and 10.22, and lane widths on major and minor streets are 53.84 and 34.27 meters, respectively. In the estimated model, traffic volume was used in logarithmic form, which yielded statistically better fits. The number of entrances and exits had a maximum value of six on both major and minor roads. The length of left-turn exclusive lanes had a mean of 280 meters on major streets and 140 meters on minor streets, and the mean length of median barriers was 32 meters on major streets and 12 meters on minor streets, with maximum lengths of 280 meters and 140 meters, respectively. The last geometric variable, right shoulder width, was around 0.6 meters on both major and minor roads.

Model Results and Findings.
As the first step in developing the best-fitting model, all independent variables described in Table 1 were taken into consideration to find statistically significant variables, and insignificant variables were eliminated step by step. The fixed parameters negative binomial model without heterogeneity effects was estimated as well to determine which model best explained the relationship between geometric features, traffic volume, and traffic accident frequency. The negative binomial model was found to be more appropriate than the Poisson model since the dispersion parameter is statistically significant, with a $t$-value of 20.12. Based on this result, the negative binomial model was developed and provided as the final fixed parameters count model.
To develop the random parameters negative binomial model, simulation-based maximum likelihood with 200 Halton draws was used to estimate the parameters. A total of 200 Halton draws was selected because this number has been found to produce consistent and accurate parameter estimates [27][28][29][30], and estimation with Halton draws has also been validated in previous research [27,28,31] as a method for obtaining empirically accurate parameters. With regard to the functional form of the random parameters' densities, the normal distribution gave the best statistical results among the normal, uniform, and lognormal distributions. The model estimation results and the marginal effects and elasticities of the random parameters and fixed parameters models are presented in Tables 2 and 3, respectively.
First, the overall log-likelihood at convergence in the random parameters model, −937.131, shows a relative improvement over the log-likelihood starting value of −965.224, which is the baseline from the fixed parameters model. A total of 14 variables were found to affect intersection safety, 9 of which had random parameters with a statistically significant standard deviation, meaning that their effects on accident occurrences could vary across observations. Generally, a random parameter is adopted if the standard deviation of the parameter density is statistically significant. If the standard deviation is not statistically significant, the parameter is treated as fixed (constant) across the population. The following variables were found to have random parameters where the standard deviation of the parameter's distribution was statistically different from 0: the number of major road lanes, lane width of major roads, logarithm of heavy vehicle volume turning left on major roads, logarithm of total traffic volume driving straight on major roads, logarithm of total traffic volume turning left on major roads, logarithm of total traffic volume turning right on minor roads, length of median barrier on major roads, length of median barrier on minor roads, and the existence of a traffic island on major roads.
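One common way to formalize a comparison of two log-likelihood values is a likelihood ratio test. The sketch below is illustrative only and is not a test reported in this study. It assumes that −965.224 is the log-likelihood of the fixed parameters model and that the random parameters model adds nine parameters (one standard deviation per random parameter), and it uses the standard chi-square critical value for nine degrees of freedom at the 0.05 level.

```python
def likelihood_ratio_statistic(ll_restricted, ll_unrestricted):
    """LR statistic: -2 * (LL of restricted model - LL of unrestricted model)."""
    return -2.0 * (ll_restricted - ll_unrestricted)


# Log-likelihood values quoted in the text. Treating -965.224 as the
# fixed parameters log-likelihood is an assumption of this sketch.
LL_FIXED = -965.224
LL_RANDOM = -937.131

# Assumption: nine additional parameters, one standard deviation
# for each of the nine random parameters.
DEGREES_OF_FREEDOM = 9

# Chi-square critical value at the 0.05 level for 9 degrees of freedom.
CHI2_CRITICAL_05_DF9 = 16.919

lr_stat = likelihood_ratio_statistic(LL_FIXED, LL_RANDOM)
random_model_preferred = lr_stat > CHI2_CRITICAL_05_DF9

if __name__ == "__main__":
    print(f"LR statistic = {lr_stat:.3f}, df = {DEGREES_OF_FREEDOM}")
    print("random parameters model preferred:", random_model_preferred)
```

Under these assumptions the statistic is well above the critical value, which agrees with the conclusion that the random parameters model fits better.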
The interpretation of the parameters' effects begins with an analysis of the fixed parameters. The coefficient of left-turn exclusive lane length on major roads was shown to have a negative sign, meaning that the number of accidents decreases as the length of the left-turn exclusive lane increases. In other words, the length of the left-turn exclusive lane, that is, the capacity for left-turning vehicles, is an important factor in accident reduction because the number of conflicts between vehicles driving straight and vehicles turning left can be reduced by increasing the storage length available for the left-turn queue. Marginal effect values show that a one-meter increase in left-turn exclusive lane length results in an average decrease of 0.064 (random parameters model) and 0.09 (fixed parameters model) in the number of accidents. In the same vein, the existence of a left-turn exclusive lane on minor roads resulted in a reduction in average accident frequency.
It was found that the existence of a pedestrian crossing on minor roads decreases the frequency of accidents.Conversely, a pedestrian crossing on major roads does not have a statistically significant effect on the frequency of accidents.
The coefficients of existence of median barriers on both minor and major roads had a negative sign, which is consistent with the expectation that the frequency of crashes is lower in the presence of median barriers, which were installed to separate traffic flowing in two different directions.
In terms of the random parameters, the variable for the number of lanes on major roads has a normally distributed random parameter with a mean of 0.226 and a standard deviation of 0.043. This result indicates that the number of lanes on major roads positively affects accident frequency. In other words, the number of accidents increases as the number of lanes increases at most intersections. Since the number of lanes is correlated with exposure on the roads, the likelihood of vehicle crashes increases with the number of lanes. The average marginal effect for this variable shows that one additional lane on a major road results in an average increase of 1.414 (random parameters model) and 2.215 (fixed parameters model) in the number of accidents.
Lane width on major roads has a normally distributed random parameter with a mean of −0.065 and a standard deviation of 0.030. Given these distributional parameters, 98.56% of intersections show a decrease in accident frequencies as lane width increases, and 1.44% of intersections show an increase in crashes as lane width increases. This result implies that the likelihood of vehicle crashes usually goes down because wide lanes provide more comfort space (forgiveness) than narrow lanes for drivers, especially for drivers experiencing poor weather conditions. The parameter estimates translate into a unit increase in lane width of major roads decreasing the number of accidents by an average of 0.409 in the random parameters model and 0.729 in the fixed parameters model.
Four variables were found to have random parameters with respect to traffic volumes. The volume of heavy vehicles turning left on major roads has a normally distributed parameter with a mean of 0.511 and a standard deviation of 1.668, which is negative for 37.98% of intersections and positive for 62.02% of intersections under the normal distribution. In terms of elasticity, a 1% increase in the number of heavy vehicles turning left increases average accident occurrences by 0.51% (random parameters model) and 0.43% (fixed parameters model). The other random parameter variables related to traffic volumes included the volume of vehicles turning left on major roads and the volume of vehicles turning right on minor roads, both of which resulted in increasing accident frequency at most intersections. The parameter for total traffic volume driving straight on major roads had a mean of 0.271 and a standard deviation of 0.260, which means that the likelihood of accidents increases at some intersections and decreases at others (the reported shares are 36.89% and 63.11%, respectively). The elasticities of these variables indicate that a 1% increase in traffic volume contributes to 0.271% (going straight), 0.32% (turning left), and 0.66% (turning right) increases in the random parameters model and 0.26% (going straight), 0.37% (turning left), and 0.39% (turning right) increases in the fixed parameters model. These results can be related to exposure, as described above for the number of lanes. As exposure measures such as the number of lanes and traffic volume on the road increase, the likelihood of accident occurrences increases as well [32,33].
The lengths of median barriers on major and minor roads have normally distributed random parameters with means of 0.02 and 0.018 and standard deviations of 0.002 and 0.003, respectively, implying that accident frequency increases as the length of the median barrier increases at most intersections. This result is interesting given the negative sign found earlier for the existence of a median barrier. Although a median barrier generally improves traffic safety by separating opposing traffic flows [34,35], previous studies have reported conflicting results, with both positive and negative effects on crash occurrences [36,37]. Taken together with the result for the existence of a median barrier, this suggests that a median barrier reduces accident frequency but that, as its length increases, it can also become an obstacle to drivers, especially if it is installed on a narrow median. This might lead to a greater likelihood that the median barrier will be struck because of its nearness to moving vehicles [37].
Although the median barrier clearly has a positive effect in reducing accident severity, some contradictory results could be derived with respect to accident frequency [36,37].
The last variable with a random parameter is the existence of a traffic island on major roads. The estimated mean of −0.806 and standard deviation of 0.386 show that, under the normal distribution, 98.16% of intersections experience a reduced number of accidents while 1.84% of intersections experience an increased frequency of accidents. This reaffirms the traffic island's important role in reducing accidents between vehicles and pedestrians. However, traffic islands on minor roads were not found to be a statistically significant variable.
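The shares of intersections with negative and positive effects quoted above follow from the normal distribution assumed for each random parameter. The following minimal sketch, with a hypothetical helper name, evaluates the normal cumulative distribution function at zero using the rounded means and standard deviations quoted in the text.

```python
from statistics import NormalDist


def share_negative(mean, std):
    """Share of observations whose normally distributed parameter is below zero."""
    return NormalDist(mean, std).cdf(0.0)


# (mean, standard deviation) pairs as quoted in the text, rounded as reported.
RANDOM_PARAMETERS = {
    "lane width, major roads": (-0.065, 0.030),
    "heavy vehicles turning left, major roads": (0.511, 1.668),
    "traffic island, major roads": (-0.806, 0.386),
}

if __name__ == "__main__":
    for name, (mean, std) in RANDOM_PARAMETERS.items():
        negative = share_negative(mean, std)
        print(f"{name}: {negative:.2%} negative, {1 - negative:.2%} positive")
```

Because the inputs are rounded estimates, the computed shares can differ slightly in the second decimal place from the percentages reported in the text.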

Conclusions and Recommendations
The relationship between accident frequencies and various driving condition variables, including geometric conditions, traffic volume, and other variables at 76 signalized intersections from 2007 to 2010, was investigated using fixed parameters and random parameters models. Most agencies in Korea, such as institutes and corporations, use the fixed parameters model to predict the number of accidents and determine which road segments should be prioritized for safety improvement. The fixed parameters negative binomial method, however, has a significant limitation in accounting for the uncertainty and randomness in accident predictions.
The random parameters model suggested in this paper is an important methodological approach since it takes into account and corrects for heterogeneity that could arise from factors such as vehicles, road environment, weather, and other unobserved factors not captured in the data collection process. Its fit was found to be better than that of the existing fixed parameters model when some parameters were estimated as random while others were left fixed. This process could be confirmed by testing the variables' impact on accident frequencies [22]. Some geometric features can increase the likelihood of accidents in some locations while decreasing it in others.
Nine independent variables were found to have statistically significant random parameters that affect intersection safety differently at different places.These were the number of major road lanes, the lane width of major roads, the logarithm of heavy vehicle volume turning left on major roads, the logarithm of total traffic volume driving straight on major roads, the logarithm of total traffic volume turning left on major roads, the logarithm of total traffic volume turning right on minor roads, the length of median barrier on major roads, the length of median barrier on minor roads, and the existence of a traffic island on major roads.
The proposed model provides insights into the safety effects of intersection geometry that can benefit the construction of new intersections or the control of existing ones. For example, it was shown that left-turn exclusive lanes are an important factor for accident reduction because they provide enough length for the left-turn waiting queue. Based on this result, it can be concluded that left-turn exclusive lanes should be installed wherever enough space is available. In addition, determining the proper length of the left-turn exclusive lane based on traffic conditions related to intersection capacity is a further research issue for improving safety and efficiency at intersections. In terms of median barriers, the data in this study did not include the width of the median available for barrier installation, which numerous previous studies have included. By including the width data in future studies, guidelines for median barrier construction could be established to help improve intersection safety.
Although the random parameters model was shown to yield a better log-likelihood than the fixed parameters model in this investigation, the random parameters model is not always the best fit for all data. To find the best model, it is recommended that all candidate models be considered in accident frequency analysis. In addition, using a higher number of Halton draws is recommended to better ensure the quality of the estimates.
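In the random parameters specification $\beta_i = \beta + \varphi_i$, the term $\varphi_i$ refers to a randomly distributed term (e.g., a normal distribution with mean zero and variance $\sigma^2$). Here, the parameter becomes $\lambda_i \mid \varphi_i = \exp(\beta_i \mathbf{X}_i)$ in the Poisson model and $\lambda_i \mid \varphi_i = \exp(\beta_i \mathbf{X}_i + \varepsilon_i)$ in the negative binomial regression model.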

Table 1: Description and statistics of variables.

Table 2: Estimation results for random parameters and fixed parameters negative binomial models.

Table 3: Marginal effect/(elasticity) of random parameters and fixed parameters negative binomial models.