Development of Accident Frequency Models with Random Parameters on Interstate Roadway Segments with and without Lighting Systems

This study explored factors affecting traffic accidents on roadway segments with and without lighting systems using a random parameter negative binomial model. It sought to overcome a shortcoming of the fixed parameter model, which constrains the estimated parameters to be fixed across observations, by applying random parameters that can account for unobserved heterogeneity. Three of the nine significant variables had random parameters in segments with lighting systems, while seven of the eleven significant variables had random parameters in segments without lighting systems. The different influences of interstate highway geometrics on vehicle crashes with and without lighting systems, found in this study while accounting for unobserved heterogeneity, may help reduce accident frequencies and inform decisions on installing lighting systems on interstate highways in the future.


Introduction
A high number of crashes still occur on the roads despite excellent design guides and advanced technologies for highway design and construction. In this situation, road and traffic authorities are working hard to reduce traffic accident frequency and minimize accident severity. Finding the causes of traffic accidents is the first and most important step in developing effective countermeasures, and a majority of studies have analyzed these causes using statistical modeling approaches. Count data models such as Poisson and negative binomial models have been widely used to explore the causes of traffic accidents on road segments and at intersections [1-7]. Models derived from them, such as zero-inflated negative binomial models [8,9] and negative binomial models with random effects [10], have also been used.
All of the statistical prediction models introduced above have been used to investigate the causes of crashes and to reduce the number of traffic accidents. However, they assume that the parameters of the independent variables (geometrics or traffic volumes) are constant/fixed across observations (segments or intersections). In other words, the magnitude of each independent variable's influence on accident frequency is assumed to be the same for every observation. A previous study pointed out that using a constant/fixed parameter for an effect that could vary widely across actual observations could lead to inconsistent and biased results [11]. Another study noted that count models using fixed parameters could lead to underestimated standard errors of the regression coefficients and inflated t-ratios [12]. The reason for this problem is that unobserved heterogeneity, such as differences in geometrics, vehicles, and human factors, could create variation in the impact of a variable on accident occurrence [13]. Those factors are not captured in the accident data. To account for this unobserved heterogeneity, random parameter models, which assume that the estimated parameters may vary across observations/individuals, were introduced. Previous studies suggested that using random parameters to mitigate the shortcomings of conventional fixed parameter Poisson and negative binomial models and to account for heterogeneity is beneficial, even though the overall explanatory power of fixed and random parameter negative binomial models is similar [12,14,15].
The purpose of this study is to determine how geometric elements of interstate highways influence traffic crashes on roadway segments where lighting systems are or are not installed. The results could support a lighting system installation guideline that determines whether a segment needs a lighting system. In Korea in particular, lighting systems are installed only at interchange segments and rest areas on expressways, and most roadway segments on expressways have no lighting systems because the lighting system installation guidelines do not recommend them. However, many researchers have pointed out the need for lighting systems on expressway roadway segments, and the installation guideline may include guidance on such segments in the future [16]. Therefore, this study was conducted to find further evidence on the need to install lighting systems along entire expressway segments to increase safety, by comparing accident frequency models for segments with and without lighting systems. For this purpose, all records of roadway segments (except interchange segments) from the entire network of 7 interstates (I-5, 82, 90, 182, 205, 405, and 705), with a total length of 1,528 miles in Washington State, USA, over a 9-year period were used.
Overall, analyzing traffic accidents on roadway segments in relation to lighting systems with a random parameters model should provide substantial insight into the effect of road geometrics on the likelihood of accidents with and without lighting systems. Furthermore, the results of this study could help set future policies or guidelines for the installation of lighting systems on expressways in Korea.

Methods
Generally, count models such as Poisson and negative binomial models are used to analyze crash frequency, which takes nonnegative integer values. A Poisson regression model is used when the mean of the data equals the variance of the data ($E(n_i) = \mathrm{Var}(n_i)$). On the other hand, a negative binomial regression model is used when the data are overdispersed ($\mathrm{Var}(n_i) > E(n_i)$), which arises in most accident frequency data. The negative binomial model is derived by considering an error term as

$$\lambda_i = \exp(\beta X_i + \varepsilon_i), \quad (1)$$

where $\lambda_i$ is the expected accident frequency for observation $i$, $X_i$ is a vector of independent variables, $\exp(\varepsilon_i)$ is a gamma-distributed error term with mean 1 and variance $\alpha$, and $\beta$ is a vector of estimated parameters.
By integrating out the error term, the marginal distribution is derived in the form of a negative binomial distribution as follows:

$$P(n_i) = \frac{\Gamma(1/\alpha + n_i)}{\Gamma(1/\alpha)\, n_i!} \left(\frac{1/\alpha}{1/\alpha + \lambda_i}\right)^{1/\alpha} \left(\frac{\lambda_i}{1/\alpha + \lambda_i}\right)^{n_i}, \quad (2)$$

where $\Gamma(\cdot)$ is the gamma function and $\alpha$ refers to the dispersion parameter. Using (2), the variance of observations is allowed to differ from the mean:

$$\mathrm{Var}(n_i) = E(n_i)\,[1 + \alpha E(n_i)]. \quad (3)$$

Here, the Poisson regression model can be regarded as a limiting case of the negative binomial model as $\alpha$ approaches 0. Therefore, the negative binomial model is more suitable than the Poisson regression model when $\alpha$ is statistically significantly different from 0, while the Poisson regression model is more suitable in other cases [11]. However, the traditional Poisson/negative binomial model assumes that parameters are fixed across observations. In the presence of unobserved heterogeneity across observations, this leads to underestimated standard errors and therefore overestimated t-values for the estimated coefficients. To address this problem, Greene developed a method to incorporate random parameters in count models [17], in which the parameters are expressed as follows:

$$\beta_i = \beta + \varphi_i, \quad (4)$$

where $\varphi_i$ is a randomly distributed term. With this specification, the conditional mean becomes $\lambda_i \mid \varphi_i = \exp(\beta_i X_i)$ in the Poisson regression model and $\lambda_i \mid \varphi_i = \exp(\beta_i X_i + \varepsilon_i)$ in the negative binomial regression model. In this sense, the log-likelihood can be rewritten as

$$LL = \sum_{\forall i} \ln \int_{\varphi_i} g(\varphi_i)\, P(n_i \mid \varphi_i)\, d\varphi_i, \quad (5)$$

where $g(\cdot)$ refers to the probability density function of $\varphi_i$. Since it is difficult to numerically integrate count models over a random parameter distribution, a simulation-based maximum likelihood method was employed, which maximizes the simulated log-likelihood function. Previous studies have shown that Halton draws can be used for this simulation-based estimation [18-20].
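The simulated log-likelihood described above can be sketched in a few lines. This is a minimal illustration, not the estimation code used in the study: it assumes a single normally distributed random slope, a negative binomial kernel with mean `lam` and dispersion `alpha`, and Halton draws in base 2. The function names (`halton`, `nb_log_pmf`, `simulated_log_likelihood`) and the toy data are hypothetical.

```python
import math
from statistics import NormalDist


def halton(index, base):
    """Return element `index` (1-based) of the Halton sequence in `base`."""
    result, fraction = 0.0, 1.0 / base
    while index > 0:
        index, digit = divmod(index, base)
        result += digit * fraction
        fraction /= base
    return result


def nb_log_pmf(n, lam, alpha):
    """Log probability of count n under a negative binomial with mean lam
    and dispersion alpha (variance = lam * (1 + alpha * lam))."""
    r = 1.0 / alpha
    return (math.lgamma(r + n) - math.lgamma(r) - math.lgamma(n + 1)
            + r * math.log(r / (r + lam))
            + n * math.log(lam / (r + lam)))


def simulated_log_likelihood(counts, x, beta0, beta_mean, beta_sd, alpha,
                             draws=200):
    """Simulated log-likelihood with one normally distributed slope.

    For each observation, the probability is averaged over `draws`
    Halton-based normal draws of the slope, then logged and summed.
    """
    normal = NormalDist()
    total = 0.0
    for i, (n_i, x_i) in enumerate(zip(counts, x)):
        prob_sum = 0.0
        for d in range(1, draws + 1):
            # Each observation uses its own stretch of the Halton sequence.
            z = normal.inv_cdf(halton(i * draws + d, 2))
            slope = beta_mean + beta_sd * z
            lam = math.exp(beta0 + slope * x_i)
            prob_sum += math.exp(nb_log_pmf(n_i, lam, alpha))
        total += math.log(prob_sum / draws)
    return total


# Toy data (hypothetical): three segments with one covariate each.
counts = [3, 0, 5]
x = [1.2, 0.4, 2.0]
print(simulated_log_likelihood(counts, x, 0.1, 0.5, 0.2, 0.8))
```

In an actual estimation, this function would be passed to a numerical optimizer that searches over the intercept, the slope mean and standard deviation, and the dispersion parameter. Setting `beta_sd` to zero collapses the sketch to the fixed parameter negative binomial log-likelihood, which is a convenient sanity check.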
After the estimation of parameter coefficients, the relative significance of the variables was verified. To accomplish this, the elasticity, which measures the relative effect of a variable on accident frequency, was used as recommended in [1]. Elasticity can be roughly interpreted as the percentage change in the average accident frequency due to a one-percent change in the independent variable, and it is computed as follows:

$$E_{x_{ik}}^{\lambda_i} = \frac{\partial \lambda_i}{\lambda_i} \times \frac{x_{ik}}{\partial x_{ik}}. \quad (6)$$

This equation gives the elasticity of accident frequency $\lambda_i$ with respect to the $k$th independent variable $x_{ik}$ for observation $i$.
The marginal effect was also used to inspect the relative effects of specific independent variables. It approximates the amount of change in the dependent variable produced by a 1-unit change in the independent variable and was calculated as the partial derivative $\partial \lambda_i / \partial x_{ik}$.
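As a short worked derivation, both measures can be written in closed form under the log-linear mean used in count models. The sketch below assumes $\lambda_i = \exp(\beta X_i)$, with $\beta_k$ the coefficient on the $k$th variable $x_{ik}$; the symbols are chosen here for illustration, and with random parameters $\beta_k$ would be replaced by the observation-specific value.

```latex
% Assumption: log-linear mean, \lambda_i = \exp(\beta X_i) = \exp\big(\sum_k \beta_k x_{ik}\big)
\begin{align*}
\text{Marginal effect:}\quad
  \frac{\partial \lambda_i}{\partial x_{ik}} &= \beta_k \lambda_i, \\
\text{Elasticity:}\quad
  \frac{\partial \lambda_i}{\partial x_{ik}} \cdot \frac{x_{ik}}{\lambda_i} &= \beta_k x_{ik}, \\
\text{Elasticity when } \ln x_{ik} \text{ enters the model:}\quad
  \frac{\partial \ln \lambda_i}{\partial \ln x_{ik}} &= \beta_k .
\end{align*}
```

The last line explains why, for variables entered in logarithmic form, the estimated coefficient can be read directly as an elasticity.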
To judge the overall goodness-of-fit of the estimated model, McFadden's pseudo $\rho^2$ is used. It compares the log-likelihood with random parameters against the log-likelihood with fixed parameters by calculating the following ratio:

$$\rho^2 = 1 - \frac{LL(\beta_{\text{random}})}{LL(\beta_{\text{fixed}})}. \quad (7)$$

Here, $LL(\beta_{\text{fixed}})$ is the value of the log-likelihood using the fixed parameters, and $LL(\beta_{\text{random}})$ is the value of the log-likelihood using the random parameters.

Data Description
Data used in this study were collected from Interstates 5, 82, 90, 182, 205, 405, and 705, the seven major interstates in Washington State, USA, over a span of nine years (1999 to 2007). The data include the number of crashes, directional traffic volumes, and geometric conditions. The data are structured as an unbalanced panel and are divided into roadway segments and interchange segments.
The reason for the unbalanced panel is that historical data on changes in geometric conditions and lighting systems were difficult to collect, so the data for geometric conditions and lighting systems were the same for all years. Accident counts and traffic volumes were the variables that changed year by year. These data were defined by mile post: the beginning mile post at the edge of the on-ramp and the ending mile post at the off-ramp defined the total length of the interchange segment for both directions. Roadway segments were defined as the segments starting from the ending mile post of one interchange and running to the beginning mile post of the next interchange. Of a total of 1,153 segments, 205 have lighting systems, 385 are without lighting systems, and the remaining 563 are interchange segments that are not considered in this study. Because interchange segments differ from roadway segments in segment function, traffic flow changes, weaving maneuvers, and complexity of infrastructure, the authors focused only on roadway segments in this study and left interchange segments for a future study.
Table 1 gives a description of the key variables used in developing the models, with descriptive statistics reported separately for segments with and without lighting systems. The average length of the segments with a lighting system is …
The number of lanes and the left/right shoulder widths are classified as "satisfying standard (standard)," "under value of design standard (low)," or "over value of design standard (high)" based on the manual (Table 2) proposed by the WSDOT (Washington State Department of Transportation). Each variable is expressed as the proportion of the segment that falls into the given class, so its maximum value cannot exceed 1. The proportion of segments designed with the number of lanes that the WSDOT recommends for interstate highways, 5 lanes in each direction, was almost zero. In both cases, 67% of segments had a left shoulder width meeting the WSDOT design standard (4 ft), while 59% and 70% of segments had a right shoulder width meeting the WSDOT design standard (10 ft). There were an average of two horizontal curves per segment, and an average of three vertical curves for segments with lighting systems and four for segments without lighting systems. Among the design criteria, grade of vertical curves, radius of horizontal curves, and region were not considered in this study due to data constraints.

Estimated Results
As explained above, traffic crash frequency prediction models were developed using random parameters negative binomial models for roadway segments with and without lighting systems, and the estimated results are shown in Tables 3 and 4.
To develop a model with the best statistical fit, all variables were first taken into consideration, and insignificant variables were then eliminated one at a time. Normal, uniform, and lognormal distributions were considered for the functional form of the random parameter density functions described in (4), and among them normal distributions provided the best statistical fit. The derived statistical models showed improvement over the baseline fixed parameter negative binomial model, with the log-likelihood improving from −16,959.74 to −5,719.92 for segments with lighting systems and from −23,444.89 to −9,277.67 for segments without lighting systems. The overdispersion parameter is statistically significant in both cases, which indicates that the negative binomial model is more suitable than the Poisson model in this study.
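The size of this improvement can be expressed with the goodness-of-fit ratio from the Methods section. The snippet below is a sketch that assumes the ratio takes the usual form of one minus the random parameter log-likelihood divided by the fixed parameter log-likelihood; the function name is hypothetical, and the inputs are the log-likelihood values reported above.

```python
def mcfadden_rho_squared(ll_fixed, ll_random):
    """Pseudo rho-squared comparing a random parameter model with its
    fixed parameter baseline (assumed form: 1 - LL_random / LL_fixed)."""
    return 1.0 - ll_random / ll_fixed


# Log-likelihood values as reported in the text.
with_lighting = mcfadden_rho_squared(-16959.74, -5719.92)
without_lighting = mcfadden_rho_squared(-23444.89, -9277.67)

print(round(with_lighting, 3), round(without_lighting, 3))  # 0.663 0.604
```

Under this assumed form, both models recover well over half of the baseline log-likelihood, with the larger relative gain on segments with lighting systems.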
A random parameter is valid when both the mean and the standard deviation of the parameter density are statistically significant. If the estimated standard deviation of the parameter is not statistically different from zero, the parameter is constant across the observations. On this basis, nine variables showed statistical significance in segments with lighting systems. Of these, three variables (segment length, proportion of segments with a low number of lanes, and proportion of segments with low left shoulder width) have random parameters, while the remaining six variables can be regarded as having fixed parameters. In segments without lighting systems, eleven variables showed statistical significance. Among them, seven variables, including average daily traffic volume, segment length, proportion of segments with a low number of lanes, proportion of segments with a standard number of lanes, proportion of segments with low left shoulder width, proportion of segments with standard left shoulder width, proportion of segments with standard right shoulder width, and proportion of segments with low right shoulder width, were identified as having random parameters. A more detailed description of each variable is provided for the model with lighting systems (Table 3), the model without lighting systems (Table 4), and the marginal effects/elasticities (Table 5).
As can be seen in the estimation results for segments with lighting systems (Table 3), traffic volume and segment length positively influence traffic crash occurrences. This is consistent with most previous studies, which showed that greater road exposure results in higher traffic crash frequencies [2,21]. A 1% increase in traffic volume translates into a 1.305% increase in average traffic accident frequency, which indicates an elastic response. The parameter for segment length is normally distributed with a mean of 0.075 and a standard deviation of 0.073. This means that longer segment length is associated with a decrease in accident frequency in 15.13% of the segments and with an increase in the remaining 84.87%.
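The split between segments with negative and positive effects follows from the normal distribution of the random parameter: the share of segments with a negative effect is the normal cumulative probability at zero. A minimal sketch, with a hypothetical function name and the rounded mean and standard deviation quoted above:

```python
from statistics import NormalDist


def share_below_zero(mean, sd):
    """Share of a normally distributed parameter that falls below zero."""
    return NormalDist(mu=mean, sigma=sd).cdf(0.0)


# Segment length parameter for segments with lighting systems.
negative_share = share_below_zero(0.075, 0.073)
print(f"{negative_share:.1%} negative, {1 - negative_share:.1%} positive")
```

With the rounded inputs, this gives roughly 15% negative and 85% positive, in line with the 15.13% and 84.87% reported; the small gap is presumably due to rounding of the published mean and standard deviation.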
The number of lanes in a segment is another variable associated with exposure, along with the traffic volume and segment length mentioned above. Both a standard and a low number of lanes were associated with more traffic accidents. The proportion of segments with a low number of lanes had a random parameter with a mean of 0.094 and a standard deviation of 0.281; it is associated with higher crash frequencies in 53.13% of the segments and with fewer crashes in 46.87% of the segments.
The proportion of segments with a low left shoulder width was associated with higher crash frequency, whereas a standard left shoulder width was associated with fewer traffic crashes. In particular, the standard left shoulder width had a random parameter with a mean of −0.020 and a standard deviation of 0.009. A standard left shoulder width is associated with fewer traffic accidents in 98.2% of the segments and with more accidents in the remaining 1.8%. This result suggests that constructing left shoulders with the standard width can help reduce traffic accidents. The proportion of segments with a standard right shoulder width had a fixed parameter associated with fewer traffic accidents. This result may reflect the importance of a standard shoulder width, which provides a more comfortable driving environment and helps reduce accident frequency.
Finally, more horizontal and vertical curves are associated with a higher number of traffic accidents. In terms of marginal effects, one additional curve is associated with an increase in traffic accidents of 0.58 for horizontal curves and 0.42 for vertical curves.
The estimated model for segments without lighting systems showed that traffic volume and segment length had a positive impact on traffic accident occurrences, and both variables were found to have random parameters. The mean and standard deviation for traffic volume were 0.076 and 0.074, respectively, indicating that 15.33% of segments were associated with fewer accidents and the remaining 84.67% with more accidents. In terms of elasticity, a 1% increase in traffic volume was associated with a 0.076% increase in traffic accidents. Segment length had a random parameter with a mean of 0.042 and a standard deviation of 0.179. This indicates that 40.6% of segments were associated with fewer accidents and 59.4% with more accidents. A 1% increase in segment length was associated with a 0.179% increase in traffic accidents, which indicates an inelastic response.
The proportions of segments with a standard and with a low number of lanes were both statistically significant and were associated with increases in traffic crashes. This is presumably because they are related to exposure, as in segments with lighting systems. The proportion of segments with a low number of lanes had a random parameter (mean of 0.011, standard deviation of 0.185), which indicated that 47.65% of segments were associated with fewer accidents and 52.35% with more accidents. The left shoulder width variable was statistically significant for both low and standard shoulder widths, with random parameters in both cases. The mean and standard deviation for low left shoulder width were 0.035 and 0.399, respectively; 46.54% of the segments were associated with fewer crashes, while 53.46% were associated with more crashes. In contrast, the standard left shoulder width was associated with fewer crashes in a majority of segments.
Right shoulder width was statistically significant in all cases (low, standard, and high right shoulder width). For the standard right shoulder width, the mean and standard deviation were −0.059 and 0.033, respectively, which showed that 96.27% of segments were associated with fewer traffic crashes, while 3.73% were associated with more traffic crashes. For high right shoulder width, a 1% increase had a negligible impact (−0.057%) on traffic accident reduction. Low right shoulder width was associated with fewer traffic crashes in 45.65% of the segments and with more traffic crashes in 54.35% of the segments (mean of 0.043, standard deviation of 0.389).
Finally, more horizontal and vertical curves were associated with increased traffic crashes. In terms of marginal effects, the addition of a curve in a segment was associated with increases in traffic crashes of 0.17 (horizontal curve) and 0.10 (vertical curve).
Taken together, exposure-related variables are associated with increased traffic crashes in both cases, so a lighting system is not highly recommended where high exposure rates are expected. Given this result, criteria for classifying exposure rates as low, medium, or high would need to be established in a future study.
A comparison of the elasticity values showed that, on segments where left and right shoulders are provided, the impact of the lighting system on accident frequencies is negligible. However, segments with the recommended shoulder width are expected to have lower traffic accident frequencies than segments with low shoulder width.
The results for curved segments showed that the lighting system did not help to improve safety: in terms of marginal effects, an additional curve had a smaller influence on crash occurrence in segments without lighting systems (0.17 for horizontal and 0.10 for vertical curves) than in segments with lighting systems (0.58 and 0.42, respectively).

Conclusions and Recommendations
In this study, crash occurrences on interstate highways in Washington State, USA, under two road lighting conditions, with and without a lighting system, were analyzed and compared using a random parameters modeling approach. The findings are as follows. First, variables related to exposure, including segment length and traffic volume, were associated with increases in traffic accidents. However, those variables influenced crashes differently under the two lighting conditions. In highway segments with lighting systems, only segment length had a random parameter, whereas both segment length and traffic volume had random parameters in segments without lighting systems. These results suggest the influence of factors that are not captured in the data (unobserved heterogeneity), such as headlights of other vehicles interfering with visibility or drowsiness during night driving. The number of lanes, both the standard number of lanes and the low number of lanes (random parameter), was associated with more traffic crash occurrences. In conclusion, on segments with a high exposure rate, the lighting system is not highly recommended in terms of accident frequency reduction. Before lighting system installation is considered, further research on setting standards/criteria for a high exposure rate in Korea would be needed.
Second, although a comparison of elasticity values showed that lighting systems had little impact on accident frequencies on segments with shoulders, a reduction in accident frequency is expected on segments with the recommended shoulder width. To reduce accident frequency, a larger proportion of segments should be provided with the recommended shoulder width where space allows.
In terms of curves, segments with lighting systems are expected to have more accidents than segments without lighting systems. Curved segments are more vulnerable than straight segments to the headlights of vehicles in the opposing lanes or of following vehicles. As such, more studies will be needed to identify the relationship among the illumination/luminance of road lighting systems, the impact of headlights, and more detailed geometric elements of curves.
Although this study explored the impact of lighting systems on accident frequencies, some caveats remain. First, although lighting is known to be one of the most effective safety countermeasures, the lighting infrastructure itself, typically the utility poles, can become an obstacle in fixed-object accidents. Lighting can therefore have both positive and negative impacts on crashes, as can be seen in the model results. This issue could be addressed in future studies by considering crash types. Similarly, although lighting systems are generally known to improve visibility for drivers, lighting can impair visibility in combination with the headlights of vehicles in the opposing or the same direction. Therefore, an appropriate lighting system design guide must account for the specific road geometric conditions together with the degree of lighting, traffic flows, and lighting layouts such as installation on the left side, the right side, or both sides of the driving lanes.
From a modeling perspective, temporal stability, which this study did not consider, needs to be analyzed, as it may have substantial implications for the Highway Safety Manual [22].
Finally, because this study used US interstate highway data and driver behavior and driving conditions may differ in other countries, the results of this study should be applied elsewhere with caution.

Table 1: Descriptive statistics of variables.

Table 2: Recommendation for geometric standard design criteria.

… and that of the segments without lighting systems is 2.09 miles. The corresponding average number of traffic accidents is 13.6 and 8.7 per year, respectively. The mean directional daily traffic volume is 17,800 for segments with lighting systems and 10,569 for segments without lighting systems. The logarithmic form of segment length and traffic volume is used in developing the model.

Table 3: Model results for segments with lighting systems.

Table 4: Model results for segments without lighting systems.