Identifying the Factors Contributing to Injury Severity in Work Zone Rear-End Crashes

Egypt's National Road Project is a large infrastructure project that aims to upgrade 2500 kilometers of the existing network and to construct 4000 kilometers of new roads to meet today's needs. The growing number of highway work zones creates challenges for traffic safety and mobility. To help mitigate these impacts, this paper investigates and identifies the factors that influence the severity of work zone rear-end crashes. A random parameter ordered probit model was applied to data from Egyptian long-term highway work zone projects covering the period 2010 to 2017. Speeding and foggy weather conditions were found to be best modeled as random parameters. In addition, there is a higher risk of rear-end crashes in work zones during weekends and at nighttime, and heavy and passenger vehicles are at greater risk. The findings of this study are expected to help transport agencies develop effective measures to ensure safe mobility across work zones.


Introduction
Approximately 10,466 people in Egypt died in road crashes in 2013. Such a high number of road traffic fatalities highlights how critical road safety is in Egypt. Young and middle-aged individuals have been the most vulnerable age groups, which has a severe impact on society and on Egypt's emerging national economy [1,2].
In the National Roads Project, more than 4000 kilometers of new roads are currently being constructed to strengthen the Egyptian road network. In addition, another 2500 kilometers of existing roads are reportedly being upgraded [3], which has directly led to an increase in the number of work zones. These work zones impede traffic and create conflicts between traffic flow and construction activities.
Rear-end crashes are among the most common crash types on highways, and the number of resulting injuries and fatalities is alarming. For instance, it has been shown that rear-end collisions account for 30% of all injuries and 29.7% of all property damage in the USA [4]. It has also been argued that rear-end crashes occur more often in highway work zones than in nonwork zones [5][6][7][8][9].
Egypt lacks published data that describe the severity of rear-end crashes and their relationship with work zones. Academics, practitioners, and government agencies therefore need to collaborate in this area, because identifying and investigating the critical factors that contribute to work zone rear-end crashes will help in developing appropriate and effective countermeasures to control the growing road safety problem on highways.

Literature Review
A sound knowledge of the factors related to work zone rear-end crashes is essential to ensure effective and efficient work zone safety. To this end, previous studies were reviewed with a focus on the factors that influence the severity and frequency of work zone crashes and rear-end crashes. The statistical injury severity models used by other researchers were also studied in order to select an appropriate model. The findings of these studies are summarized below.
In terms of work zone crashes, Zhang et al. developed a hybrid approach that combines factor analysis with an ordered probit model to carry out a comprehensive analysis of work zone crashes. The results showed that the crash type factor was significantly associated with work zone crash severity [10]. With a similar approach, Osman et al. used ordered probit and logit models to identify factors contributing to large truck crash injuries in work zones and found that daytime, speeding, and rural areas were associated with more severe injuries [11]. Ghasemzadeh and Ahmed also developed an ordered probit model to investigate the effect of weather on the severity of work zone crashes. They concluded that weather and lighting conditions were the most important factors influencing crash severity at work zones [12]. In another study, Bharadwaj et al. found driving behavior to be the most critical risk factor in work zone crashes [13]. Wei et al. investigated work zone crash severity under different light conditions, showing that the combination of nighttime, high speed, and driving under the influence or in poor lighting conditions leads to an increase in the injury rate of 72.7% [14]. Sze and Song examined the association between crash severity and common factors of work-zone-related crashes by applying a multinomial logistic regression model. The authors concluded that the vulnerability of road users, heavy vehicles, and daytime were significantly related to the severity of injuries in work zone crashes [15]. Long et al. conducted a study to examine the major factors contributing to work zone crashes. Based on the study results, the rear-end crash type was found to be the most significant factor, as it tends to intensify crash severity [16]. According to Harb et al., road type, age, gender, weather and lighting conditions, and drug and alcohol involvement were substantial risk factors for work zone crashes [17].
Yan et al. used multiple logistic regression to investigate risk factors of rear-end crashes on major roads with signalized intersections. The results identified seven environmental factors strongly associated with rear-end crash risk [18]. In the work of Wu et al., the speeds of the following and leading vehicles, differences in headway, and the density of fog were significantly related to the risk of rear-end collisions [19]. Mohamed et al. inferred that seven variables are substantial risk factors for rear-end crashes, including speed, driver experience, road type, and number of lanes [20]. Similarly, Yan and Radwan concluded that rear-end crashes at signalized intersections are associated with higher speed limits, daytime, and wet and slippery road surface conditions [21]. Li et al. evaluated how explanatory variables affect collision risk for three different types of collisions at freeway diverge areas. The researchers indicated that the outcomes of rear-end crashes were more serious than those of other collision types [22].
In terms of work zone rear-end crashes, many studies report a higher rate of rear-end crashes in work zones than in nonwork zones [5][6][7][8][9]. Qi et al. investigated rear-end crashes in work zones using the ordered probit model. However, the method has certain shortcomings, such as not considering drivers' gender and age, vehicle characteristics, and weather and lighting conditions [23]. Silverstein et al. used negative binomial (NB) regression and the multinomial logit (MNL) model to estimate how different factors cause fatal crashes in both work and nonwork zones. The findings showed that rear-end crashes have a higher probability of causing death in work zones than in nonwork zones [24]. Likewise, a comparison of work zone rear-end crashes in Singapore and Beijing revealed that trucks were at a higher risk of rear-end crashes, in particular when the heavy vehicle is leading [25]. Meng and Weng also suggested that the percentage of heavy vehicles influences the frequency of rear-end crashes in work zones [26].
For predicting the injury levels of collisions, the ordered probit model has been used to examine the factors that contribute to severe injury crashes [15,27,28]. Abdel-Aty applied the same technique to identify the critical factors behind injury severity in crashes on different roadway sections. The author stressed the value of this model for measuring crash injury severity, since it produced the best results while maintaining simplicity [16]. Another study confirmed the efficacy of this model for investigating crash severity [17]. Similarly, the ordered probit model has been employed to investigate the risk factors and the severity levels of injuries sustained in single-vehicle and two-vehicle collisions [18].
The random parameter ordered probit model is a generalization of the traditional ordered probit model that allows random regression coefficients, thereby capturing effects caused by differences in unobserved variables. The random regression parameters are assumed to follow an a priori chosen distribution, often uniform, triangular, or normal. Predictions based on random parameter ordered probit regression can be expected to be more accurate than, and statistically superior to, the results from the standard model [19][20][21][22][23].
Research results based on data from western countries may not be directly applicable to developing countries such as Egypt due to differences in roadway design, traffic characteristics, and driver behavior. The current study therefore aims to identify the factors that have a significant impact on the injury severity of vehicle occupants involved in work zone rear-end crashes. The impact of the identified factors on injury severity is investigated with a random parameter ordered probit model based on available Egyptian traffic data. To the authors' knowledge, no research on work zone rear-end crashes in Egypt has been published to date. The present study is therefore an attempt to bridge this knowledge gap.

Data Collection
In Egypt, the Ministry of the Interior has a traffic department whose key role is managing the national road crash database. For crashes that occur on federal highways, the Ministry of Transport regularly collects crash data. This paper investigates work zone rear-end crashes that occurred in 12 long-term highway maintenance and rehabilitation projects (with a duration greater than one year) during the period 2010 to 2017. A total of 1045 crash reports were identified within the studied period. Crash variables extracted from the database were classified into six categories: driver information, vehicle information, time of the crash, road characteristics, work zone information, and environmental conditions.
Since the level of injury is ordinal in nature, the injury severity variable was classified into three levels: no injury, injury, and fatal. The severity of a crash was determined by the highest injury severity sustained. For instance, a crash with at least one fatality is termed a fatal crash. Similarly, a crash with at least one injury and no fatality is classified as an injury crash.
Closing highways to traffic while maintenance and rehabilitation work is ongoing is very difficult. Sometimes half of the road has to remain open to traffic while work proceeds on the other half. Since this situation is inevitable, this paper takes the type of surface construction into account to uncover which surface conditions contribute to rear-end crashes in work zones. The type of surface construction for each crash is divided into five categories reflecting the state of the highway surface (Asphalt, Milling, Concrete, Removing Asphalt, and Base). The descriptive statistics and frequency distribution of the factors included in the analysis are reported in Table 1.
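The severity coding rule described above (the crash takes the highest severity sustained by anyone involved) can be sketched as follows. This is a minimal illustration, not the authors' processing code: the names `Severity` and `classify_crash` and the 0/1/2 integer coding are assumptions made for the example.

```python
from enum import IntEnum


class Severity(IntEnum):
    """Ordinal injury-severity levels (the 0/1/2 coding is an assumption)."""
    NO_INJURY = 0
    INJURY = 1
    FATAL = 2


def classify_crash(fatalities: int, injuries: int) -> Severity:
    """Return the highest severity sustained by anyone in the crash."""
    if fatalities < 0 or injuries < 0:
        raise ValueError("counts must be non-negative")
    if fatalities >= 1:          # at least one fatality -> fatal crash
        return Severity.FATAL
    if injuries >= 1:            # at least one injury, no fatality
        return Severity.INJURY
    return Severity.NO_INJURY


# Hypothetical crash reports: (fatalities, injuries)
reports = [(0, 0), (0, 2), (1, 3)]
levels = [classify_crash(f, i) for f, i in reports]
```

Using an ordered integer type keeps the ordinal nature of the response explicit, which is what the ordered probit model relies on.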

Methodology
The random-parameter ordered probit model is especially appropriate for investigating how levels of injury depend on circumstantial factors. The randomness of the parameters compensates for unknown latent variables, accounting for heterogeneity that the fixed-parameter model cannot capture. In order to study the rear-end crash data, we apply the model

$$y_i^* = \beta_i' x_i + \varepsilon_i, \tag{1}$$

where $i = 1, 2, \ldots, n$ is the index of observations, $y_i^*$ is the dependent variable for observation $i$, $x_i$ is a vector of covariates, $\beta_i$ is the parameter vector for observation $i$ with mean values $\beta$, and $\varepsilon_i$ is an error term assumed to be distributed as a standard normal random variable, $\varepsilon_i \sim N(0, 1)$.
We note that when the observed outcome $y_i$ is a binary variable and the parameters $\beta_i = \beta$ are fixed (nonrandom), we have the traditional probit model; when $y_i$ is an ordered variable with $I$ categories and $\beta_i$ are fixed, the model is an ordered probit.
The probability density function for the ordered probit model is

$$f(y_i \mid x_i, \beta_i) = \prod_{j=1}^{I} \left[\Phi(\mu_j - \beta_i' x_i) - \Phi(\mu_{j-1} - \beta_i' x_i)\right]^{y_{ij}}, \tag{2}$$

where $\mu_j$ are the threshold values for the ordinals (with $\mu_0 = -\infty$ and $\mu_I = +\infty$), $\Phi$ is the standard normal cumulative distribution function, and $y_{ij} = 1$ if observation $i$ falls in category $j$ and 0 otherwise. In (2), the parameter vector $\beta_i$ is allowed to be different for each observation $i$, so that the marginal effects on the dependent variable differ in the sample. The general assumption on the parameter vector is that it is drawn from some distribution $g(\beta_i, \theta)$, where the vector $\theta$ contains the parameters of the a priori determined distribution, often chosen as uniform, triangular, or normal. We here assume that $\beta_i$ is normally distributed, that is,

$$\beta_{ik} \sim N(\beta_k, \sigma_k^2) \tag{3}$$

for each component $\beta_{ik}$ in $\beta_i$, which generalizes the model to a random-parameter ordered probit model. In case all $\sigma_k = 0$, the model reduces to the fixed-parameter ordered probit. The estimation of the fixed-parameter vectors in the ordered probit model is performed by maximum likelihood (ML). In the random parameter case, it is necessary to resort to the simulated maximum likelihood (SML) method.
In the random-parameter ordered probit model, we need to estimate the two parameter vectors $\beta_i$ and $\theta$. Since $\beta_i$ is not observable, we integrate out $\beta_i$ from the conditional distribution (2) to obtain

$$P_i(\theta) = \int f(y_i \mid x_i, \beta_i)\, g(\beta_i, \theta)\, d\beta_i. \tag{4}$$

However, (4) has no closed-form solution, and so is solved by Monte Carlo integration, yielding an approximation $\tilde{P}_i(\theta)$ used as the factor in the maximum likelihood function. For any given parameter vector $\theta$, a sample value $\beta_{ir}$ of the parameter vector is obtained in draw $r$ from the assumed distribution with density $g(\beta_i, \theta)$, from which $\tilde{P}_i(\theta)$ is calculated for observation $i$ using

$$\tilde{P}_i(\theta) = \frac{1}{R} \sum_{r=1}^{R} f(y_i \mid x_i, \beta_{ir}) \tag{5}$$

for a total number of samples $R$. The simulated maximum likelihood estimator $\hat{\theta}_{SML}$ is chosen as

$$\hat{\theta}_{SML} = \arg\max_{\theta} \sum_{i=1}^{n} \log \tilde{P}_i(\theta). \tag{6}$$

It can be shown that the SML estimator is consistent and asymptotically normal under some regularity conditions. The performance of simulated maximum likelihood depends on a large number of samples, which can be very time-consuming. In order to keep the number of draws reasonably low, the points are drawn from a Halton sequence, which has better coverage than pseudo-random number generators. In the simulated maximum likelihood, $R = 200$ Halton draws were used, which have been shown to give accurate parameter estimates [20,21]. The $I$ categories are determined from the thresholds, and the probabilities of the ordered responses are given by the thresholds and the standard normal cumulative distribution function $\Phi$ as

$$P(y_i = j) = \Phi(\mu_j - \beta' x_i) - \Phi(\mu_{j-1} - \beta' x_i), \quad j = 1, \ldots, I. \tag{7}$$

The marginal effects are computed as follows, with the sample mean $\bar{x}$ as an argument for each category $j$:

$$\frac{\partial P(y = j)}{\partial x} = \left[\phi(\mu_{j-1} - \beta' \bar{x}) - \phi(\mu_j - \beta' \bar{x})\right] \beta, \tag{8}$$

where $\phi(\cdot)$ is the probability density function of the standard normal distribution.
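The simulation step can be illustrated with a short sketch that averages the ordered probit category probability over Halton draws of a single normally distributed coefficient. It is a simplified illustration under stated assumptions, not the estimation routine used in the study: the function names, the zero-based category coding, the threshold values, and the reuse of the same draws for every observation are all choices made for the example.

```python
from math import log
from statistics import NormalDist
from typing import Sequence, Tuple

STD_NORMAL = NormalDist()  # standard normal: cdf() and inv_cdf()


def halton(index: int, base: int = 2) -> float:
    """Return the index-th point (index >= 1) of the Halton sequence in (0, 1)."""
    value, fraction = 0.0, 1.0 / base
    while index > 0:
        index, digit = divmod(index, base)
        value += digit * fraction
        fraction /= base
    return value


def category_prob(y: int, xb: float, thresholds: Sequence[float]) -> float:
    """P(y = j) = Phi(mu_{j+1} - xb) - Phi(mu_j - xb) for zero-based category j."""
    cuts = [float("-inf"), *thresholds, float("inf")]
    return STD_NORMAL.cdf(cuts[y + 1] - xb) - STD_NORMAL.cdf(cuts[y] - xb)


def simulated_prob(y: int, x_random: float, xb_fixed: float, mean: float,
                   sd: float, thresholds: Sequence[float], draws: int = 200) -> float:
    """Average the conditional probability over draws of beta ~ N(mean, sd^2)."""
    total = 0.0
    for r in range(1, draws + 1):
        # Map a Halton point in (0, 1) to a normal draw of the coefficient.
        beta_r = mean + sd * STD_NORMAL.inv_cdf(halton(r))
        total += category_prob(y, xb_fixed + beta_r * x_random, thresholds)
    return total / draws


def simulated_loglik(data: Sequence[Tuple[int, float, float]], mean: float,
                     sd: float, thresholds: Sequence[float], draws: int = 200) -> float:
    """Sum of log simulated probabilities over (y, x_random, xb_fixed) records."""
    return sum(log(simulated_prob(y, xr, xf, mean, sd, thresholds, draws))
               for y, xr, xf in data)


# Hypothetical thresholds and records; only mean 1.54 and sd 1.69 come from the text.
THRESHOLDS = [0.0, 1.5]
toy_data = [(0, 1.0, 0.2), (2, 0.0, -0.1)]
toy_loglik = simulated_loglik(toy_data, mean=1.54, sd=1.69, thresholds=THRESHOLDS)
```

An optimizer would maximize `simulated_loglik` over the mean, the standard deviation, the fixed coefficients, and the thresholds. Setting `sd` to zero recovers the fixed-parameter ordered probit probability.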

Results and Discussion
Model Specification Tests. In this study, the statistical software R with the package Rchoice was used for model parameter estimation. A total of 1045 observations and 25 independent variables were considered. Each explanatory variable in the data set was first tested for multicollinearity using the Variance Inflation Factor (VIF). The VIF quantifies how much the variance of an estimated coefficient is inflated by correlation among the predictors in a model. A VIF value in the range of 5-10 indicates high correlation among the predictors, and a VIF value above 10 indicates multicollinearity that affects the estimation of the regression coefficients [24].
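As an illustration of the screening step, the VIF of each predictor can be obtained as the corresponding diagonal element of the inverse of the predictors' correlation matrix. The sketch below uses only the Python standard library and toy data; the function names and the label strings are assumptions for the example, and the cut-offs are the ones quoted in the text.

```python
from math import sqrt
from typing import List, Sequence


def pearson(a: Sequence[float], b: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long samples."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


def invert(matrix: List[List[float]]) -> List[List[float]]:
    """Gauss-Jordan inversion with partial pivoting."""
    n = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular matrix: perfect multicollinearity")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def vif(columns: Sequence[Sequence[float]]) -> List[float]:
    """VIF of each predictor: diagonal of the inverse correlation matrix."""
    k = len(columns)
    corr = [[1.0 if i == j else pearson(columns[i], columns[j])
             for j in range(k)] for i in range(k)]
    inverse = invert(corr)
    return [inverse[j][j] for j in range(k)]


def vif_flag(value: float) -> str:
    """Apply the cut-offs quoted in the text (5-10 high, above 10 multicollinear)."""
    if value > 10:
        return "multicollinearity"
    if value >= 5:
        return "high correlation"
    return "acceptable"


# Toy predictors with correlation 0.8, so VIF = 1 / (1 - 0.8**2).
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [2.0, 1.0, 4.0, 3.0, 5.0]
toy_vif = vif([x1, x2])
```

With two predictors the result reduces to the familiar 1 / (1 - r^2), which gives a quick check of the implementation.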
In the current study, the VIF values were in the range of 1.03-3.5, which indicates that multicollinearity among the explanatory variables was not a concern. The least significant variable was then removed step by step using backward elimination until a final model was reached. In the final model, 13 variables were statistically significant at the 95% confidence level, while 'daytime' was the only variable significant at the 90% confidence level. Afterwards, the likelihood ratio test was used to test the null hypothesis that the fixed-parameter model is statistically equivalent to the random-parameter model. The test statistic, as adopted from Washington et al. [25], is

$$\chi^2 = -2\left[LL_F(\beta) - LL_R(\beta)\right], \tag{9}$$

where $LL_F(\beta)$ is the log-likelihood at convergence of the fixed-parameter model and $LL_R(\beta)$ is the log-likelihood at convergence of the random-parameter model. The resulting chi-square statistic ($\chi^2 = 41.4$) with two degrees of freedom is significant at a confidence level above 99.99%, confirming that the random-parameter model is statistically superior to the fixed-parameter model. Other measures were also used to compare the performance of the two models: the Bayesian information criterion (BIC), the Akaike information criterion (AIC), and the pseudo-$R^2$, where lower values of AIC and BIC and a higher value of pseudo-$R^2$ indicate a better model fit. The AIC and BIC values of the random-parameter model were lower than those of the fixed-parameter model, and its pseudo-$R^2$ was higher (fixed model = 0.203, random model = 0.224) [26]. These results indicate that the random-parameter model describes the outcomes better than the fixed-parameter model. Given these findings, the subsequent analysis focuses on the random-parameter model results summarized in Table 2. Table 3 presents the marginal effects, i.e., the instantaneous rates of change in the probability of each injury severity category associated with the study variables.
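The model comparison can be reproduced in a few lines. The sketch below is illustrative only: the two log-likelihood values and the parameter counts are hypothetical (chosen so that the statistic equals the reported 41.4), while the sample size of 1045 and the two degrees of freedom come from the text. For exactly two degrees of freedom the chi-square survival function has the closed form exp(-x/2), so no statistics library is needed.

```python
from math import exp, log


def likelihood_ratio(ll_fixed: float, ll_random: float) -> float:
    """LR statistic: -2 * [LL(fixed model) - LL(random model)]."""
    return -2.0 * (ll_fixed - ll_random)


def chi2_sf_2df(x: float) -> float:
    """P(X > x) for a chi-square variable with exactly 2 degrees of freedom."""
    return exp(-x / 2.0)


def aic(ll: float, k: int) -> float:
    """Akaike information criterion for k estimated parameters."""
    return 2.0 * k - 2.0 * ll


def bic(ll: float, k: int, n: int) -> float:
    """Bayesian information criterion for k parameters and n observations."""
    return k * log(n) - 2.0 * ll


# Hypothetical log-likelihoods and parameter counts (not reported in the text).
LL_FIXED, K_FIXED = -700.0, 16
LL_RANDOM, K_RANDOM = -679.3, 18   # two extra standard deviations
N_OBS = 1045                       # number of crash reports, from the text

lr_stat = likelihood_ratio(LL_FIXED, LL_RANDOM)
p_value = chi2_sf_2df(lr_stat)
random_preferred = (aic(LL_RANDOM, K_RANDOM) < aic(LL_FIXED, K_FIXED)
                    and bic(LL_RANDOM, K_RANDOM, N_OBS) < bic(LL_FIXED, K_FIXED, N_OBS))
```

A statistic of 41.4 on two degrees of freedom corresponds to a p-value on the order of 1e-9, far below the 0.0001 implied by a 99.99% confidence level.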

Model Estimation Results.
The results of the model indicate that, under the normal distribution, 'Speeding' and 'Foggy weather' variables were found to be best modeled as random, having statistically significant standard deviations.
Table 2 illustrates the mean and standard deviation values of these parameters.
For the 'Speeding' parameter, the estimated normal distribution has a mean of 1.54 and a standard deviation of 1.69, which implies that about 18% of the distribution lies below zero. This can be interpreted as follows: for 82% of the vehicles speeding through work zones, speeding increases the likelihood of fatal and injury crashes, while for the remaining 18% of speeding vehicles involved in rear-end crashes, injury crashes are less likely. The 'Foggy weather' parameter has a mean of -0.98 and a standard deviation of 1.04. This indicates that for 82.7% of the rear-end crashes that occurred in foggy weather, fog decreases the likelihood of injury crashes, while for 17.3% of these crashes it increases the likelihood of fatal and injury crashes. The following paragraphs discuss the findings for each category of study variables.
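The shares quoted for the two random parameters follow directly from the normal distribution: the fraction of a N(mean, sd^2) coefficient that lies below zero is the normal CDF evaluated at zero. A minimal check, using the means and standard deviations reported above (the function name `share_below_zero` is an assumption for the example):

```python
from statistics import NormalDist


def share_below_zero(mean: float, sd: float) -> float:
    """Fraction of a N(mean, sd^2) random coefficient that is negative."""
    return NormalDist(mu=mean, sigma=sd).cdf(0.0)


speeding_negative = share_below_zero(1.54, 1.69)   # roughly 0.18
foggy_negative = share_below_zero(-0.98, 1.04)     # roughly 0.83
```

The complement of each value gives the share of observations for which the coefficient is positive, i.e., about 82% for speeding and about 17% for foggy weather.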
At-Fault Driver. The findings for the age groups show that young drivers have a higher probability of sustaining fatal injuries in rear-end crashes. A possible explanation is that young drivers tend to have less driving experience and a tendency to drive at high speed, which leads them to experience the most severe crashes [29]. With regard to gender, male drivers travelling through work zones tend to be more involved in fatal rear-end crashes than female drivers. This finding is reasonable given the tendency of male drivers to take more risks, exceed the speed limit, and drive more aggressively. A number of previous studies [16,30,31] are in agreement with these findings. However, some studies [32,33] have disputed this, claiming that similar circumstances tend to affect female drivers more than male drivers.
Highway Geometry. Rear-end crashes have been strongly associated with road geometry. The positive coefficient of curved sections implies that rear-end crashes occurring on curved sections tend to result in higher injury severity. This result is in line with previous research findings [34,35]. There is, however, considerable disagreement in the literature: some researchers have reported that horizontal curves were associated with lower injury severity in work zones than in nonwork zones [9,36], while the study by Katta found that horizontal curves have an insignificant impact on injury severity [37].
Crash Information. 'Weekend' and 'Nighttime' are found to be key factors related to rear-end crashes, as crashes occurring at these times are more severe across all work zones than those occurring on weekdays and during daytime. A possible explanation is that work zones are usually inactive during weekends, which encourages drivers to drive at high speed, especially at nighttime. As a result, the probability of a severe crash is higher. Lower visibility at night, together with the higher speeds made possible by lighter traffic, should be considered as factors that jointly increase fatality rates and injury severity on Egyptian highways.
On the other hand, drivers commuting daily on weekdays during daytime have a better visual impression and more time to recognize work zones and to react accordingly. These drivers tend to be more aware of the danger and are better prepared to slow down or take other measures to reduce the crash risk. These findings are consistent with a number of previous studies [31,[38][39][40][41]. However, Zhao and Garber found no major differences between daytime and nighttime crashes in work zones [42]. Aside from crash time, vehicle type is also crucial with regard to work zone rear-end crashes. The results show that heavy and passenger vehicles are positively associated with injury severity in rear-end crashes. The involvement of a passenger vehicle in a rear-end crash increases the severity of injuries for the driver and the occupants. The involvement of heavy vehicles is more likely to be fatal, since such crashes often develop into multiple-vehicle crashes. Because of the reduced braking capability of heavy vehicles, these crashes lead to severe driver injuries and multiple fatalities at work zones.
The impact of heavy and passenger vehicles on work zone crash severity is consistent with the findings of several earlier studies [43][44][45]. It can be explained by the fact that trucks in Egypt are numerous and often excessively loaded, since more than 96% of goods are transported by truck [2].
Environment Related. Foggy weather conditions are found to have a significant effect on work zone rear-end crashes. The result further indicates that crashes occurring in foggy weather are not as severe as those occurring in other weather conditions. A possible interpretation is that drivers behave less recklessly in adverse weather than in clear and dry weather; i.e., drivers are intrinsically more cautious while travelling in adverse weather conditions. However, some studies report the contrary finding that fog increases the rate of fatal crashes [31,36].
In addition to foggy weather, the summer season is also significantly associated with injury severity, increasing the likelihood of injuries. A possible explanation is that the number of vehicles on the road is higher in summer, which makes passenger vehicles more vulnerable to crashes on highways. A number of studies are in agreement with this finding [46,47].
Work Zone Information. Advance information provided to drivers ahead of work zones, such as the speed limit, the construction type, and the number of lane closures, is also found to have a significant relationship with the severity of work zone crashes. The coefficients of the respective variables are positive and statistically significant, which indicates that inadequate information about work zones increases the likelihood of fatal and injury rear-end crashes. In this context, Green and Senders identified information flow as the foremost cause of road traffic crashes: when a driver does not perceive a critical situation early enough, the perception-reaction time is inevitably delayed [48].
In addition, the positive coefficient of the variable 'lane closures' reveals that the closure of many lanes is associated with an increased rate of fatal crashes. A likely reason is forced merging of traffic, which may be attempted late due to poor visibility, thereby leading to increased severity. In particular, heavy vehicles can be difficult to maneuver in work zones if lane changes are started late. Consistent results were obtained in previous studies [38,44,45,49,50].
Another key element is the construction type; i.e., during asphalt surface construction, vehicle occupants are more likely to be involved in fatal rear-end work zone crashes. Speeding is known to be a strong contributory factor to crash frequency and severity. The analysis shows a clear trend that the risk of rear-end crashes increases when the speed limit is exceeded. This can be explained by the fact that high-speed driving combined with following other vehicles too closely tends to be the major contributory factor in work zone rear-end crashes. This finding is supported by a number of previous studies [38,39,44,51,52].
The estimated coefficients of the study variables thus indicate the impact of each variable on the injury severity of rear-end crashes. The 'speeding' factor affects the injury level of a work zone rear-end crash with the greatest coefficient ($\beta = 1.54$), whereas the 'heavy vehicle' factor has the lowest ($\beta = 0.288$).
Table 3 presents the marginal effects of each factor on each level of injury severity and provides additional information to confirm the previous findings. For drivers' gender, if male drivers are travelling across work zones, the probability of the 'No Injury' category decreases by 9.9%, while the probabilities of 'Injury' and 'Fatal Injury' increase by 6.3% and 29.3%, respectively. The results for road class are comparable; i.e., if rear-end crashes occur in rural work zones, the likelihood of no-injury crashes decreases by 12.1%, and the probabilities of injury and fatal crashes increase by 7.7% and 35.9%, respectively.
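For readers who want to see how marginal effects of this kind arise, the sketch below evaluates the standard ordered probit marginal effect of one covariate at a given value of the linear predictor. It is a simplified illustration, not the computation behind Table 3: the thresholds and the value of the linear predictor at the sample mean are hypothetical, only the coefficient 1.54 is taken from the text, and for indicator variables a discrete change in probability is often used instead of a derivative.

```python
from statistics import NormalDist
from typing import List, Sequence

STD_NORMAL = NormalDist()


def marginal_effects(beta_k: float, xb_mean: float,
                     thresholds: Sequence[float]) -> List[float]:
    """Marginal effect of one covariate on P(y = j) for every category j.

    effect_j = [phi(mu_{j-1} - xb) - phi(mu_j - xb)] * beta_k, with the density
    at the outer thresholds (minus and plus infinity) equal to zero.
    """
    densities = [0.0] + [STD_NORMAL.pdf(mu - xb_mean) for mu in thresholds] + [0.0]
    return [(densities[j] - densities[j + 1]) * beta_k
            for j in range(len(thresholds) + 1)]


# Hypothetical thresholds and linear predictor; beta = 1.54 is the speeding mean.
effects = marginal_effects(beta_k=1.54, xb_mean=0.5, thresholds=[0.0, 1.5])
```

With a positive coefficient the effect on the lowest category is negative and the effect on the highest category is positive. Because the category probabilities always sum to one, the effects computed this way sum to zero across categories, which serves as a sanity check.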

Conclusion
The current study presents an analysis of the injury severity of work zone rear-end crashes using a random-parameter ordered probit model. This approach allows for unobserved heterogeneity in the data, typically in factors related to drivers, vehicles, and weather conditions. Based on the Egyptian work zone crash database from 2010 to 2017, the modeling procedure indicates that the factors 'Speeding' and 'Foggy weather' are best modeled as random parameters.
The investigation of the factors affecting the severity of rear-end crashes reveals several important findings. Passenger or heavy vehicles driving at high speed through work zones at nighttime, in the presence of lane closures and unexpected maneuvers, have a greater chance of being involved in fatal rear-end crashes. In addition, young male drivers who travel at nighttime during weekends tend to suffer more fatal injuries. Vehicle occupants are also more likely to be involved in injury and fatal rear-end crashes in rural work zones and on horizontal curves. In terms of highway construction, injury severity was higher during asphalt surface construction than during milling surface construction.
The findings of this analysis can be expected to help transport agencies develop efficient measures to ensure safety at work zones, mitigate risk factors, and increase traffic safety on Egypt's highways.
Since most crashes can be attributed to human error, it is recommended to develop effective driver training programmes that offer regular skills training to prelicensed and postlicensed drivers, specifically instilling the attitude of being a responsible driver across work zones.
Furthermore, ITS technologies, such as variable speed limits (VSL) and dynamic message signs (DMS) placed at an appropriate distance ahead of the work zone, are an efficient means of providing drivers with updated information. Providing dynamic information can mitigate the risks of adverse design, human factors, and roadway conditions, and can balance the traffic volumes between lanes, thus reducing lane changing maneuvers.
Other more conventional traffic engineering solutions include increasing the upstream distance from the work zone at which illumination or fluorescent devices, cones, and barrels are placed, as well as enhancing visibility and reducing the speed limit. The use of flashing lights can also be effective in making drivers reduce their speed.
The outcome of the current study is limited by the available data, which may affect the results and their interpretation. For example, police reports do not include the detailed crash location within work zones (whether the incident occurred in the advance warning area, transition area, activity area, or termination area). A limitation of the model in this study is that some potentially crucial information (such as traffic volume and number of vehicles) has not been considered, since these data are missing from the database. It is therefore advisable that more detailed information be collected and entered into the database so that it can be used for further model calibration and a more detailed analysis of risk factors in different crash scenarios.

Table 1: Summary of descriptive statistics.

Table 2: Rear-end injury severity model results.

Table 3: Marginal effects associated with the random-parameters model.