Influence of Environmental Factors on Injury Severity Using Ordered Logit Regression Model in Limpopo Province, South Africa

Globally, road traffic accidents are a major cause of death and severe injuries. An estimated 1.5 million deaths occur on the world's roads each year, which makes road traffic injuries the eighth leading cause of death globally. Understanding the influence of environmental factors on deaths and severe injuries will help in policy-making and the development of strategies in Limpopo Province. We therefore aim to study the environmental factors that influence road deaths and severe injuries and to identify whether their impact varies across injury severity levels. The study was based on secondary data on road traffic accidents obtained from the Department of Roads and Transport in Limpopo Province. The data comprised 18 029 road traffic accidents for the period January 2009–December 2015. The study found that weekends (Saturdays and Sundays) had the highest number of accidents when compared to weekdays. The proportion of observations in each severity level was not constant across explanatory variables. Compared with the ordered logit regression (OLR) model, the generalized ordered logit regression (GOLR) model appeared to be a more effective predictive model for determining the influence of environmental factors on injury severity. The results of the GOLR model suggest that environmental factors such as slippery road conditions, rainy weather, and spring season lower the likelihood of severe crash occurrence. On the other hand, poor or defective road surfaces, the time interval from 6 a.m. to 11 p.m., and provincial roads are associated with a higher likelihood of severe crash occurrence. To decrease the severity of injuries in the province, provincial roadways must be maintained regularly.


Introduction
Globally, Road Traffic Accidents (RTAs) are a major cause of death and severe injuries. In its 2018 report, the World Health Organization (WHO) estimated the number of deaths on the world's roads at 1.5 million per annum and ranked road traffic injuries as the eighth leading cause of death globally [1]. It also stated that between 20 and 50 million more people suffer nonfatal injuries, with many incurring a disability as a result of their injuries. The social and economic costs of deaths and injuries due to RTAs are over US$100 billion annually [2]. RTAs are also among the leading causes of death among people under the age of 40 years [3].
Regionally, Africa, with only 2% of the world's vehicles, is the least motorized region of the world but contributes 16% of global traffic deaths, with Nigeria and South Africa contributing a significant proportion of these deaths [1]. Overall, young Africans aged between 15 and 29 years are most vulnerable to these RTAs [1]. The burden of Road Traffic Deaths (RTDs) is reported to be on the rise in developing countries such as South Africa despite increased efforts to address these traffic accidents. The 2016 report of the Road Traffic Management Corporation states that the number of RTDs increased by 10% between 2014 and 2015 [4]. This annual road carnage costs the South African economy approximately R43 billion, of which 60% is due to damage to vehicles and other property [5,6].
In 2015, 10 613 fatal accidents occurred in the country between January and December, resulting in 12 944 fatalities, a severity rate of 1.2 [7]. Previous studies in this field have identified various factors that are associated with RTAs and RTDs. These include, among others, sociodemographic characteristics such as the age and gender of the driver, speeding, not wearing a seat belt, vision deficiency that is corrected with eyeglasses, dark but lighted conditions (07:30 p.m. to 05:30 a.m.), rainy seasons, drivers under the influence of drugs or alcohol, drivers exhibiting aggressive driving behaviour, and negative road engineering factors as well as the quality of pavement [8–13].
To the best of our knowledge, information on the influence of environmental factors on death and severity of injuries on Limpopo Province roadways is still limited. It is in this context that the present study aims to contribute to the literature on environmental risk factors associated with deaths and severe injuries and to identify whether their impact varies across injury severity levels. In doing so, we use the ordered logit regression (OLR) model and consider some extensions, with the ultimate goal of producing a robust model for deaths and injury severity in the province. Research on the impact of environmental factors on death and severe injuries in Limpopo Province of South Africa is of particular relevance to government and public health in the development of strategies to reduce road deaths and severe injuries.

Literature Review
A previous study conducted in South Africa by Vogel and Bester classified risk factors influencing road traffic crashes as human, vehicle, and road and environment factors [14]. Human factors included negligence, excess speed, dangerous overtaking, pedestrians in the road, and inconsiderate driving behaviour. Vehicle factors were mostly faulty brakes and tyres. Road and environmental factors included rush-hour traffic and inadequate facilities for pedestrians. The study found that human factors accounted for 75% of the accidents, followed by environmental (15%) and vehicle factors (10%) [14].
Various studies have shown that environmental factors are significantly related to the risk of road fatalities. A study conducted by Jung et al. in southeastern Wisconsin assessed the effects of rainfall on the severity of single-vehicle accidents, taking into account weather-related factors such as estimated rainfall intensity for 15 minutes before accident occurrence, water film depth, temperature, wind speed/direction, stopping sight distance, and car-following distance at the time of the crash [15]. The study found that rainfall intensity, wind speed, and horizontal or vertical curves were all linked to an increased likelihood of accident severity in rainy weather. More accidents commonly occurred on roads with two-way traffic than on one-way roads [16]. Furthermore, the risk decreases with the width of the road. Moreover, studies have found a link between road accident frequency and risk factors such as road segment length, width, number of ramps and bridges, horizontal and vertical curves, and shoulder width [17].
A high proportion of road accidents and deaths occur during the night between 6 p.m. and 6 a.m. and during peak times between 12 p.m. and 6 p.m. [12]. Some accidents are caused by a lack of street lights, particularly during nighttime driving on undivided 2-lane or 2-way rural highways [18]. This could make it difficult to distinguish the lane separation, which might cause an accident. The probability of fatality is estimated to rise when dull lighting conditions are present [13]. Multivehicle accidents commonly occur in off-peak hours during the daytime [8].
Modeling of traffic accident injury severity is complex and has received more attention in recent years [19]. Several statistical models have been used to estimate the severity of traffic injuries. A study done in New Mexico from 2010 to 2011 used multinomial logit regression models to investigate the characteristics that differentiate teenage and adult drivers in intersection-related accidents [20]. However, it should be emphasized that the injury severity response variable is ordinal in nature (no injury, minor injury, severe injury, fatality), which renders the multinomial logit regression model unsuitable for analyzing injury severity. A recent study conducted in China, investigating statistical distribution characteristics such as types of environmental properties and road properties, deployed an ordered logit regression (OLR) model to account for the unobserved heterogeneity across observations and to cater for choices that have an inherent order [21]. OLR models are among the most common ordinal regression techniques; however, they often have serious shortcomings [22]. These approaches frequently violate the proportional odds/parallel lines assumption. Model misspecification can cause issues that are worse than the ones that these strategies were designed to address. One alternative when the proportional odds assumption is violated is to use generalized ordered logit regression (GOLR) models, which accommodate cases where the proportion of observations in each level of the response is not consistent across the levels of an explanatory variable [22].

Study Setting, Population, and Data Collection.
This study is based on secondary data on RTAs obtained from the Limpopo Provincial Department of Roads and Transport. Limpopo is the northernmost province of South Africa with an estimated population of 5 million. It comprises five districts, namely, Capricorn, Mopani, Sekhukhune, Vhembe, and Waterberg. The data comprised 18 029 RTAs over the period January 2009–December 2015. A single record for each accident was created along with a set of variables indicating severity, time of the accident, location, and cause, as shown in Table 1.

Statistical Analysis.
All data analyses were performed using SPSS version 26.0 (IBM SPSS Statistics; IBM, Chicago, USA) and R software. Baseline characteristics of the RTAs were expressed as frequencies and percentages, and the Mantel-Haenszel extension test was used to test for linear trends in the injury severity level.
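As an illustration of a test for linear trend in an ordered outcome, the sketch below computes the linear-by-linear association statistic M² = (n − 1)r², which is referred to a chi-square distribution with 1 degree of freedom. This is a minimal sketch and not the SPSS or R routine used in the study: the function name `linear_trend_test`, the integer scores, and the counts in `table` are hypothetical and chosen only for demonstration.

```python
import math

def linear_trend_test(table, row_scores, col_scores):
    """Linear-by-linear association test: M^2 = (n - 1) * r^2, chi-square with 1 df.

    table[i][j] is the count in row i (e.g. an exposure group) and column j
    (e.g. an ordered severity level); the scores give the assumed spacing.
    """
    n = sum(sum(row) for row in table)
    mean_u = sum(u * sum(row) for u, row in zip(row_scores, table)) / n
    mean_v = sum(v * sum(row[j] for row in table)
                 for j, v in enumerate(col_scores)) / n
    cov = var_u = var_v = 0.0
    for u, row in zip(row_scores, table):
        for v, count in zip(col_scores, row):
            cov += count * (u - mean_u) * (v - mean_v)
            var_u += count * (u - mean_u) ** 2
            var_v += count * (v - mean_v) ** 2
    r = cov / math.sqrt(var_u * var_v)       # Pearson correlation of the scores
    m2 = (n - 1) * r * r
    p_value = math.erfc(math.sqrt(m2 / 2))   # upper tail of chi-square, 1 df
    return m2, p_value

# Hypothetical counts (NOT the study data): rows are weekday / weekend,
# columns are property damage, minor injury, serious injury, death.
table = [
    [400, 120, 60, 20],
    [300, 130, 90, 40],
]
m2, p_value = linear_trend_test(table, row_scores=[0, 1], col_scores=[1, 2, 3, 4])
print(f"M2 = {m2:.2f}, P value = {p_value:.2g}")
```

Equally spaced scores are the usual default for ordered severity levels; other spacings change the statistic, so the choice of scores should be reported.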

Ordered Logit Regression Model (OLR).
The ordered logit regression model (OLR) was applied to determine the relationship and determinants of death and severe injuries. Suppose Y denotes the ordinal outcome with j categories and µ is the corresponding conditional mean. The odds of being less than or equal to a particular category j are given as
$$\mathrm{odds}(Y \le j) = \frac{P(Y \le j)}{P(Y > j)}.$$
Considering the ordered logit function
$$\mathrm{logit}[P(Y \le j)] = \ln\left(\frac{P(Y \le j)}{1 - P(Y \le j)}\right),$$
the OLR is defined as
$$\mathrm{logit}[P(Y \le j)] = \alpha_j + \sum_i \beta_i x_i,$$
where $\alpha_j$ is the intercept (threshold) of the model for severity level j, $\beta_i$ is the coefficient of the model, i indexes the explanatory variables $x_i$, and j is the severity level (property damage, minor injuries, serious injuries, and death). Each $\beta_i$ is assumed to be the same across all severity levels.
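To make the proportional odds structure concrete, the following minimal sketch converts thresholds and a single shared coefficient vector into probabilities for the four severity levels. It assumes the cumulative form logit P(Y ≤ j) = α_j + Σ β_i x_i; some software uses the opposite sign on the linear predictor. The function names and all numeric values are hypothetical, not estimates from the study.

```python
import math

def logistic(z):
    """Inverse of the logit function."""
    return 1.0 / (1.0 + math.exp(-z))

def olr_category_probs(alphas, beta, x):
    """Category probabilities under logit P(Y <= j | x) = alpha_j + sum_i beta_i * x_i.

    alphas: increasing thresholds, one per cumulative split (3 splits for 4 levels).
    beta:   one coefficient per explanatory variable, shared by every split.
    """
    eta = sum(b * xi for b, xi in zip(beta, x))
    cumulative = [logistic(a + eta) for a in alphas] + [1.0]
    # P(Y = j) is the difference between consecutive cumulative probabilities.
    return [cumulative[0]] + [cumulative[j] - cumulative[j - 1]
                              for j in range(1, len(cumulative))]

# Hypothetical values for illustration only.
alphas = [0.8, 1.9, 3.1]   # property damage | minor | serious | death
beta = [-0.5]              # one binary factor, e.g. defective road surface
levels = ["property damage", "minor", "serious", "death"]

baseline = olr_category_probs(alphas, beta, x=[0.0])
exposed = olr_category_probs(alphas, beta, x=[1.0])
for name, p0, p1 in zip(levels, baseline, exposed):
    print(f"{name:16s} baseline={p0:.3f} exposed={p1:.3f}")
```

Because the same coefficient shifts every cumulative split, under this sign convention a negative coefficient moves probability mass toward the more severe levels at all splits simultaneously.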

Generalized Ordered Logit Regression (GOLR).
The generalized ordered logit regression model (GOLR) was applied to determine the relationship and determinants of death and severe injuries. Suppose Y denotes the ordinal outcome with j categories and µ is the corresponding conditional mean. The odds of being less than or equal to a particular category j are given as
$$\mathrm{odds}(Y \le j) = \frac{P(Y \le j)}{P(Y > j)}.$$
Considering the ordered logit function
$$\mathrm{logit}[P(Y \le j)] = \ln\left(\frac{P(Y \le j)}{1 - P(Y \le j)}\right),$$
the GOLR model is defined as
$$\mathrm{logit}[P(Y \le j)] = \alpha_j + \sum_i \beta_{ij} x_i,$$
where $\alpha_j$ is the intercept (threshold) of the model for severity level j, $\beta_{ij}$ is the coefficient of the model, i indexes the explanatory variables $x_i$, and j is the severity level (property damage, minor injuries, serious injuries, and deaths). Here $\beta_{ij}$ is allowed to differ across severity levels.
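The sketch below illustrates how relaxing the parallel lines constraint changes the calculation: each cumulative split gets its own coefficient vector, so the odds ratio of a factor can differ from split to split. It assumes the form logit P(Y ≤ j) = α_j + Σ β_ij x_i; the function names and numbers are hypothetical, and the guard against crossing cumulative curves is an addition for safety rather than part of any particular software routine.

```python
import math

def logistic(z):
    """Inverse of the logit function."""
    return 1.0 / (1.0 + math.exp(-z))

def golr_category_probs(alphas, betas, x):
    """Category probabilities when every cumulative split j has its own coefficients.

    alphas[j] is the threshold and betas[j] the coefficient vector for split j,
    i.e. logit P(Y <= j | x) = alphas[j] + sum_i betas[j][i] * x[i].
    """
    cumulative = []
    for alpha_j, beta_j in zip(alphas, betas):
        eta = alpha_j + sum(b * xi for b, xi in zip(beta_j, x))
        cumulative.append(logistic(eta))
    cumulative.append(1.0)
    probs = [cumulative[0]] + [cumulative[j] - cumulative[j - 1]
                               for j in range(1, len(cumulative))]
    # Unlike the proportional odds model, split-specific slopes can make the
    # cumulative curves cross, which would imply a negative probability.
    if min(probs) < 0:
        raise ValueError("cumulative probabilities cross for this x")
    return probs

# Hypothetical values for illustration only.
alphas = [0.8, 1.9, 3.1]
betas = [[0.40], [0.10], [-0.30]]   # one factor, a different slope per split
probs = golr_category_probs(alphas, betas, x=[1.0])
split_odds_ratios = [math.exp(beta_j[0]) for beta_j in betas]
print([round(p, 3) for p in probs])
print([round(odds_ratio, 2) for odds_ratio in split_odds_ratios])
```

Setting all rows of `betas` equal recovers the proportional odds model, which is why the GOLR model can be viewed as a direct extension of the OLR model.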

K-Means Clustering.
K-means clustering is a popular method for cluster analysis in data mining. It partitions n observations into K clusters in which each observation belongs to the cluster with the nearest mean. Given a set of observations (x_1, x_2, ..., x_n), where each observation is a d-dimensional real vector, K-means clustering aims to partition the observations into K (≤ n) sets so as to minimize the within-cluster sum of squares.
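The partitioning step described above can be sketched with Lloyd's algorithm applied to one-dimensional data such as the hour of day. This is a minimal illustration under stated assumptions: the crash hours, the starting centres, and the function name `kmeans_1d` are hypothetical, and the study may have used different features, initialisation, or software.

```python
def kmeans_1d(values, centers, max_iter=100):
    """Lloyd's algorithm on one-dimensional data.

    Alternates between assigning each value to its nearest centre and moving
    each centre to the mean of its members, until the centres stop changing.
    """
    centers = list(centers)
    labels = []
    for _ in range(max_iter):
        labels = [min(range(len(centers)), key=lambda k: (v - centers[k]) ** 2)
                  for v in values]
        new_centers = []
        for k in range(len(centers)):
            members = [v for v, lab in zip(values, labels) if lab == k]
            # Keep the old centre if a cluster happens to be empty.
            new_centers.append(sum(members) / len(members) if members else centers[k])
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers

def within_cluster_ss(values, labels, centers):
    """Within-cluster sum of squares, the quantity K-means tries to minimize."""
    return sum((v - centers[lab]) ** 2 for v, lab in zip(values, labels))

# Hypothetical crash hours (0-23), NOT the study data, sorted for readability.
hours = [0, 1, 2, 3, 5, 6, 7, 8, 9, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 23]
labels, centers = kmeans_1d(hours, centers=[2, 10, 18])
for k, center in enumerate(centers):
    members = [h for h, lab in zip(hours, labels) if lab == k]
    print(f"cluster {k}: hours {min(members)}-{max(members)}, centre {center:.1f}")
print("within-cluster SS:", round(within_cluster_ss(hours, labels, centers), 2))
```

In one dimension each cluster is a contiguous interval, which is what makes the result usable as time-of-day bands. The hour is treated here as a linear rather than circular variable, so 23:00 and 00:00 are considered far apart.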

Brant Test.
The Brant test, developed by Rollin Brant in 1990, was used in the study to assess whether the observed deviations from the OLR model are larger than what could be attributed to chance alone [23]. The test considers a binary logit model at each cumulative split of the outcome, defined by $z_j = 1$ if $Y > j$ and $z_j = 0$ otherwise, with success probability $\pi_j = P(Y > j)$.

Wald-Type Goodness-of-Fit Statistic.
The equality of the split-specific coefficient vectors $\beta_j$ is assessed with the Wald-type statistic
$$\chi^2 = (D\hat{\beta})^{T} [D \hat{V}(\hat{\beta}) D^{T}]^{-1} (D\hat{\beta}),$$
where $\hat{\beta}$ stacks the estimated $\beta_j$, $\hat{V}(\hat{\beta})$ is its estimated covariance matrix, and $D$ is the contrast matrix that forms the differences between them. If $\chi^2$ is found to be significant, individual differences $\beta_j - \beta_l$ may be examined relative to their approximate standard errors to elucidate the nature of the lack of fit.

Results
The graph shows that weekends (Saturdays and Sundays) had the largest number of RTAs when compared to weekdays. Crashes caused by an animal in the roadway accounted for 82% of property damage, 83% of minor injuries, 79% of serious injuries, and 65% of deaths (shown in Table 2). The majority of crashes at every severity level occurred between 2:00 p.m. and 11:00 p.m.: 75.63% of property damage, 74.60% of minor injuries, 73.42% of serious injuries, and 55.41% of fatalities fell within those hours (shown in Table 2). The majority of traffic fatalities and injuries in this province happened on regional roads in Capricorn District. There was a significant linear trend observed for all the variables.
Table 3 shows the results of the ordered logit regression for associations between severity and related environmental factors in Limpopo Province, South Africa. Using a significance level of 0.05, the model findings show associations of environmental conditions, hour interval, road type, and region with the severity of injuries. For crashes on roads with potholes, the odds of a more severe outcome were 2.59 times those for crashes involving an animal in the roadway, holding all other variables constant. Crashes between 2:00 p.m. and 11:00 p.m. were less severe than crashes between 12:00 a.m. and 5:00 a.m. (odds ratio 0.67), and crashes on provincial roads were less severe than crashes on district roads (odds ratio 0.30), holding all other variables constant. Crashes in Mopani, Sekhukhune, Vhembe, and Waterberg districts were more likely to be severe than crashes in Capricorn district.
The OLR model is based on the assumption that each independent variable has the same influence at each cumulative split of the ordered dependent variable. To check the model's adequacy and the proportional odds assumption, the Hosmer-Lemeshow goodness-of-fit and Brant tests were performed. The goodness-of-fit test revealed that the model was well-fitting (likelihood ratio statistic = 5.88, degrees of freedom = 9, P value = 0.7514). The Brant test showed that the data did not meet the parallel lines assumption (chi-square = 74.04, degrees of freedom = 38, P value = 0.0004), indicating that the OLR model was not appropriate for the data (shown in Table 4). As a result, to relax the proportionality constraint, we fitted the GOLR model. Table 5 shows the results of the GOLR model for associations between deaths and severe injuries and related environmental factors in Limpopo Province, South Africa. Using a significance level of 0.05, the model findings show associations of environmental conditions, hour interval, season, road type, and region with severity. Crashes on slippery roads were less likely to result in death and severe injuries than crashes involving animals on the road. The odds ratios for crashes due to a stationary or parked vehicle on the road were 0.31 and 0.41, suggesting that such crashes are less likely to result in minor injuries and death, respectively, than crashes due to animals on the road. Vehicle crashes on poor or defective road surfaces were more likely to result in death and serious injuries. Crashes in rainy weather were less likely to result in death and serious injuries than crashes due to animals on the road.
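The P values quoted for the goodness-of-fit and Brant tests follow directly from the reported statistics and degrees of freedom. The sketch below recomputes them with a closed-form upper-tail probability of the chi-square distribution for integer degrees of freedom, using only the Python standard library. The helper name `chi2_sf` is hypothetical; the three pairs of numbers are the ones reported in the text.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable, integer df, x > 0."""
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{k=0}^{df/2 - 1} (x/2)^k / k!
        total = sum(math.exp(k * math.log(half) - math.lgamma(k + 1))
                    for k in range(df // 2))
        return math.exp(-half) * total
    # Odd df: erfc(sqrt(x/2)) + exp(-x/2) * sum_{k=1}^{(df-1)/2} (x/2)^(k-1/2) / Gamma(k+1/2)
    total = sum(math.exp((k - 0.5) * math.log(half) - math.lgamma(k + 0.5))
                for k in range(1, (df - 1) // 2 + 1))
    return math.erfc(math.sqrt(half)) + math.exp(-half) * total

p_fit = chi2_sf(5.88, 9)       # goodness-of-fit statistic reported in the text
p_brant = chi2_sf(74.04, 38)   # Brant test statistic reported in the text
print(f"goodness of fit: P = {p_fit:.4f}")
print(f"Brant test:      P = {p_brant:.4f}")
```

A large P value for the goodness-of-fit test and a small one for the Brant test are compatible: the first concerns overall calibration, while the second concerns whether the coefficients are equal across cumulative splits.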
K-means clustering was used to split the time of day into k groups. Using the unsupervised k-means algorithm, we grouped the time of day into 3 groups (00:00-05:00, 06:00-13:00, and 14:00-23:00). The results showed that the odds ratios for a vehicle crash during 06-13 and 14-23 hours were 1.52 and 2.78, respectively, suggesting that crashes in these intervals are more likely to result in deaths and severe injuries than crashes during 00-05 hours. Vehicle crashes during spring were less likely to result in deaths and severe injuries than crashes during autumn. Vehicle crashes during winter were less likely to result in deaths against nondeaths.
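An odds ratio does not translate into a fixed change in probability; the change depends on the baseline risk. The sketch below applies the two odds ratios reported above (1.52 and 2.78) to a baseline probability of a severe outcome during 00-05 hours. The baseline value of 0.10 and the function name `apply_odds_ratio` are hypothetical and serve only to illustrate the arithmetic.

```python
def apply_odds_ratio(p_baseline, odds_ratio):
    """Probability implied by multiplying the baseline odds by an odds ratio."""
    odds_baseline = p_baseline / (1.0 - p_baseline)
    odds_new = odds_baseline * odds_ratio
    return odds_new / (1.0 + odds_new)

# Hypothetical baseline probability of a severe outcome during 00-05 hours.
p_night = 0.10
p_morning = apply_odds_ratio(p_night, 1.52)    # 06-13 hours, odds ratio from the text
p_afternoon = apply_odds_ratio(p_night, 2.78)  # 14-23 hours, odds ratio from the text
print(f"00-05: {p_night:.3f}  06-13: {p_morning:.3f}  14-23: {p_afternoon:.3f}")
```

With a small baseline probability the odds ratio is close to the risk ratio; as the baseline grows, the same odds ratio corresponds to a smaller relative change in probability.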
Vehicle crashes on national roads are less likely to result in deaths and serious injuries than those on district roads. Provincial roads play an important role in distinguishing crashes that resulted in injury or death (minor injuries, serious injuries, and death) from crashes that resulted only in property damage, but they do not play a significant role in distinguishing crashes that resulted in property damage and injuries from crashes that resulted in deaths. Crashes on regional roads were less likely to result in deaths and serious injuries than those on district roads. Vehicle crashes in Mopani and Waterberg districts were less likely to result in deaths and injuries than those in Capricorn district. Vehicle crashes in Waterberg district were also less likely to result in deaths against nondeaths as compared to Capricorn district.

Discussion
A study by Malin et al. found an increase in accident risks for poor road weather conditions such as heavy rain and slippery road conditions [15,24,25]. This might be because driving in rainy weather impairs the driver's sight and the vehicle's traction, which increases the risk of an accident [15,25,26]. This is similar to the findings of the current study, which found that slippery road and rainy weather conditions were associated with deaths and severe injuries. Driving in rainy weather makes it difficult for drivers to maintain control of their vehicles since the road becomes more slippery [23]. Even though the majority of accidents occur due to bad weather [24–26], this study found that rain, slippery road conditions, and a stationary or parked vehicle significantly reduce road injury severity. Previous research came to similar conclusions, indicating that rain, snowy or slippery roads, and congested roadways decreased severity [27,28]. This might be due to the fact that rainy weather reduces daily traffic volume and driving speed [29]. Our study findings also revealed that a poor or defective road surface increases the risk of death and serious injuries. Earlier research similarly found that poor pavement conditions were linked to proportionally more severe injuries, although very poor pavement conditions were associated with fewer severe crashes [30,31].
Our study also revealed that the hours of 6 a.m. to 1 p.m. and 2 p.m. to 11 p.m. significantly increased the likelihood of serious injuries or death when compared to the hours of 12 a.m. to 5 a.m., which may be attributed to the fact that the majority of individuals are travelling to and from workplaces and schools. According to a study by Meng (2017) estimating crash severity on mountainous freeways in Chongqing, accidents occurring between 19:00 and 24:00 were found to be more severe because of the driver's visual response, psychological load, and the road environment. That study also found that the degree of mutual adaptation is more unfavourable to the driver [32]. It was also stated that the likelihood of major traffic accidents is higher in summer and autumn than in spring and winter. Consistent with those findings, in the present study the likelihood of a more severe outcome against property damage decreased in spring compared to autumn, and the chance of death against nondeath decreased in winter compared to autumn. Summer was not found to be significantly associated with the severity of injuries.
It was reported in previous studies that minor and serious accidents are more frequent in urban areas, whereas fatal accidents are more likely in rural areas [33,34]. This study found that national and provincial roads significantly reduce injury severity as compared to district roads. National roads are routes connecting major cities; provincial roads connect smaller cities and towns to the national route network; regional roads connect smaller towns to the route network; and district roads are mostly found in rural areas, where they connect market centres to provincial roads. Furthermore, this study revealed that crashes in Sekhukhune and Waterberg districts were less severe than those in Capricorn district. This might be due to the fact that Capricorn is a more developed district and the capital city is situated in this region. Mopani and Vhembe were found not to be significantly associated with deaths and severe injuries.
The findings further showed that most road accidents in the province occurred during the weekend. This might be due to the increased congestion and traffic volume as a result of weekend travel. Contrary to this finding, Sangkharat et al. found a smaller number of road accidents on weekends compared to weekdays [35]. The disparity in the results might be attributed to differences in weekend activities and traffic flow between the two countries (Thailand and South Africa).
There was a significant linear trend for all the variables, suggesting that road accidents due to environmental factors vary according to injury severity. This is similar to findings in previous studies in which it was observed that rain, snowy or slippery roads, and busy highways reduce the severity [27,28,36].
Despite several strengths of the study, including the large sample size, some limitations should be noted. The study used a secondary dataset collected and recorded by the Limpopo Provincial Department of Roads and Transport. Some important variables were not available in the dataset, such as demographics (age, sex, race, and ethnicity) and road characteristics such as road segment length, width, and number of ramps and bridges; if the omitted variables are significant covariates for injury severity, this might result in residual confounding. Another limitation is that the study considered crashes from only one province.

Conclusions
The main purpose of this study was to determine the influence of environmental factors on injury severity and to identify whether their impact varies across injury severity levels. The study found that the underlying assumption of the OLR model was violated, in that the relationship between each pair of outcome groups (property damage, minor, serious, and death) is not the same. The GOLR model appeared to be an effective predictive model that can be adapted to determine the influence of environmental factors on injury severity. The results of the present study suggest that environmental factors such as slippery road conditions, rainy weather, and spring season lower the likelihood of severe crash occurrence. On the other hand, poor or defective road surfaces, the time interval from 6 a.m. to 11 p.m., and provincial roads are associated with a higher likelihood of severe crash occurrence. Finally, it may be concluded that frequent road maintenance on provincial roads is required.

Data Availability
The data supporting the findings of the article are available from the corresponding author upon reasonable request.

Disclosure
This paper was taken out of a master's thesis written by the author [36].

Conflicts of Interest
The authors declare that they have no conflicts of interest.