The Role of Journey Purpose in Road Traffic Injuries: A Bayesian Network Approach

Introduction. Road traffic injuries are now regarded as the eighth leading cause of death globally. For example, in 2016, 102,362 traffic injuries took place in Spain, in which 174,679 drivers suffered injuries. These findings motivated the current study, which focuses on the prime factors that cause this type of injury. The aim of this study, therefore, is to explore the behavioral factors that entail a higher risk of suffering either a serious or a fatal injury for drivers. Methods. The findings are based on information and data provided by the “Dirección General de Tráfico” (DGT) in Spain on traffic injuries that occurred in the year 2016. After reviewing a wide range of the literature, the authors identified the most influential variables and created a model using Bayesian networks. The variables that define the model are grouped into four factors: vehicle factor, road factor, circumstantial factor, and human factor. Results. The results suggest that the principal variables that determine a higher probability of serious or fatal injuries in traffic injuries are: failure to use appropriate safety accessories, high-speed violations, distractions, and errors. Finally, the research shows the severity probability based on the reason for displacement (“in itinere,” on business, or in leisure).


Introduction
Road traffic injuries are one of the main causes of death in the world [1,2]. Every year 1.24 million people die on the world's roads and between 20 and 50 million people are injured, making road traffic accidents the eighth cause of death globally [3].
Studies argue that around 10% of road traffic injuries take place when the driver is traveling in the course of work, while a further 18% take place while a driver is traveling to or from work, i.e., commuting [4]. A bus driver injured while driving for work would fall into the first category. Across Europe, it is estimated that 40% of traffic injuries happen during commute or business journeys [5]. In Spain, in 2016 there were 566,235 injuries associated with traveling to/from work, of which 64,737 cases were road traffic injuries occurring during business or work travel, accounting for more than 11.4% of the total [6].
In this study the journey purpose is classified into three groups: "in itinere" refers to commute journeys, that is, travel from home to work and vice versa; on business refers to when a driver travels for work-related purposes; and in leisure refers to when a driver travels for pleasure. The variables that influence the occurrence of a traffic injury can be divided into four groups: demographic factor, human factor, vehicle factor, and circumstantial factor. Nevertheless, the focus of the current research is mainly on the human factor, which has been considered the main cause of traffic injuries, as highlighted by Mazankova [7].
Sabey and Taylor [8] suggest that the behavior that the driver adopts behind the steering wheel has become an important factor among the principal causes of traffic injuries. For that reason, several theories have been developed to explain possible risk behaviors behind the steering wheel. One of these theories is called "the zero-risk theory", which discusses the existence of a risk threshold above which the danger is not perceived [9]. This theory considers that reasons and emotions play an important role in driver behavior.
If we extrapolate "the zero-risk theory", as suggested by Salminen and Lähdeniemi [10], to the traffic that occurs during the workday, some of these reasons could be time pressure, work pressure, excessive workload, and tiredness.
Time saving is argued to be a main reason by Summala [9]; it can trigger an increase in speed (assuming higher risk) to meet the objective (arriving early). Tiredness behind the wheel is one of the risk factors highlighted by Bener, Yildirim [11]. Driving for long periods without rest phases makes driving a monotonous task, reducing the driver's ability to drive safely to dangerous limits [12]. Kim and Chung [13] explain the role of job satisfaction in relation to the number of traffic accidents, and Wishart, Somoray [14] suggest that strategies should be developed to encourage a positive work driving safety climate. Finally, issues associated with family or work conflicts, which in the majority of cases result in biological imbalances, also often trigger a reduction in resting hours as well as drowsiness, and subsequently add to the risk factors [10,15-19].
Another relevant variable presented in different campaigns of the Dirección General de Tráfico is the failure to use appropriate safety accessories. Many authors consider not using a helmet or seat belt to be the main risk factor in work-related injuries [20,21].
Several authors have taken into account gender, age, labor sector, and economic remuneration in order to identify which population groups are most prone to suffering a traffic injury [15,22].
Regarding gender, studies argue that men are more involved in injuries than women. The main reason for this conclusion is that the sectors with the highest frequency indices are the transport and distribution sectors, which are generally run by men. However, these studies highlight that women suffer more work-related traffic injuries during their displacement than men [23].
Studies on the age factor show that young drivers overestimate their driving abilities and perform risky maneuvers [25]. The aging process involves the deterioration of the biological and psychological systems, and it is considered to start at around 45-50 years of age. From the point of view of driving, this loss mainly affects the sense of sight, slows the speed of perception and response to stimuli, and reduces muscle strength [26,27].
In this regard, we can conclude that speed is one of the most influential driver behaviors causing fatal injuries [28,29]. Small increases in speed greatly increase the risk of an injury and its severity [30]. An increased speed means greater kinetic energy; therefore, in the case of an impact, this energy is absorbed by the vehicle, its passengers, and the element with which it collides, increasing the number and the severity of injuries. A driver traveling at high speed lengthens the reaction distance, defined as the distance traveled by the vehicle before the driver reacts to a danger. The pressure of arriving at work on time can lead to reckless and careless behavior, such as driving at high speed, which could make drivers more injury-prone on the roads [15,21,31,32].
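The two physical relations invoked in this paragraph can be illustrated with a short sketch. It assumes the standard expressions for kinetic energy (one half of mass times speed squared) and for reaction distance (speed times reaction time); the vehicle mass of 1500 kg and the reaction time of 1 s are hypothetical values chosen only for illustration and do not come from the study.

```python
def kinetic_energy_joules(mass_kg, speed_kmh):
    """Kinetic energy E = 0.5 * m * v^2, with the speed converted to m/s."""
    speed_ms = speed_kmh / 3.6
    return 0.5 * mass_kg * speed_ms ** 2


def reaction_distance_m(speed_kmh, reaction_time_s=1.0):
    """Distance covered before the driver reacts (reaction time is an assumed value)."""
    return speed_kmh / 3.6 * reaction_time_s


# Hypothetical 1500 kg vehicle: energy ratio between 120 km/h and 100 km/h.
energy_ratio = kinetic_energy_joules(1500, 120) / kinetic_energy_joules(1500, 100)

# Reaction distance at 90 km/h with the assumed 1 s reaction time.
distance_at_90 = reaction_distance_m(90)
```

Under these assumptions, a 20% increase in speed raises the kinetic energy by 44%, since energy grows with the square of the speed, while the reaction distance grows linearly with speed.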
To conclude, the key aim of this study is to establish a probabilistic model based on Bayesian networks. The analysis was conducted in order to predict the risk of suffering an injury as a function of the reason for displacement: "in itinere", on business, or in leisure trips and others. The model focuses on four groups of factors: demographic factors, vehicle factors, circumstantial factors, and human factors. Thus, the model determines those driver behaviors that entail a greater risk of suffering an injury. Research directly focusing on a systematic relationship between the journey purpose and the harmfulness of drivers, while taking into account these four groups of factors in road traffic injuries in a Spanish context, remains limited. The justification for conducting this research was therefore to address this gap and to add to the existing knowledge and literature on the topic.

Data Base Acquisition.
The database used to develop this study was provided by the Dirección General de Tráfico (DGT), the institution in charge of registering traffic injuries in Spain.
In Spain, when a traffic accident occurs, the agents of the authority in charge of the surveillance and control of traffic, within the scope of their respective competences, send the information related to the accident to the National Registry of Victims of Traffic Accidents. This information concerns traffic accidents with victims and is collected through the form included in the annex of the official document BOE-A-2014-12411 [33]. The microdata set used in this study has three tables: general table, vehicles table, and drivers table, which gather information about the traffic injuries that happened in 2016. In that year, 102,362 injuries took place in which 172,971 drivers were involved [34]. This research focuses specifically on those drivers whose harmfulness is known and whose type of displacement is also known. The degree of severity for such drivers has been defined as: fatal (FI), seriously injured (SI), lightly injured (LI), and unhurt (U). These drivers are registered by traffic police as drivers who were traveling from home to work or vice versa, as drivers who were driving for work purposes or whose job was driving, or as drivers traveling for leisure and pleasure purposes. Taking the harmfulness of the driver and the cause of displacement into account, the final dataset includes a total of 66,253 drivers.
To this end, the sampling technique employed in this study is a systematic sampling method. The authors have excluded the data for traffic accidents in 2016 in which the purpose of the journey and the driver harmfulness were not reported by the DGT. Using the data collected by the DGT and employing a Bayesian network, the current study focuses on four relevant groups of variables and discusses results that highlight the relationship between drivers' behaviors in road traffic injuries and the level of drivers' harmfulness.
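The exclusion rule described above can be sketched as a simple filter that keeps only drivers whose severity and trip purpose are both known. The record layout and field names below are hypothetical and are not the DGT microdata schema; the four toy records are invented for illustration.

```python
# Hypothetical driver records; field names are illustrative, not the DGT schema.
drivers = [
    {"id": 1, "severity": "SI", "trip": "leisure"},
    {"id": 2, "severity": None, "trip": "business"},   # severity not reported
    {"id": 3, "severity": "U", "trip": None},          # trip purpose not reported
    {"id": 4, "severity": "LI", "trip": "in_itinere"},
]

# Severity codes as defined in the text: fatal, seriously injured, lightly injured, unhurt.
KNOWN_SEVERITY = {"FI", "SI", "LI", "U"}
KNOWN_TRIP = {"in_itinere", "business", "leisure"}


def keep_driver(record):
    """Keep a driver only if both severity and trip purpose were reported."""
    return record["severity"] in KNOWN_SEVERITY and record["trip"] in KNOWN_TRIP


final_dataset = [record for record in drivers if keep_driver(record)]
```

Applied to the real microdata, a filter of this kind would reduce the drivers table to the records used in the analysis.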

Study Variables.
The variables that contribute to the occurrence of a traffic injury and result in driver harmfulness can be assembled into four groups: demographic factors, vehicle factors, circumstantial factors, and human factors.
Each of these factors in turn includes a series of variables, with their corresponding states.
(i) Demographic factors: combination of the gender and the age of the driver.
(ii) Vehicle factors: type of vehicle.
(iii) Circumstantial factors: type of trip or reason for displacement, type of road or zone, and distance or kilometers of travel.
(iv) Human factors: the behavioral factors, or factors modifiable by the driver. These could include wearing a seat belt, wearing a helmet, speed violations, and distractions and errors made by the driver.
(v) Study variable: driver harmfulness, which represents driver injury severity.
This study focuses mainly on the human factor, which is considered the principal cause of traffic injuries (between 70% and 90%) [35].

Bayesian Network.
In order to characterize the dependences between the different factors and the target variable, probabilistic graphical models (PGMs) have been considered. Several studies have previously employed Bayesian networks in their analysis of traffic accidents to express certain relationships between the different factors [36][37][38][39]. These models are based on a graph in which each node represents a variable or factor and each link between variables represents a dependence between them. These dependences/independences let us factorize the joint probability distribution (JPD), which is the second element of these models, dramatically reducing the number of parameters of our model and, as a result, simplifying the learning and inference processes. In addition, the graph obtained is a visual and easily interpretable tool to illustrate the factors affecting our target variable. In particular, in our study, we have considered discrete Bayesian networks [40], in which the graph of the model is a directed acyclic graph (DAG). The direction of the links introduces two additional concepts for the nodes of our model, parents and children, depending on whether the arrow departs from or points to the node, respectively. As a result, the JPD can be expressed mathematically as

$$P(x_1, \ldots, x_n) = \prod_{i=1}^{n} P(x_i \mid \pi_i), \qquad (1)$$

where $\pi_i$ corresponds to the parents of $x_i$, the BN being the model defined by both the DAG and the corresponding JPD in Equation (1). Once the Bayesian network has been defined, the probability of any node or set of nodes given any information on the state of the other variables (evidence) can be efficiently obtained by using both the factorization and the DAG (inference), letting us analyze the impact of each of the variables on the injury severity grade suffered by the driver.
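As a minimal sketch of the factorization in Equation (1) and of inference given evidence, the following example builds a toy three-node network and answers a query by brute-force enumeration. The graph structure (trip, speeding, severity), the state names, and every probability value are hypothetical illustration values, not the network or the results of the study; enumeration is used here only because it is easy to read, and it is not claimed to be the inference algorithm the authors applied.

```python
from itertools import product

# Hypothetical DAG: trip -> speeding, trip -> severity, speeding -> severity.
states = {
    "trip": ["business", "leisure"],
    "speeding": ["no", "yes"],
    "severity": ["light", "KSI"],
}
parents = {"trip": [], "speeding": ["trip"], "severity": ["trip", "speeding"]}

# Conditional probability tables (made-up numbers): cpt[node][parent values][state].
cpt = {
    "trip": {(): {"business": 0.4, "leisure": 0.6}},
    "speeding": {
        ("business",): {"no": 0.9, "yes": 0.1},
        ("leisure",): {"no": 0.8, "yes": 0.2},
    },
    "severity": {
        ("business", "no"): {"light": 0.96, "KSI": 0.04},
        ("business", "yes"): {"light": 0.85, "KSI": 0.15},
        ("leisure", "no"): {"light": 0.94, "KSI": 0.06},
        ("leisure", "yes"): {"light": 0.80, "KSI": 0.20},
    },
}


def joint(assignment):
    """Equation (1): product over all nodes of P(x_i | parents of x_i)."""
    probability = 1.0
    for node in states:
        key = tuple(assignment[parent] for parent in parents[node])
        probability *= cpt[node][key][assignment[node]]
    return probability


def query(target, value, evidence):
    """P(target = value | evidence), summing the joint over unobserved nodes."""
    names = list(states)
    numerator = denominator = 0.0
    for combo in product(*(states[name] for name in names)):
        assignment = dict(zip(names, combo))
        if any(assignment[k] != v for k, v in evidence.items()):
            continue  # skip assignments that contradict the evidence
        p = joint(assignment)
        denominator += p
        if assignment[target] == value:
            numerator += p
    return numerator / denominator


p_ksi_leisure = query("severity", "KSI", {"trip": "leisure"})
p_ksi_business = query("severity", "KSI", {"trip": "business"})
```

With these made-up tables the query returns 0.088 for leisure trips and 0.051 for business trips, which shows how evidence on one node changes the probability of the target node.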
As an example, we could have some evidence about the motive of the displacement, the age, and the gender, with which we can determine the probability of a serious injury in the accident by means of the expression

$$P(\text{injury} = \text{serious} \mid \text{motive}, \text{age}, \text{gender}).$$

Moreover, from the definition of the Bayesian network, a natural classifier for the injury severity can be obtained by defining a threshold for the probability above/below which serious/no injury is assigned. To evaluate this classifier, the receiver operating characteristic (ROC) curve was considered. This technique, introduced into clinical research by two radiologists, represents the true positive rate (sensitivity) against the false positive rate (1 − specificity) [41]. The area under the curve (AUC) allows us to evaluate the model. This area can take values between 0 (perfect predictor of the contrary state) and 1 (perfect predictor), with the value 0.5 corresponding to a random prediction (unreliable model).
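The threshold classifier and the AUC described above can be sketched as follows. The AUC is computed here through its equivalent pairwise definition (the fraction of positive/negative pairs in which the positive case receives the higher score, with ties counted as one half) rather than by integrating the ROC curve; the scores, labels, and the 0.5 threshold are hypothetical values used only for illustration.

```python
def classify(probability, threshold=0.5):
    """Assign "KSI" when the predicted probability reaches the (assumed) threshold."""
    return "KSI" if probability >= threshold else "light"


def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison; ties count as 0.5."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Made-up predicted probabilities of serious injury and true outcomes (1 = serious).
scores = [0.9, 0.7, 0.4, 0.35, 0.1]
labels = [1, 0, 1, 0, 0]
example_auc = auc(scores, labels)
```

In this toy example five of the six positive/negative pairs are ranked correctly, so the AUC is 5/6; a model that ranked every pair correctly would reach 1, and one that ranked every pair wrongly would reach 0.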

Theory Model.
The proposed model is shown in Figure 1.
Below is the list of the factors and the interacting variables, with their definitions, that contributed to the development of our model (please see Figure 2, Theory model).
The vehicle factor refers to the type of vehicle, a variable that has been discretized into six groups: cars, bikes, motorcycles, buses and coaches, trucks, and others. The demographic factor includes two variables: age and gender. The variable "gender" remains the same as recorded in the questionnaire. However, the variable "age" has been grouped into four groups: less than 18, 18 to 24, 25 to 60, and over 60.
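The age grouping described above can be written as a small discretization function. The group labels are hypothetical, and the handling of the boundary ages (24 and 60 are placed in the lower group, ages are taken as whole years) is an assumption, since the text does not state it explicitly.

```python
def age_group(age):
    """Map an age in whole years to one of the four groups named in the text."""
    if age < 0:
        raise ValueError("age must be non-negative")
    if age < 18:
        return "<18"
    if age <= 24:
        return "18-24"
    if age <= 60:   # assumption: a 60-year-old driver belongs to the 25-60 group
        return "25-60"
    return ">60"


groups = [age_group(age) for age in (17, 18, 24, 25, 60, 61)]
```

Discretizing a continuous variable in this way is what makes it usable as a node of a discrete Bayesian network.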
The human factor has been grouped into five variables: seat belt, helmet, speed, distraction, and error. The variables seat belt and helmet indicate whether the driver was using these safety accessories at the moment of the accident. The speed variable has the same four states as in the questionnaire: the first state, "none", indicates that the driver was driving at the correct speed; the second indicates that the speed was inadequate; the third indicates that the driver was exceeding the speed limit; and the fourth indicates that the driver was driving too slowly, below the standards. Finally, for the variables error and distraction, the state "no" indicates that the driver did not make any error or distraction. On the other hand, the state "yes" indicates the contrary. All the errors and distractions included in the analysis are shown in the comments section of Table 1.
The circumstantial factor has been grouped into two variables: zone and type of trip. The variable type of road or zone keeps the same four types as in the questionnaire filled in by the police (road, crossing, street, and highway). The variable type of trip shows the cause of displacement in three groups: "in itinere," on business, and in leisure. Another variable taken into consideration is the distance that the driver travels. This variable keeps the same three groups as the answers that drivers gave to the traffic police: local (less than 50 km), medium (between 50 km and 200 km), and long distance (more than 200 km).
Lastly, the objective variable of the study, injury severity, has been created considering the severity of the driver's injury. This variable has two values: "light" if the driver was slightly injured, and killed or seriously injured ("KSI") if the driver was either fatally or seriously injured. It is worth mentioning that this study focuses solely on injury severity for the driver himself or herself.

Validation.
A k-fold cross-validation approach, with k = 10, was considered to evaluate the model. This method divides the data into 10 folds, each including 10% of the sample (i.e., ~6625 records per fold). For each fold, the other 90% of the sample (~59628 records) is used as training data to predict the sample included in the corresponding fold, which is used as test data. This procedure was performed ten times so that every record, without exception, was used in both training and testing. The area under the curve indicates the ability to determine the probability of suffering a major, a fatal, a minor, or no injury. In this case, the AUC is in the range 0.767-0.801.

Initial Probabilities of Serious Injury in a Traffic Injury.
A sensitivity analysis of each of the variables is carried out to determine the initial probability of death or serious injury (KSI risk) for the drivers versus slight injury in each of its states. The results are shown in Table 1.
After carrying out the sensitivity analyses, which show the initial probabilities, we can argue that the most influential variables are, in order: the type of vehicle, distance, age, seat belt, and, finally, speed.
It is important to take into account the interrelation that may exist between the different variables. The strong point of Bayesian networks is their ability to extract knowledge by searching the joint probabilities of all the variables. The factors that contribute to the severity of accidents are related to each other and do not act in isolation. As a result, accidents occur through complex interactions between road user behavior, vehicle factors, road geometric characteristics, and environmental factors [42]. Beyond this, the analysis conducted in this study employing the Bayesian network (presented in Figure 1) gives us information on how all these variables are interrelated (Figure 3).
On the other hand, business trips are less likely to result in high-risk accidents. This is in keeping with the work of de Oliveira, Petroianu [44], who explain how recklessness while driving a motorcycle could be argued to be the main cause of traffic accidents. In their study only 7% of displacements with motorcycles were for work. Our analysis shows similar results, as motorcycles are used on business with a probability of 8.2%.
Table 1 (comments). Distractions: use of mobile phone, use of hands-free devices, use of GPS devices, radio or music on, watching a DVD or video device, wearing headphones, smoking, simultaneous driving activities (eating, drinking, finding objects…), interacting with other occupants, being distracted by a previous accident, looking at the environment (landscape, advertising, signs…), being lost in thought or absent-minded, sleep, fatigue, sudden illness, indisposition.

Probability of Serious Traffic Accident Based on the Cause of Displacement and the Behavior of the Driver.
To analyze the influence of the reason for displacement on harmfulness in the accident, a sensitivity analysis has been carried out to establish the driver's harmfulness probability as a function of two pieces of evidence (see Table 2). The first piece of evidence, in all the analyses carried out, is always the type of trip, and the second is one of the human behavior variables: seat belt, helmet, speed, distraction, and errors.
Not wearing safety accessories, including seat belt and helmet, results in serious accidents, with probabilities of 19.9% and 13.0%, respectively. Focusing on the type of displacement, for not wearing a seat belt "in itinere," on business, and in leisure, the figures are 19.1%, 17.4%, and 19.3%, respectively. A cross-tabulation test illustrated in Table 1 examined the relationship between the speed variables and the type of trip. The test was statistically significant and gave 0.202, the worst figure, suggesting that there is a highly significant relationship between "exceeding speed" and "travel for leisure purpose" for serious injuries. Specifically, if we keep our focus on the type of trip, exceeding speed on pleasure trips is the factor that most determines the probability of suffering a serious and/or fatal accident. These probabilities are 17.5% in "in itinere" trips, 15.9% in business trips, and 20.2% in leisure trips. The possible distractions are using mobile phones, focusing on GPS devices, being distracted by radio and music, smoking while driving, and some other types of distractions. The errors made by drivers, on the other hand, relate to not paying enough attention to traffic signs, to other vehicles, to pedestrians, and so on. The results show the probability of death or major injury in an accident based on these variables and the reasons for displacement.
Distractions and errors lead to very similar probabilities of suffering a serious and/or fatal accident. This was also verified in several studies, as mentioned by Cordazzo, Scialfa [43]. However, if we analyze the probability of distraction and error depending on the type of trip, the results suggest that the highest chances of suffering a serious and/or fatal accident occur in leisure trips (7.5% and 6.7%). This might be because drivers on leisure trips may drive on routes with which they are less familiar, and they are likely to be more distracted by other factors, such as talking to their fellow travelers. Table 3 shows the relative probability of having a distraction or making an error as a function of the purpose of displacement.
Depending on the type of trip taken by drivers, different probabilities of getting distracted or making mistakes during the trip are shown in Table 3. As the table illustrates, the probability of making a mistake due to errors is higher than that of mistakes due to distractions. This is quite evident in leisure trips, where the probability of making an error reaches 42.4%. Likewise, the probability of making a mistake due to distraction is highest in leisure trips, reaching 17.8%. This suggests that the reasons highlighted earlier resonate well with these findings.
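The two-evidence sensitivity analysis used for the behavior variables (type of trip plus one behavior variable) can be illustrated with a short sketch. Note that this example estimates the conditional probability directly from empirical frequencies in a toy dataset, which is a simplification and not the Bayesian network inference used in the study; the records and the resulting figures are invented for illustration.

```python
# Made-up driver records for illustration only: (trip, seat_belt_used, severity).
records = (
    [("leisure", "no", "KSI")] * 2 + [("leisure", "no", "light")] * 8
    + [("leisure", "yes", "KSI")] * 1 + [("leisure", "yes", "light")] * 19
    + [("business", "no", "KSI")] * 1 + [("business", "no", "light")] * 9
)


def p_ksi(data, trip, seat_belt_used):
    """Empirical P(severity = KSI | trip, seat belt) from record frequencies.

    Returns None when no record matches the two pieces of evidence.
    """
    matching = [sev for t, belt, sev in data if t == trip and belt == seat_belt_used]
    if not matching:
        return None
    return matching.count("KSI") / len(matching)


leisure_without_belt = p_ksi(records, "leisure", "no")
leisure_with_belt = p_ksi(records, "leisure", "yes")
business_without_belt = p_ksi(records, "business", "no")
```

Fixing the trip type and varying the second piece of evidence, as in Table 2, shows how much a single behavior shifts the severity probability within each journey purpose.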

Probability of Serious Traffic Accident Based on the Cause of Displacement and the Type of Vehicle, Zone and Distance.
To analyze the probability of KSI risk, a sensitivity analysis has been conducted to establish the probability as a function of two pieces of evidence (see Table 4). Based on our findings, when a car driver experiences an injury, in 3.8% of cases the injury is either serious or fatal. The risk is somewhat higher for truck drivers, at 4.7%. The other two vehicular modes with elevated KSI risk are cycles, at 14.6%, and motorcycles, at 21.3%. However, focusing on the type of trip, it is important to emphasize that 14.8% of the serious and/or fatal accidents occur in displacement for pleasure when the vehicle used is a bicycle. Concerning motorcycles, for "in itinere," on business, and leisure displacements, the figures show a higher probability of suffering a serious and/or fatal accident: 20.1%, 21.6%, and 20.2%, respectively.
In general, the risk of suffering a serious and/or fatal accident is lower for road users when they travel in urban areas, as Olszewski, Szagala [45] mentioned in their study. Table 4 confirms a probability of 3.9% of having a serious accident on streets, which falls to 2.7% for business displacements. In contrast, the analysis shown in Table 4 confirms that the highest probability of suffering a serious and/or fatal accident occurs on motorways, with 10% in leisure journeys. The last variable in Table 4 is the distance. Among local, medium-range, and long-range displacements, it is the medium-range displacements that pose the most risk to drivers (8.2%). Within these medium-range trips, pleasure trips are the most dangerous, reaching 10.3%, in comparison with local displacements for business travel, at 4.2%.
Leisure trips, as presented in the table, carry a higher risk of suffering a serious and/or fatal accident than the other trip types. These results are consistent with the findings of Bellos et al. (2019). In their article, Bellos et al. explain that the risk of suffering accidents in general increases with tourists who drive during holiday periods, who are evidently making leisure trips. The article highlights that this may be due to the increase in vehicles during the tourist season and also because tourists do not know the city, its traffic regulations, or its signage [46]. The data presented in Table 4 show an elevated severity risk for road users involved in leisure-related "in itinere" journeys. As the table indicates, the severity risk for leisure travelers is 9.5%, higher than for the other trip purposes.
Table 1 (comments). Errors: failing to see a road sign, failing to see a vehicle/pedestrian/obstacle, not understanding a road sign or confusing it, hesitation or delay in making a decision, incorrect execution of a maneuver or inadequate maneuver, forgetting to signal (with the vehicle indicators or lights…).

Probability of a Serious Traffic Accident Based on the Reason for Displacement, Age and Gender.
A sensitivity analysis examined the age and gender of the driver in relation to the cause of displacement. The analysis presents the probability of having a serious or fatal accident. Table 5 illustrates these analyses.
As in previous studies, the results confirm that men are more likely to have serious accidents on the road [47]. Focusing on the reason for displacement, the test revealed a large difference in KSI risk as a function of age and gender. First, as the figures show, male drivers over 25 years old have their highest probability of having a serious accident in leisure trips (8.1% in leisure in comparison with 5.5% for business). Looking at the table, however, it becomes apparent that young drivers (less than 18 years old), regardless of gender, reach the highest probabilities on business trips (20.1% for men and 18.7% for women). According to Korpinen and Paakkonen [48], younger people tend to have more accidents while on their mobile phones (distraction, in our study).

Conclusions
In the year 2016, in Spain, 177,356 vehicles and 172,972 drivers were involved in traffic accidents resulting in 102,362 traffic injuries. The focus of this study was on drivers who were injured in traffic accidents, and in particular on their harmfulness in relation to the type of trip they were undertaking. The dataset for this study therefore includes a total of 66,253 drivers.
According to the dataset, out of the 66,253 initial drivers involved in a traffic accident, only 4,542 were seriously injured (6.8%). According to the analysis carried out in this study, the probability of suffering a serious injury was highest for leisure trips (7.3%), followed by "in itinere" (6.5%) and on business (5.8%). Based on these results, it can be argued that there is in general a greater probability of having serious accidents in leisure trips.
These data resemble those in the article published by Mitchell, Bambach [49], where the authors conclude that factors such as alcohol, speed, and fatigue are less likely to be involved in accidents associated with business travel.
The main risk factors involved in road traffic injuries were, in order: driving a motorcycle (21.3%), not wearing a seat belt (19.9%), exceeding the speed limit (19.0%), being a driver under 18 years old (18.1%), not wearing a helmet (13.0%), being on a crossing road (10.2%), driving a medium distance (8.2%), being distracted (6.9%), and finally making a mistake (6.0%).
The findings of the current study, according to the type of vehicle, suggest that motorcyclists have a probability of 21.3% of suffering a serious or fatal injury. These results are consistent with those of the study conducted by de Oliveira, Petroianu [44], who argue that the recklessness of motorcyclists while driving is the main cause of traffic accidents. The authors also emphasize that, besides motorcyclists, cyclists have a high probability of suffering a serious accident, reaching 14.0%, in line with the result of our study (14.6%).
Regarding the sex and age of the driver, men have a greater probability of suffering a serious and/or fatal injury. Young male drivers (<18) are in particular affected on business trips, with a probability of 20.1%, while young women drivers have a probability of 18.7%. Another important finding concerning gender and age is that the lowest risk of suffering a serious and/or fatal accident occurs in the age range above 60 years old. Another important factor considered in the current study is the behavior of the driver. The main risk factor associated with the driver's behavior in relation to displacement is not wearing a seat belt, with 19.1% for "in itinere" displacements and 17.4% for business trips. Exceeding the speed limit is another factor associated with the driver's behavior, reaching 20.2% in leisure trips.
In relation to the driver's behavior, it is further observed that the probability of having a serious and/or fatal accident due to making mistakes or being distracted is not so high. The result shows an average of 5.7% for mistakes and 6.4% for distractions. Nevertheless, the probability of committing an error or being distracted while driving is high, reaching an average of 38.99% for mistakes and 14.6% for distractions. Finally, a sensitivity analysis was conducted to identify the probability of serious accidents as a function of the cause of displacement, the zone, and the distance. By zone, the highest probabilities of suffering a serious and/or fatal accident are in leisure trips on motorways (10%), and on business and "in itinere" trips in crossing areas (8.1% and 9.4%, respectively). By distance, long-distance displacements reach 6.8% in "in itinere" trips and 6.6% in business trips, whereas for leisure trips the highest probability occurs in medium-distance trips, reaching 10.3%.

Conflicts of Interest
The authors declare that they have no conflicts of interest.