Predicting the Collisions of Heavy Vehicle Drivers in Iran by Investigating the Effective Human Factors

Traffic collisions are one of the most important challenges threatening the general health of the world. Iran's crash statistics demonstrate that approximately 16,500 people lose their lives every year due to road collisions. According to the traffic police of Iran, heavy vehicles (including trailers, trucks, and panel trucks) contributed to 20.5% of the fatal road traffic collisions in the year 2013. This highlights the need for devoting special attention to heavy vehicle drivers to further explore their driving characteristics. In this research, the effect of heavy vehicle drivers' behavior on at-fault collisions over three years has been investigated with an innovative approach combining structural equation modeling (SEM) and Bayesian Network (BN). The database utilized in this research was collected using a questionnaire. For this purpose, 474 heavy vehicle drivers were questioned in the Parviz Khan Border Market, located on the border of Iran and Iraq. The response rate of the survey was 80%. The participants answered the questions of the Driver Behavior Questionnaire (DBQ) and a sleep assessment questionnaire named Global Dissatisfaction with Sleep (GSD). In this research, human factors affecting at-fault collisions of heavy vehicles were identified and their relationships with other variables were determined using the SEM approach. Then the descriptive model constructed by the SEM method was used as the basis of the BN, and the conditional probabilities of each node in the BN were calculated from the database collected by the field survey. SEM indicates that, among the DBQ factors, only driving errors are directly related to heavy vehicle drivers' at-fault collisions, whereas other attributes including GSD, mobile usage, daily fatigue, exposure, and education level have an indirect relation with them. According to the BN, if there is no information about the characteristics of a heavy vehicle driver, the probability that the driver has at least one at-fault collision during the next three years is 0.17. Also, it was indicated that the minimum probability of the at-fault collision occurrence for a heavy vehicle driver is 0.08.


Introduction
Traffic collisions have become one of the most important challenges threatening the general health of the world [1]. According to the World Health Organization's (WHO) report, about 1.24 million people were killed in road collisions in the year 2010, which is approximately 3,400 people per day [2]. In its Global Status Report on Road Safety 2018, the World Health Organization estimated that 16,426 people died due to traffic collisions in Iran in the year 2016 [3]. On the other hand, heavy vehicles (such as trailers, trucks, and panel trucks) constitute 8.3% of the total number of vehicles in Iran [4], while these vehicles were present in 20.5% of the fatal road traffic collisions in 2013.
Given the high contribution of heavy vehicles to fatal road traffic collisions and the much smaller number of these vehicles compared to the total number of vehicles in Iran, the collision risk of this type of vehicle is considerably higher than that of passenger cars, which indicates the necessity of conducting a separate study on this category of drivers.
One of the first steps in studying road traffic collisions is the recognition of factors affecting the occurrence of these collisions. Researchers divide the factors affecting collisions into four general categories: (1) human factors, (2) environmental factors, (3) vehicle factors, and (4) road factors.
Studies have demonstrated that human factors were effective in the occurrence of 93% of collisions, while environmental factors and vehicle factors were found to be effective in 34% and 13%, respectively [1]. This highlights the importance of the role of human factors in the occurrence of traffic collisions and the need for special attention to this factor. This research identifies the effective human factors in at-fault collisions of heavy vehicle drivers with an innovative approach combining structural equation modeling (SEM) and Bayesian Network (BN). For this purpose, SEM was used to explore the relationships between the human factors that affect collisions.
The innovations of this study can be described as follows. The construction of the BN requires cause-and-effect relationships among the variables considered in the study. In most studies conducted on traffic collisions using the BN, the network structure was based on the researcher's expert opinion [5]. In this study, by contrast, the variables affecting the incidence of traffic collisions and the way they influence collisions were first identified using the SEM method, and this model was then employed to define the BN. The SEM model is only a descriptive model and has no prediction capability, but the integration of SEM with the BN adds prediction capability to the method. Considering the widespread application of SEM in behavioral sciences, the integration of these two methods would be extremely appealing for the prediction of human behavior.
Finally, the goal of this research can be divided into the following parts. First, with the help of the SEM, human factors affecting the incidence of at-fault collisions of heavy vehicle drivers were identified, and the way they affect collision occurrence was determined. In the next step, the relationships discovered by the SEM were used in the construction of the BN, and the probabilities used in the BN were then calculated from the collected data.

Literature Review on Human Factors Affecting the Incidence of Traffic Collisions.
Researchers have demonstrated that driving behavior is one of the most important factors affecting the crash risk factor. So far, numerous questionnaires have been developed to study driving behavior, but the most popular one is the Driver Behavior Questionnaire (DBQ).
This questionnaire is a tool for identifying and categorizing aberrant driving behavior. Aberrant driving behavior is divided into four general categories: driving slips (lapses), errors, ordinary violations, and aggressive violations [1]. In this research, we will identify which of these factors is effective in the incidence of at-fault collisions.
Researchers have indicated that inappropriate driving behavior is one of the most important factors influencing the Crash Risk Factor [1]. Inappropriate driving behavior can be subdivided into three general categories which are slips (lapses), errors, and driving violations. Driving slips are generally memory-related problems, insufficiencies, and neglects, for instance when a driver enters an intersection or the fourth lane incorrectly. Errors in driving are mistakes that result from a misjudgment and an inability to see, such as when a driver is turning right and suddenly encounters a cyclist or a motorcyclist with a high risk of impact. Driving violations include failing to comply with the necessary laws that are the absolute requirements of safe driving.
In the studies that have been carried out on the variable of inappropriate driving so far, the effect of this variable on the number of road collisions has provided different results. For example, in a study that focused on elderly drivers, only the driving errors factor had a meaningful correlation with the total number of collisions [6], but in another study that focused on bus drivers, the driving violations variable had a meaningful correlation with the total number of crashes [7]. Also, other studies showed differences among work-related variables between urban bus drivers and bus rapid transit (BRT) drivers that illuminate the importance of studying each segment of drivers separately [8]. In some studies, only one of the items on the questionnaire was related to the total number of collisions. For example, in a study on the driving behavior of truck drivers, it was indicated that the increase in the number of high driving speed violations increased the chance of traffic collisions [9]. In a study conducted on the general Iranian drivers' population, it was noted that out of the three inappropriate driving factors including the slips, errors, and driving violations, only the driving offenses factor was associated with the total number of driving incidents [10]. In another study, which introduced a new classification of violations including risky violations and highway violations, there was a direct relationship between highway violations and driving errors and the total number of traffic incidents [11]. Tabibi's study on predicting drivers' inappropriate behaviors concluded that the factors of driving errors and driving offenses are directly related to the number of driving collisions reported by the interviewees. Also, the driving error factor has a much stronger relationship with the total number of collisions compared to the driving violation factor [12]. 
But in most studies conducted in countries other than Iran, the variable of driving offenses is more strongly associated with the total number of collisions than the driving error variable [13]. The reason behind this difference may be the high tendency of Iranian drivers towards high-risk driving, such as following other cars with very small spacing in between; in these situations, driving errors can be the most important cause of traffic collisions [12].
Other important human factors affecting the occurrence of collisions are drivers' sleep quality and daily fatigue. These variables have great importance for drivers of heavy vehicles owing to their continuous driving. However, official statistics on car collisions caused by drowsiness are often not collected [14]. The few statistics that consider drowsiness as a factor affecting the total number of driving collisions show a very low rate (1% to 3% of the total collisions), while self-reports by drivers often indicate a very high percentage of drowsy driving and collisions caused by driver drowsiness [15]. The results of a study that focused on the effect of drowsiness while driving on traffic collisions in the United Kingdom showed that 29% of male drivers had experienced dozing off while driving during the past 12 months. Also, 7% of drivers who had had a driving collision within the past three years indicated tiredness as the main reason behind the incident [16].
The case studies of the National Transportation Safety Board found that collisions related to drowsiness and sleep are extremely widespread among truck drivers. In 1995, the National Transportation Safety Board conducted a study on single-vehicle collisions (in which the drivers survived) and found that 58% of the collisions were caused by fatigue and 19 out of the 107 drivers (17%) interviewed in the study reported that they had been asleep at the time of the collision [17]. Therefore, this factor has also been considered as one of the important factors in this study.
Drivers' use of mobile phones is becoming more common [18]. So far, many studies have been carried out to identify the effect of mobile phone use on safe driving. Brown et al. found that using mobile phones while driving reduces speed and increases the number of judgmental errors made by drivers [19]. Other studies have indicated that using mobile phones while driving reduces drivers' performance [20]. Therefore, as an important and effective factor, the use of mobile phones has also been considered in this research.
Socioeconomic characteristics of drivers such as gender, age, urbanization, education, occupation, working hours, marital status, and nationality have been measured in almost all of the human factor studies. The age variable had a decreasing effect in the studies that assessed the relationship of this variable with the total number of collisions in a certain period [9,10,21,22]. However, in another study conducted on bus drivers in Iran, increasing age had no significant relationship with drivers' collisions [11]. In one study, three reasons for the increased likelihood of a collision involving younger drivers have been presented. First, young drivers drive more miles per year. Second, they generally drive faster than older drivers. Finally, young drivers commit more hostile driving violations than older drivers [9,23].
Also, in some of the conducted studies, the variable of educational status is considered as a feature that determines a person's social standing [10,22]. The results of the study conducted by Moghaddam and Ayati in 2014 showed that the higher a driver's education level is, the lower the number of their collisions becomes [10]. But the results of Factor's study showed no correlation between the educational level and the total number of a driver's collisions [22].
In summary, six effective variables have been identified by studying human factors influencing traffic collisions. These variables include driving behavior, daily fatigue, sleep quality and health, mobile phone usage, socioeconomic characteristics, and driving experience. In this study, these variables have been used to explore the relationship between human factor variables and traffic collisions.

Literature Review on Structural Equation Modeling (SEM).
The use of structural equation modeling in the field of transportation safety does not have a long history. This section reviews some of the studies carried out in this field. In a study conducted on elderly drivers, Lucidi et al. used structural equation modeling. They assessed the relationship between personal characteristics variables, habits related to traffic safety, and dangerous driving (violations, errors, slips, and the total number of driving tickets received in the past 12 months). Structural equation modeling results showed that a driver's characteristics can, directly and indirectly, predict the dangerous driving variable [6].
In another study that was conducted using the structural equation modeling approach, the relationship between the risk of collisions and the inappropriate driving habits of 301 Italian bus drivers was investigated. This study showed that only the variable of traffic offenses is related to the risk of collisions, meaning that the more traffic offenses bus drivers have, the greater their risk of collisions [7].
Eboli and Mazzulla also used structural equation modeling to analyze traffic collisions in Italy [24]. They considered the severity of a traffic collision as a hidden variable, measured by two variables: the number of casualties and the total number of vehicles involved in the collision. In the model developed in this study, the dependent crash severity variable was affected by three independent variables: the environmental conditions factor, the path specifications factor, and the drivers' specifications factor. In the final model, the total number of casualties is defined as the most important indicator of collision severity. Also, it was found that the road classification variable is the most effective variable representing path characteristics, and the climate conditions variable is the most effective environmental factor that indirectly affects the severity of traffic collisions [25].

Literature Review on Bayesian Network (BN).
Numerous scholars have investigated the modeling of drivers' behavior. These studies have devoted most of their emphasis to typical driving conditions. There are two modeling routines to follow: the performance model and the cognitive model [26]. The latter approach represents and defines the different inherent processes of driving behavior. The techniques applied in this approach are the Adaptive Control of Thought-Rational (ACT-R) and multiagent systems. The first approach, however, concentrates on reproducing the driver's actions for a specific task under a certain condition [26]. The major techniques applied for the same objective include neural networks, transfer systems, fuzzy logic, Markov chains, etc. [26].
The outputs of the model are categorized as samples, and the target is the anticipation of the most suitable output class for each of the drivers. The technique satisfying these requirements is the Bayesian network with an augmented naïve architecture.
Chen et al., in a comprehensive study, developed a hybrid approach combining BN and multinomial logit models to analyze the severities of drivers' injuries in rear-end crashes based on crash data compiled in New Mexico for the 2010-2011 period. To identify and investigate the major factors contributing to the severity of driver injury in rear-end crashes, a multinomial logit model was developed [27]. The identified major factors were then used to develop a BN for formulating statistical relations between injury severity and explanatory features, such as demographic specifications, driver's behavior, environmental and geometric features, and vehicle factors.
Efficiency and safety are typically deemed the two major performance indices of transport systems. The planning of road networks has concentrated on transportation efficiency and road capacity; however, the safety level of a road network has not received much attention in the planning phase. Another study presents a joint Bayesian hierarchical model for the evaluation of road network safety to help planners take traffic safety into account at the time of road network planning [28]. The presented model develops relations between the risk of the road network and microlevel variables associated with traffic volume and road entities, as well as trip generation, network density, and socioeconomic variables commonly utilized for long-term transportation schemes [29].
As the application of in-vehicle information systems (IVISs), including navigation systems and mobile phones, continually increases, driver distraction has become a major safety concern. One approach that allows individuals to benefit from in-vehicle information systems is to create adaptive in-vehicle information systems that adjust their operation to the driver and roadway state [29]. A crucial element in adaptive in-vehicle information systems is the real-time monitoring of driver distraction. Such a monitoring function makes it possible to reduce that distraction. Bayesian networks were used in this study to establish a real-time procedure to detect cognitive distractions through driving performance and drivers' eye movements [29]. In another study, a Bayesian regression procedure was used to develop travel time prediction equations for central region streets through intuitive contributory variables [30].

Materials and Methods
In this part of the report, the overall trend of the modeling is described. First, general information about the database is provided. In the next step, the method of SEM construction is described and the quality of the SEM is evaluated. Then, the methods of BN construction and the preparation of probability tables are discussed.

Data Collection.
In this study, standardized questionnaires such as the Driver Behavior Questionnaire (DBQ) and Global Dissatisfaction with Sleep (GSD) were used to collect data [1,31]. First, the English versions of these questionnaires were translated into Persian. Then, a pilot survey was conducted involving a few heavy vehicle drivers, and the ambiguities of the questionnaires were resolved. Finally, the language of the final questionnaire was checked by a language specialist to resolve possible inconsistencies between the English and translated versions. The English version of the questionnaire used in this study can be found in the Supplementary Materials.
Data gathering and interviewing were carried out in Parviz Khan Border Market, Qasr-e Shirin County, Kermanshah Province in Iran. The Parviz Khan Border Market is located on the zero-point border between Iran and Iraq. A total of 474 truck drivers participated in this survey. All the participants were male (there are fewer than five female heavy vehicle drivers in Iran because of cultural issues). The average age of the participants was 44.1 (SD = 9.8). About 20% of the participants had a college education and 88% of them were married. Data gathering took place over 20 days, from March 12, 2016, to April 8, 2016. It is worth noting that the data collection was stopped from March 17, 2016, until March 24, 2016, because of the closure of the border market. Truck drivers completed the questionnaire in the parking lot of the border market from 10 A.M. to 5 P.M. The truck drivers were informed that participation in this survey was completely voluntary and that their responses would remain anonymous. Before modeling, all data were preprocessed to construct the database.

Structural Equation Modeling (SEM).
SEM, as a generalized statistical method, has had a wide range of applications in behavioral sciences, especially in sociology and economics, since the 1970s. After that, the application of this method expanded to other sciences including psychology, political science, and educational sciences. All of the multivariate methods, including multiple regression and principal components, provide researchers with tools to examine a wide range of theoretical problems. But all of these methods have a common weakness: each technique can only examine a single relationship at a time. Unlike other multivariate analysis methods, SEM simultaneously investigates a series of interdependent relationships that can be considered as a set of multiple regression equations [32].
An SEM model is composed of two patterns: structural and measurement. The structural pattern, a set of dependency relationships, interconnects the constructs that exist in the model, and the measurement pattern, a part of the overall model, identifies the measurement indicators of each latent variable. A latent variable cannot be measured directly, but rather must be measured by several other variables (indicators) [32]. For instance, the number of errors in a person's driving cannot be observed directly; instead, a variety of questions is needed to capture the various aspects of driving errors, and the construct is measured through them. Figure 1 depicts the structural and measurement patterns. In Figure 1, circles represent latent variables and rectangles are questionnaire questions.

Bayesian Network.
BN was introduced more than three decades ago. BN is a directed acyclic graph (DAG) that shows a set of random variables and their conditional dependencies. BN is formed by a set of variables $V = \{x_1, x_2, \ldots, x_\vartheta\}$, provided that $\vartheta > 1$.
The DAG is used to represent meaningful relationships between variables. A set of probability tables in the form of $B_p = \{p(x_i \mid pr(x_i)),\ x_i \in V\}$ exists to show the occurrence probability of each node. In this relationship, $pr(x_i)$ is the set of parent variables of $x_i$ in $B_p$. The probability of each of the variables displayed in the nodes is calculated using equations (1) and (2) [27].
To construct the BN, the relationships between the variables must first be determined, and then the probability table for each node is defined. For this purpose, Rapidminer software was employed to calculate the number of observations of each variable. This software is extremely useful in database and data mining studies. GeNIe 2.1 software was used for BN calculations. GeNIe software has a user-friendly graphical interface developed by the University of Pittsburgh. This software is available to students for free.
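For reference, the standard way in which a BN combines the per-node probability tables described above is the chain-rule factorization over the DAG. The textbook form is written below in the notation of this section; it is offered as a sketch and is not necessarily identical to equations (1) and (2) of [27]:

```latex
P(x_1, x_2, \ldots, x_\vartheta) = \prod_{i=1}^{\vartheta} p\bigl(x_i \mid pr(x_i)\bigr)
```

Each factor on the right-hand side corresponds to one node's probability table, so specifying the DAG and one table per node is enough to define the joint distribution.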
In response to questionnaires, people tend to choose some options of the questionnaires more, and the answers are somewhat biased. For example, over 85% of the answers to the questions in the error section of the driving behavior questionnaire are among the three options of never, rarely, and occasionally. One of the advantages of the BN approach is to consider this topic for future predictions. In the next section, a more detailed explanation will be provided.
To calculate the conditional probability on the BN, since each variable and indicator are asked by a few questions, the mean of the indicators is calculated and rounded up to the higher integer. For example, for the error variable, if the mean of responses is equal to 1.1, the driving error value is considered on a scale of 2.
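The rounding rule described above can be sketched in a few lines of Python. The function name `to_scale_level` and the sample responses are hypothetical and serve only to illustrate the rule; they are not taken from the survey data.

```python
import math


def to_scale_level(responses):
    """Average the item responses of one construct and round the mean up
    to the next integer level (an exact integer mean is left unchanged)."""
    mean = sum(responses) / len(responses)
    return math.ceil(mean)


# Hypothetical responses to ten "error" items: nine answers of 1 and one of 2,
# which gives a mean of 1.1 and therefore a level of 2 on the scale.
example_level = to_scale_level([1, 1, 1, 1, 1, 1, 1, 1, 1, 2])
print(example_level)
```

Rounding up rather than to the nearest integer means that a driver is assigned the lowest level only when every item of the construct receives the lowest answer.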

Results from Structural Equation Modeling (SEM).
For SEM, SmartPLS (v 3.2.7) software was employed [33]. Over the past two years, more than 1,000 scientific articles have been published with the help of this software. The professional version of the software is available free of charge for 30 days. The constructed SEM model is shown in Figure 2. In the constructed model, among the driving behavior variables, only the error variable had a meaningful direct relationship with the number of traffic collisions; the remaining variables were not identified as direct descriptors of traffic collisions and, as shown in Figure 2, affected at-fault collisions indirectly. The quality and accuracy of the SEM model are assessed with many different indicators, each evaluating one part of the model. For instance, by studying the path coefficients and their statistical significance (t values), as shown in Figure 2, the significance of the structural pattern is examined.
To examine the quality of the SEM model, more than seven indicators have been examined, covering the reliability of the indicators, the reliability of the model's integrity, the model's convergent validity, the model's segmental validity, and so on. Figure 2 illustrates the final result of structural equation modeling. In this figure, only the statistically meaningful paths have been drawn. Two numbers are written on each path. The first is the influence (path) coefficient with which the variable at the start of the path predicts the variable to which the path leads. The second is the t value. Therefore, the larger the first number, the more influence a change in the variable at the start of the path has on the variable at the end of the path. At the significance level of 0.01, all path coefficients are meaningful (for more details refer to Figure 3). The SEM results showed that, among the DBQ factors, only the "error" factor has a statistically meaningful direct relationship with at-fault collisions.
The results related to the factor loadings of the indexes are shown in Table 1. Fifty-eight percent of items have a factor loading of more than 0.7, which is the preferred value for the final survey of indexes. Moreover, the factor loadings of more than 42% of items are more than 0.4, which is the acceptable level for the final survey of indexes. Since education and GSD are each represented by a single question, their factor loadings are equal to one, as shown in Table 1. (The GSD question is, "How do you rate your sleep in general?" Respondents choose their answer from these items: "I sleep well," "Occasionally I do not sleep well, but I am generally satisfied with my sleep," "My sleep has already caused me problems," "I think I have a problem with my sleep," "I sleep badly," "I sleep very badly" [34].) Therefore, according to these results, the reliability of each of the observable variables was confirmed.
To confirm the reliability of the reflective measurement model, in addition to confirming the reliability of each of the observable variables, the composite reliability (CR) and average variance extracted (AVE) were calculated. Table 1 shows the calculated CR and AVE of each construct. The minimum acceptable values for the CR and AVE are 0.7 and 0.5, respectively [34-37].
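As an illustration of how CR and AVE are commonly obtained from standardized factor loadings, the sketch below uses the usual textbook formulas: AVE is the mean of the squared loadings, and CR is the squared sum of loadings divided by that quantity plus the summed error variances. The function names and the sample loadings are hypothetical, and SmartPLS may differ in implementation details.

```python
def average_variance_extracted(loadings):
    """AVE: mean of the squared standardized loadings of one construct."""
    return sum(l ** 2 for l in loadings) / len(loadings)


def composite_reliability(loadings):
    """CR: (sum of loadings)^2 / ((sum of loadings)^2 + sum of error variances),
    where the error variance of a standardized indicator is 1 - loading^2."""
    squared_sum = sum(loadings) ** 2
    error_variance = sum(1 - l ** 2 for l in loadings)
    return squared_sum / (squared_sum + error_variance)


def meets_thresholds(loadings, cr_min=0.7, ave_min=0.5):
    """Check a construct against the minimum CR and AVE values quoted in the text."""
    return (composite_reliability(loadings) >= cr_min
            and average_variance_extracted(loadings) >= ave_min)


# Hypothetical loadings for one construct (not values from Table 1).
hypothetical_loadings = [0.82, 0.76, 0.71]
print(average_variance_extracted(hypothetical_loadings))
print(composite_reliability(hypothetical_loadings))
print(meets_thresholds(hypothetical_loadings))
```

A construct measured by a single item with a loading of one yields CR = AVE = 1 under these formulas, which matches the treatment of single-question constructs.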
Then, the Fornell-Larcker test was used to assess the discriminant validity of the measurement model. The matrix shown in Table 2 is the correlation matrix of the latent variables of the model, except that the numbers on its diagonal have been replaced by the square root of the AVE of the corresponding latent variable. According to the Fornell-Larcker test, discriminant validity is confirmed if the numbers on the diagonal are greater than the correlation values of that column [38]. Results show that this criterion is also met in this model.
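The criterion can be checked mechanically once the latent variable correlations and the AVE values are available. The following is a minimal sketch; the function name and the two-construct example are hypothetical and do not reproduce Table 2.

```python
import math


def fornell_larcker_ok(correlations, ave):
    """Return True if, for every latent variable, the square root of its AVE
    exceeds the absolute correlation with every other latent variable.

    correlations: symmetric square matrix (list of lists) of correlations.
    ave: list of AVE values in the same order as the matrix rows.
    """
    n = len(ave)
    for j in range(n):
        diagonal_value = math.sqrt(ave[j])
        for i in range(n):
            if i != j and abs(correlations[i][j]) >= diagonal_value:
                return False
    return True


# Hypothetical example with two latent variables.
hypothetical_corr = [[1.0, 0.3],
                     [0.3, 1.0]]
hypothetical_ave = [0.6, 0.5]
print(fornell_larcker_ok(hypothetical_corr, hypothetical_ave))
```

Because the correlation matrix is symmetric, comparing each diagonal value against its column is equivalent to comparing it against its row.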
In structural equation modeling through PLS, there are four main criteria for structural model testing: (1) the coefficient of determination ($R^2$), (2) the significance of path coefficients, (3) the $Q^2$ value, and (4) the Cohen effect size criterion ($f^2$) [39]. $R^2$ values of 0.19, 0.33, and 0.67 are described as weak, medium, and substantial, respectively. Significance of path coefficients means confirming the assumptions of the structural equation model. $Q^2$ values above zero indicate that the observed values are well reconstructed. $Q^2$ values of 0.02, 0.15, and 0.35 are the weak, medium, and strong values for this index, respectively [36]. The Cohen criterion also shows the intensity of the relationship between the hidden variables of the model. Cohen introduced three values of 0.02, 0.15, and 0.35 for weak, medium, and strong effects, respectively [40].
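For reference, Cohen's effect size for a predictor in a structural model is commonly computed from the change in $R^2$ of the dependent latent variable when that predictor is removed from the model. The textbook form is given below as a sketch; the subscripts "included" and "excluded" are labels introduced here for illustration and are not notation from this paper:

```latex
f^2 = \frac{R^2_{\text{included}} - R^2_{\text{excluded}}}{1 - R^2_{\text{included}}}
```

Under this definition, a predictor whose removal lowers $R^2$ from 0.30 to 0.20 has $f^2 = 0.10 / 0.70 \approx 0.14$, which falls between the weak and medium thresholds quoted above.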
According to the results of SEM in Tables 3 and 4, all the criteria for testing the overall fitness of the model are met.

Results from Bayesian Network (BN).
In this research, BN was used to answer the question of how likely heavy vehicle drivers in Iran are to be involved in a collision. The problem we faced in the first step was how to determine the relationships between the variables affecting the incidence of collisions. These relationships were determined using SEM in the previous section. In the next step, the conditional probability of each variable given its parent node was obtained using equation (3):
$$P(Y \mid X) = \frac{N(X \cap Y)}{N(X)} \qquad (3)$$
According to equation (3), the probability of Y occurring given that the event X has already occurred is equal to the number of observations in which both X and Y occur, $N(X \cap Y)$, divided by the number of observations in which X occurs, $N(X)$. The Rapidminer software was used to calculate the number of these cases, and they were then introduced to the GeNIe software as the probability tables of the BN. For example, in Table 5, the probability of the collision occurrence for a heavy vehicle driver over the past three years has been set for different categories of driving errors. For all the nodes shown in Figure 2, conditional probabilities were introduced to the software.
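The counting rule for the conditional probabilities (the share of observations with X in which Y also occurs) can be sketched as follows. This is an illustrative Python version of the computation, not the Rapidminer workflow used in the study; the function name, field names, and records are hypothetical.

```python
from collections import Counter


def conditional_probability_table(records, parent, child):
    """Estimate P(child = y | parent = x) as N(x and y) / N(x) for every
    (x, y) combination that appears in the records."""
    pair_counts = Counter((r[parent], r[child]) for r in records)
    parent_counts = Counter(r[parent] for r in records)
    return {
        (x, y): n_xy / parent_counts[x]
        for (x, y), n_xy in pair_counts.items()
    }


# Hypothetical survey records (not data from the field survey).
hypothetical_records = [
    {"error": "Never", "collision": False},
    {"error": "Never", "collision": False},
    {"error": "Never", "collision": False},
    {"error": "Never", "collision": True},
    {"error": "Often", "collision": True},
    {"error": "Often", "collision": False},
]
cpt = conditional_probability_table(hypothetical_records, "error", "collision")
print(cpt[("Never", True)])
```

Combinations that never occur in the data are simply absent from the resulting table, so a real application would need to decide how to treat such empty cells before loading the table into BN software.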
In the previous section, the BN construction process was fully described. In this section, the results obtained by the BN are discussed. Figure 4 demonstrates the results of the database on the BN for the case in which we do not have any information about the driver's characteristics. According to Figure 4, if there is no information on the driver's characteristics, there is a probability of 17% that any heavy vehicle driver has had an at-fault collision in the past three years, which can be used to predict the future. This means that in the next three years, there is a probability of 17% that any heavy vehicle driver will have an at-fault collision. The probability distributions of other variables are also depicted in Figure 4 for the case in which there is no other information. Now, if information is obtained about any of the variables that affect the incidence of a collision according to the SEM model of Figure 2, and the uncertainty of that variable is eliminated, the probability of the occurrence of the at-fault collision and the probabilities of the other variables can be updated. For example, suppose a driver was asked the three questions regarding the amount of mobile phone usage, and the "often" level was specified. In this case, the probabilities of the BN were updated, and the probability of an at-fault collision of this driver over three years became 24%.
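The updating step can be illustrated on a deliberately small chain, mobile usage → error → collision, which is only a fragment of the structure in Figure 2. Every probability in the sketch below is a hypothetical placeholder chosen for readability, so the outputs do not reproduce the 17% and 24% reported above; the sketch only shows how fixing one variable changes the collision probability.

```python
# All numbers below are hypothetical placeholders, not values from the survey.
P_MOBILE = {"rarely": 0.7, "often": 0.3}
P_ERROR_GIVEN_MOBILE = {
    "rarely": {"low": 0.8, "high": 0.2},
    "often": {"low": 0.5, "high": 0.5},
}
P_COLLISION_GIVEN_ERROR = {"low": 0.1, "high": 0.4}


def collision_probability(mobile_evidence=None):
    """P(collision) in the chain mobile -> error -> collision.

    Without evidence, mobile usage is summed out using its prior.
    With evidence, all probability mass is placed on the observed level;
    this direct substitution is valid here because mobile usage is a root node.
    """
    if mobile_evidence is None:
        mobile_dist = P_MOBILE
    else:
        mobile_dist = {mobile_evidence: 1.0}
    total = 0.0
    for mobile_level, p_mobile in mobile_dist.items():
        for error_level, p_error in P_ERROR_GIVEN_MOBILE[mobile_level].items():
            total += p_mobile * p_error * P_COLLISION_GIVEN_ERROR[error_level]
    return total


print(collision_probability())          # no information about the driver
print(collision_probability("often"))   # mobile usage observed as "often"
```

Evidence on a non-root node would additionally require Bayes' rule to update the node's ancestors, which is the kind of inference that BN software such as GeNIe carries out on the full network.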
By examining different probability states and identifying different variables as the observed variable, the minimum and maximum probabilities of an at-fault collision of a driver were calculated according to the BN. The minimum probability of the at-fault collision occurrence for a heavy vehicle driver is obtained when the result of the variable "error" of the driving behavior test is "Never". In this case, the probability of an at-fault collision for a heavy vehicle driver over three years is 8%. Also, the maximum probability of the at-fault collision occurrence for a heavy vehicle driver is obtained when the result of the variable "error" of the driving behavior test is identified as "Often"; in this case, the probability of an at-fault collision for a driver is equal to 60% over three years. Table 6 presents the probabilities of the at-fault collision occurrence for heavy vehicle drivers for three years considering all variables except one.
One disadvantage of the variable "error" of the driving behavior test is that, given the nature of the questions, drivers may not answer them with full honesty, whereas variables such as exposure, education level, mobile phone usage, daily fatigue, and the sleep satisfaction criterion can be verified and are easier to evaluate. Furthermore, drivers are less inclined to report values such as "Always" and "Repeatedly" for some offenses. Having several variables of this verifiable type, together with their probabilities, can therefore help to predict the probability of an at-fault collision of a driver more accurately.

Discussion
The results of the SEM are compared with similar studies from around the world in Table 7. Although many studies have shown that driving violations have a direct effect on collision numbers [7,8,10–12], studies such as the present research and Lucidi et al. found no significant relationship between driving violations and collision numbers [6]. However, research by Lucidi et al. showed that      [6,11,12]. The results of the present study confirm the findings of the second category of studies. The structural equation model showed that, among all the variables studied in this research, only the error variable is directly related to the rate of traffic collisions.
According to the results of some studies that have examined the effect of exposure, the more the drivers are exposed, the higher the number of their collisions [11,41]. Other studies show that there is no significant relationship between exposure and the number of collisions [8,12,42].
The results of the present study showed that the increase in exposure is not directly related to the rate of collisions, but it leads to an increase in collisions through mediators such as mobile usage, GSD, and error.
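In a linear path model, an effect transmitted through mediators is commonly summarized as the product of the path coefficients along each route, summed over routes. The sketch below applies that standard rule to two assumed routes from exposure to collisions. The routes and all coefficient values are hypothetical and are not the estimates of the SEM in this study.

```python
from math import prod

# Hypothetical standardized path coefficients (NOT the estimates of this study).
paths = {
    ("exposure", "mobile_usage"): 0.30,
    ("exposure", "gsd"): 0.20,
    ("mobile_usage", "error"): 0.40,
    ("gsd", "error"): 0.25,
    ("error", "collision"): 0.50,
}


def path_effect(route):
    """Product of the coefficients along one route of consecutive nodes."""
    return prod(paths[(a, b)] for a, b in zip(route, route[1:]))


def indirect_effect(routes):
    """Total indirect effect: sum of the per-route products."""
    return sum(path_effect(route) for route in routes)


# Assumed mediated routes from exposure to collision.
routes = [
    ("exposure", "mobile_usage", "error", "collision"),
    ("exposure", "gsd", "error", "collision"),
]

if __name__ == "__main__":
    print(f"indirect effect of exposure: {indirect_effect(routes):.3f}")
```

Because there is no direct exposure-to-collision path in this sketch, the whole effect of exposure is carried by the mediators, which mirrors the qualitative finding stated above.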
According to the studies of Radun et al. and Shams et al. [15,42], drivers' dissatisfaction with their sleep and its poor quality lead to an increase in the number of accidents. Unlike those studies, this research found no significant direct relationship between sleep quality and health and collisions. However, the results of the SEM showed that GSD was directly related to error. Therefore, it can be said that the more dissatisfied drivers are with their sleep, the more errors they make and the higher the probability of an accident.
Also, the results of the current study showed that the more a driver uses a cell phone while driving, the more likely he/she is to crash. This result is consistent with other studies, such as that of Chen [23].
The results of this research show that, among all the variables studied, only the error variable directly affects the number of traffic collisions. The error variable, in turn, is influenced by the latent variables mobile usage, the amount of fatigue and boredom during the day (daily fatigue), and sleep dissatisfaction (GSD).
This means that the more a driver uses the mobile phone while driving and the more tired and drowsy he/she feels, the more driving errors he/she makes and the more he/she is exposed to a collision. The variable of mobile usage is affected by the two latent variables of education level and exposure: the higher the value of these two variables, the more the driver uses the mobile phone. It is natural that the higher the driver's level of literacy, the more likely he/she is to use mobile applications and to send and read text messages. Also, the longer the driver is driving, the more exposed he/she is to answering or making calls and to sending and reading text messages.

Note to Table 7: The "-" sign shows that the variable had not been studied. The "↑" sign means that the variable had an incremental effect on collision numbers, and the "▲" sign means that the variable had an incremental effect on collision numbers through other variables. The "↓" sign means that the variable had a decreasing effect on collision numbers, and the "○" sign means that the variable had no meaningful effect on collision numbers.

Conclusions
In this study, constructing the BN of the relationship between the characteristics of heavy vehicle drivers and their at-fault collisions first required a model of the relationships between the human factors affecting at-fault collisions. For this purpose, the main structure of the BN was developed using the SEM. Then, the conditional probability table for the developed BN was calculated with the help of the Rapidminer software and introduced to the GeNIe software used for Bayesian calculations. The following results were obtained with the help of the BN:
(i) In this research, an innovative method of using the SEM in the construction of the BN was employed to study the behavioral sciences of humans.
(ii) If there is no information about the characteristics of a heavy vehicle driver, the probability of an at-fault collision over the next three years for this driver is 17%.
(iii) According to the SEM model developed in this study, only the "driving error" factor directly affects the incidence of collisions involving heavy vehicles.
(iv) The factors "slip," "ordinary violations," and "aggressive violations" were not found to be effective in the incidence of at-fault collisions of heavy vehicle drivers.
Sleep dissatisfaction is one of the most important factors in the collision occurrence of heavy vehicles and can increase the probability of a collision to 32%. One way to reduce at-fault collisions of heavy vehicle drivers is to increase their sleep satisfaction by setting limits on daily driving hours.
Finally, with the help of the obtained network, the likelihood of an at-fault collision involving a heavy vehicle driver can be calculated from a limited amount of information, including sleep quality, daily fatigue, and the amount of mobile phone usage.
It is expected that this model will help the following parties to make better decisions: insurance companies in calculating annual third-party insurance rates, traffic police in renewing licenses, freight shipping companies in hiring drivers, and legislators in setting macro policies that enforce working-hour rules or driving experience requirements.
The BN developed in this study can help freight shipping companies to make decisions on hiring drivers, insurance companies to determine the annual third-party insurance rates in proportion to the risk of at-fault collision occurrence for each driver, and traffic police to decide on the renewal of heavy vehicle drivers' licenses, to require drivers to take part in retraining courses, or to impose heavier fines for mobile phone usage while driving.