A Data Mining Approach on Lorry Drivers Overloading in Tehran Urban Roads

The aim of this study is to identify the important factors influencing overloading of commercial vehicles on Tehran’s urban roads. The weight information of commercial freight vehicles was collected using a pair of portable scales; other required information, including driver information, vehicle features, load, and travel details, was collected by questionnaire. The results showed that the highest probability of overloading is for construction loads. Further, the analysis of the results in the lorry type section shows that overloading is least likely among pickup truck drivers: the likelihood within this group was about one-third of that among Nissan and small truck drivers. Also, the results of modeling the type of route showed that the highest likelihood of overloading is for internal loads (origin and destination inside Tehran), and the lowest probability of overloading is for suburban trips (origin and destination outside of Tehran). Considering the type of load packing as a variable, the results of the binary regression model analysis showed that the highest probability of overloading occurs for packed (boxed) loads. Finally, it was concluded that drivers are 18 times more likely to commit overloading on weekends than on weekdays.


Introduction
Transport and logistics play an important role in the economies of countries, specifically road transport which is one of the most important modes of transport for freight and cargo transportation in developing countries [1].
In Iran, like other developing countries, freight transportation, mainly conducted by semiheavy and heavy vehicles, plays an important role in the distribution of exports and imports. According to the Ministry of Roads and Urban Development in 2017, the amount of goods transported in the country was 428.3 million tons conveyed through 29.9 million trips by heavy vehicles, which yielded an index of 224.836 million ton-km in this period. The increasing freight traffic on the roads and consequently the increased likelihood of vehicles colliding have become a major concern for drivers and policy makers [2]. Traffic accidents are directly related to traffic offenses; in other words, traffic offenses are among the most important factors that lead to traffic accidents [3]. Drivers with more offenses experience more accidents [4]. Therefore, reducing driving offenses can reduce accidents. The study of the causes of accidents shows that errors and violations are the main cause of 74% of accidents [5]. This requires researchers to identify the factors that lead to driving offenses and to reduce the impact of accidents. Most studies mention that accidents depend on three factors: humans, vehicles, and roads [6]. The human factor is the most important factor in the analysis of traffic accidents [7]. Drivers' offenses are therefore among the most important human factors leading to accidents, and they have been used in many studies to investigate driving behaviors that are related to behavioral characteristics and drivers' characteristics [6][7][8][9][10][11][12]. According to the Ministry of Roads and Urban Development of Iran, there were 63472 offenses in 2017 resulting in 121108 rural crashes leading to 16201 fatalities and 335995 injuries.
Meanwhile, commercial vehicles drivers have a high importance in reducing traffic offenses and subsequent accidents due to the different dimensions and weight of their vehicles, besides their higher traffic rates as a group of professional drivers [13]. Studies show that heavy vehicle driving is among the highest risk occupations for injury and death [14]. Heavy vehicles have fewer accidents considering their mileage; nevertheless, a very high percentage of traffic accidents are attributed to heavy vehicle accidents [15].
Additionally, among the costliest cargo fleet offenses by overloaded trucks, especially in developing countries, are the safety and structural problems caused by the demolition of bridges and pavements [16]. Overload is the amount of cargo that exceeds the legal value of the truck's carrying capacity [17]. Overloading creates several problems, including accidents, increasing the severity of accidents, damaging infrastructure [18][19][20], and creating a market with unfair competition between modes of transport and transportation carriers [16]. Overloading is commonplace in developed countries and is dealt with systematically via additional licenses. It is identified and enforced by using off-road weighing systems (WIM).
However, in developing countries this has been overlooked or neglected. According to official statistics released by the Ministry of Roads and Urban Development of Iran, a total of 5 WIM devices have been used to date, none of which are used for urban routes. As the capital of Iran, Tehran has 650 kilometers of urban road crossings, with a high volume of semiheavy and heavy-duty freight vehicles traveling in and out of Tehran. According to statistics released by the Tehran Transportation and Traffic Organization, a total of about 128 million kilometers have been traveled on highways within Tehran in the past year by freight vehicles, 23% of which are overloaded. Therefore, the increase in the volume of overloaded freight vehicles on the urban roads of Tehran causes significant damage to urban infrastructure such as bridges and pavements, which creates the need for further studies in this regard.

Literature Review
Despite the importance and necessity of controlling overloading infringement in the freight fleet mentioned in the introductory section, few studies have been carried out in this regard and only descriptive statistics of overloading have been investigated. The parameters and characteristics affecting the commission of overloading have not been addressed. In a general classification, studies on cargo fleet offenses can be divided into three categories, which are described below.
Some studies have investigated the factors affecting the occurrence of accidents and devised models of accident prediction for cargo fleets. Among the most important parameters affecting the occurrence of commercial vehicle accidents reported in previous studies have been the age and the working hours of the driver [1-4, 12]. Researchers have also concluded in other studies that factors such as drowsiness, fatigue, and the way in which salaries are paid increase the risk of accidents [8]. Other variables used in modeling commercial vehicle accidents include driving experience [3,4,6], physical health characteristics [3,4], sleep duration [2,6,12], mileage [6,12], and gender [1-3, 5-7].
In some other studies, the effect of committing driving offenses on the occurrence of commercial vehicle crashes has been considered. One of the most important offenses identified in past studies as a major contributing factor to the occurrence of accidents is speeding [9-11, 13-15, 21-23]. Other studies in this field have found that offenses such as maintaining safe distance to front vehicle [21,23], fastening seat belt [14,22], technical defect [13,22], alcohol abuse [15,22], and other related factors such as a history of a crash [14,21,22], a history of misconduct [24][25][26], and some parameters of the Driving Behavior Questionnaire [9,11,21] have a significant impact on the occurrence of accidents for commercial fleets.
Other studies have also addressed the issue of offenses among commercial vehicle drivers and the variables affecting them. The parameters studied in this category included driving behavior and behavioral characteristics, driver demographic information [19,27], mileage [2,14,28], and fatigue/drowsiness [10,23]. The results of some studies show that there is a significant relationship between driving experience and driving offenses such as speeding and not fastening seat belts [2,25]. Other studies found that the amount of cognitive errors [12], individual behaviors, anger and perceived behavior control [29], differences in drivers' behavior, and the price of heavy-duty vehicles have a significant association with committing traffic offenses. Overall, studies of offenses of commercial vehicle drivers are summarized in Table 1.
As summarized in the previous section, no study has specifically addressed the factors that lead drivers to commit overloading. The above studies were carried out using information obtained from on-board weighing devices (approximate weighing); accurate weighing information from portable scales was not used. Additionally, none of the previous studies addressed urban routes. Variables such as type of cargo, route (origin, destination, and route), cargo characteristics including type of packing and dimensions of the cargo, age and gender of driver, and vehicle life are evaluated for the first time in this study.

Methodology
The aim of this study is to identify the important factors influencing overloading on Tehran's urban roads. To achieve this goal, all independent variables are categorized first, and Pearson's chi-square test at a significance level of 0.05 is used to investigate the relationship between each independent variable and the dependent variable, which is the driving offense in this research. Then, with the variables found to be significant for committing the driving offense, a binary logistic regression model was developed, and using this method, the effect of the variables on the perpetration of infringements by lorry drivers was evaluated. It should be noted that in this study the driving offense was classified into two categories as the dependent variable: overloading and nonoverloading; the candidate factors recorded for each truck comprised driver, vehicle, mileage, etc.
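The screening step described above can be sketched for the simplest case of one two-category independent variable against the two-category dependent variable. This is a minimal illustration, not the SPSS procedure used in the study: the function name `chi_square_2x2` and the counts are hypothetical, and the closed-form p value shown holds only for 1 degree of freedom (a 2x2 table, no continuity correction).

```python
import math

def chi_square_2x2(table):
    """Pearson chi-square test of independence for a 2x2 contingency table.

    table[i][j] is the observed count for row category i and column category j.
    Returns (statistic, p_value) for 1 degree of freedom, without continuity
    correction.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i in range(2):
        for j in range(2):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            stat += (table[i][j] - expected) ** 2 / expected
    # With 1 degree of freedom the chi-square survival function
    # equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value

# Hypothetical counts (NOT from the study):
# rows = two levels of an independent variable,
# columns = overloaded / not overloaded.
observed = [[30, 70], [50, 50]]
stat, p = chi_square_2x2(observed)
keep_variable = p < 0.05  # variable passes the screening step
```

Variables with more than two categories need the general chi-square distribution with (rows − 1) × (columns − 1) degrees of freedom, which a statistics package would normally supply.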
In the next step, after identifying the significant variables, binary logistic regression modeling was used to construct the driving offense model. Logistic regression is usually used to model categorical response variables. These models can be used for binary response variables, i.e., variables with two outcomes, and also for response variables with r categories (r can be greater than 2). These models use r − 1 logit equations for the response variable, so that each of the variable's classifications can be compared with the reference classification. In this study, because the dependent variable is a binary nominal variable, binary logistic regression is used for modeling.
In this study, the dependent variables were defined in two categories: "overloading" and "nonoverloading"; "overloading" driving offenses have been used as a reference classification to be compared with nonoverloading.
The driving offense, denoted by Y, is the response variable, and the environmental, human, vehicle, and mileage variables are the independent variables, represented by $x_{i1}, x_{i2}, x_{i3}, \ldots, x_{ip}$, where i indexes the observations and p is the number of independent variables. It is assumed that $\pi_{ij} = P(Y_i = j \mid x_i)$ is the probability that observation i falls in classification j. When the categories $1, 2, \ldots, r$ of the response variable are unordered, $\pi_i$ is associated with the independent variables through a set of r − 1 baseline-category logit functions, $\eta_{ij} = \log(\pi_{ij} / \pi_{ij^*}) = x_i^{T} \beta_j$ for $j \neq j^*$, where $j^*$ is the reference classification.
Since the 2 classifications of the response variable in this study do not have any particular order, one generalized logit is defined for the calculations, indexed by j. Also, since $x_i$ has p elements, this model has (r − 1) × p parameters, which can be arranged in matrix form.
In this model, any of the classifications can be selected as the reference classification; only the values and the interpretation of the coefficients will differ. The kth element of $\beta_j$ describes the change in the log-odds of the response falling in classification j versus classification $j^*$ when the kth independent variable increases by one unit while the other independent variables remain constant. For nonbase classifications $j \neq j^*$, $\pi_{ij}$ is defined using β as follows: $\pi_{ij} = \exp(x_i^{T} \beta_j) / \left(1 + \sum_{l \neq j^*} \exp(x_i^{T} \beta_l)\right)$. For the base (reference) classification, $\pi_{ij^*}$ is defined using β as follows [30]: $\pi_{ij^*} = 1 / \left(1 + \sum_{l \neq j^*} \exp(x_i^{T} \beta_l)\right)$. In this study, the overloading driving offense is used as the reference classification. SPSS 24 software is used for the statistical analysis of the binary logistic regression model.
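As a hedged illustration of how class probabilities follow from the coefficients in a baseline-category logit model, the sketch below evaluates the standard formulas for one observation. The function name `category_probabilities` and the coefficient values are hypothetical and do not come from Table 5; with r = 2 the calculation reduces to ordinary binary logistic regression.

```python
import math

def category_probabilities(x, betas):
    """Probabilities under a baseline-category (generalized) logit model.

    x     : list of p covariate values for one observation
            (include a leading 1.0 if the model has an intercept)
    betas : list of r - 1 coefficient vectors, one per non-reference category
    Returns a list of r probabilities; the last entry is the reference category.
    """
    # Linear predictor eta_j = x^T beta_j for each non-reference category.
    etas = [sum(b * v for b, v in zip(beta, x)) for beta in betas]
    denom = 1.0 + sum(math.exp(eta) for eta in etas)
    probs = [math.exp(eta) / denom for eta in etas]
    probs.append(1.0 / denom)  # reference category
    return probs

# Hypothetical binary example (r = 2): intercept -1.0 and one dummy
# variable with coefficient 0.5, evaluated for an observation with dummy = 1.
x = [1.0, 1.0]
betas = [[-1.0, 0.5]]
probs = category_probabilities(x, betas)
```

Swapping the reference category changes the sign and interpretation of the coefficients but not the fitted probabilities.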

Materials (Data Collection)
A field data collection method was used to collect the required information in this study. Thus, initially, traffic volumes at the inbound and outbound routes of Tehran, as shown in Table 2, were obtained from the Road and Transportation Organization for one hour each day to use the vehicle traffic information. The urban highways with the highest freight traffic, as well as the peak hour of pickup, were identified. After selecting the highways with the heaviest traffic and also the peak hour for field data gathering, suitable locations for stopping freight vehicles along the road to collect their information were selected. The survey consisted of 10 stations on the Tehran urban highways according to Figure 1, selected with considerations such as sufficient space for truck lanes, low road slope, sufficient stop-gap distance, appropriate length of merging areas, and brightness. After selecting the 10 stations, field data collection operations were carried out in cooperation with Tehran Traffic Police during 30 working days in July and August of 2018. In this data collection, the cargo vehicles were first randomly stopped by the traffic police and their weighing information was recorded using a pair of portable scales. Regarding the number of data collection days, the following explanations are provided. Data were collected only on weekdays in Tehran, i.e., Saturday to Wednesday. This means that the field collection operation took about two months. Statistical calculations using Cochran's formula showed that 560 information records were needed to perform statistical analysis according to the size of the statistical population and the statistical sample. However, since about twice that number of records had been collected, there was no need to continue collecting information.
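For readers unfamiliar with Cochran's formula, a minimal sketch is given below. The paper does not report the confidence level, assumed proportion, margin of error, or population size that produced the figure of 560 records, so the default arguments here (z = 1.96, p = 0.5, e = 0.05) are conventional textbook values chosen purely for illustration, and the function name `cochran_sample_size` is hypothetical.

```python
import math

def cochran_sample_size(z=1.96, p=0.5, e=0.05, population=None):
    """Required sample size from Cochran's formula.

    z          : standard normal quantile for the confidence level
    p          : assumed proportion of the attribute in the population
    e          : acceptable margin of error
    population : population size N; if given, the finite population
                 correction n = n0 / (1 + (n0 - 1) / N) is applied
    """
    n0 = (z ** 2) * p * (1.0 - p) / (e ** 2)
    if population is not None:
        n0 = n0 / (1.0 + (n0 - 1.0) / population)
    return math.ceil(n0)

# Illustrative values only (not the study's parameters).
n_infinite = cochran_sample_size()                 # large population
n_finite = cochran_sample_size(population=10000)   # hypothetical N
```

A smaller margin of error or a larger population raises the required sample size, which is one way a requirement such as 560 records could arise.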
Collecting field data using portable scales adjacent to intercity highways is a costly and high-risk operation due to the possibility of collisions with cargo vehicles. Therefore, because a sufficient number of statistical samples had been collected in 30 working days, there was no need to continue field operations. Subsequently, the other information needed, including driver, vehicle, cargo, and travel information, was obtained by completing the questionnaire; after filtering and correcting incomplete data, 856 data records were used for statistical analysis.
Then, the overloading status of truck drivers was selected as the dependent variable in two categories. For the analysis, 12 independent variables related to driver, vehicle, cargo, and travel, as shown in Table 3, were selected. Kendall's nonparametric test (for discrete variables) was used to investigate the dependence among the independent variables. The results showed that all pairs of independent variables have a correlation coefficient of less than 0.5, and therefore the independent variables are not highly correlated. In Table 3, the classifications considered for all variables are shown along with the percentage frequency of each. All independent variables were classified and SPSS 24 software was used for statistical analysis.
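The pairwise dependence check described above can be sketched with Kendall's tau-b, which handles the ties that are common in categorical data. This is an illustrative implementation rather than the SPSS routine used in the study; it assumes each variable has been coded as ordered integers, and the function name `kendall_tau_b` and the sample codes are hypothetical.

```python
import math

def kendall_tau_b(x, y):
    """Kendall's tau-b rank correlation for two equal-length sequences.

    Pairs tied on both variables are ignored; pairs tied on only one
    variable enter the denominator, as in the tau-b definition.
    """
    n = len(x)
    concordant = discordant = tied_x_only = tied_y_only = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            dx = x[i] - x[j]
            dy = y[i] - y[j]
            if dx == 0 and dy == 0:
                continue
            elif dx == 0:
                tied_x_only += 1
            elif dy == 0:
                tied_y_only += 1
            elif dx * dy > 0:
                concordant += 1
            else:
                discordant += 1
    untied = concordant + discordant
    denom = math.sqrt((untied + tied_x_only) * (untied + tied_y_only))
    return (concordant - discordant) / denom if denom else 0.0

# Hypothetical integer codes for two variables over five records.
var_a = [1, 2, 3, 4, 5]
var_b = [1, 3, 2, 5, 4]
tau = kendall_tau_b(var_a, var_b)
highly_correlated = abs(tau) >= 0.5  # threshold used in the text
```

In this made-up example the pair would exceed the 0.5 threshold, so one of the two variables would be a candidate for removal before modeling.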

Results and Analysis
In this study, the effect of each of the independent variables on the overloading offense of truck drivers was evaluated, and the results of the chi-square test are shown in Table 4. As indicated by the results of the chi-square test, 7 of the 12 independent variables studied were significant at the 95% confidence level (Sig < 0.05). In addition, a binary logistic regression model was used to analyze the data and identify the factors affecting the overloading offense of cargo fleet drivers, and the forward likelihood ratio method was used to develop the model in SPSS software. All significant variables in describing the proposed model of this study were identified and entered into the model in the first step. The modeling results for overloading in two classifications are presented in Table 5. It should be noted that in the classifications defined for the model variables, the last classification has been evaluated as the reference. The results displayed in Table 5 (output of the statistical model) show that the highest probability of committing overloading occurs among drivers of freight vehicles with a capacity of 3.5 to 19 tons. This probability is about 26 times higher than that of cargo fleet drivers with a capacity of more than 40 tons. Also, the analysis of the results in the lorry type section shows that overloading is least likely among pickup truck drivers: the likelihood within this group was about one-third of that among Nissan and small truck drivers.
Considering the type of load packing as a variable, the results of the binary regression model analysis showed that the highest probability of overloading occurs for packed (boxed) loads. This probability is about 20 percent higher than that for bulk goods. Further, in accordance with the results of the regression analysis in Table 5, the load type section showed that the maximum likelihood of overloaded transport within urban areas was for loads of construction materials. The probability of overtonnage for this type of load was 10, 6, and 2.5 times higher than that for other loads, scrap, and metal loads, respectively. Modeling results in the traffic type section also showed that the highest likelihood of overload was for inner-city loads (origin and destination inside Tehran), and the lowest probability of overloading was for passing cargos (origin and destination outside of Tehran). The probability of carrying excess tonnage for inner-city loads is 5 and 1.5 times the passing cargo, respectively. Finally, it was concluded that heavy vehicle drivers are 18 times more likely to commit the overloading offense on weekends than on weekdays, and drivers commit more overload offenses on holidays.
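The "times more likely" figures above are presumably odds ratios, i.e., exponentiated logistic regression coefficients, although the text does not state this explicitly. Under that assumption, the sketch below shows how a coefficient maps to an odds ratio and how an odds ratio changes a probability. The function names and the baseline probability of 0.10 are hypothetical and are not taken from Table 5.

```python
import math

def odds_ratio(beta):
    """Odds ratio implied by a logistic regression coefficient."""
    return math.exp(beta)

def probability_from_odds_ratio(baseline_probability, ratio):
    """Probability obtained by multiplying the baseline odds by `ratio`."""
    baseline_odds = baseline_probability / (1.0 - baseline_probability)
    odds = baseline_odds * ratio
    return odds / (1.0 + odds)

# A coefficient of ln(18) corresponds to an odds ratio of 18.
ratio = odds_ratio(math.log(18.0))

# Hypothetical baseline: if 10% of weekday trips were overloaded, an odds
# ratio of 18 would imply roughly two-thirds of weekend trips overloaded.
weekend_probability = probability_from_odds_ratio(0.10, 18.0)
```

The example illustrates that an odds ratio of 18 does not mean the probability itself is multiplied by 18; the change in probability depends on the baseline.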
The trend of changes in the variables affecting the overloading of truck drivers is shown in Figure 2. As the diagrams show, changes in some of the independent variables across the defined classifications follow a significant trend, which is described below. The descriptive analysis of the variables in the load type section shows that the highest overload ratio occurs for scrap and construction type loads. As shown in Figure 2, about 74 percent of litter trucks in Tehran are overloaded. One of the most important reasons for the high percentage of overweight loads for these types of cargo is the lack of weighing tools at the source or destination. Considering the route of the overloaded cargos, the freight vehicles passing Tehran have the lowest percentage of overloading. It seems that one of the most important reasons for the lower rate of overloading offenses for this group of drivers is the weight control of heavy vehicles carrying intercity loads. The graph of the descriptive analysis of the type of registration plate also shows that the highest rate of excess tonnage occurs among freight vehicles with general plates and the lowest among those with personal plates. Drivers who own heavy vehicles appear to be more attentive to their vehicles and are less likely to overload them due to the potential deleterious effects of overloading on the vehicle.

Conclusion
The main purpose of this study was to investigate the important and influential factors in cargo drivers committing the overloading and excess tonnage offense in Tehran. A field collection method was used to collect the required information. Initially, the highways with the most freight traffic, as well as suitable locations for stopping the vehicles along the road for field surveys, were selected, including 10 stations on the Tehran urban roads. The field collection operation was then carried out in cooperation with Tehran Traffic Police authorities. The heavy and semiheavy vehicles were initially stopped by the traffic police and their weighing information was collected using a pair of portable scales. Then, the other information needed, including driver, vehicle, cargo, and travel information, was collected by filling out a questionnaire, and after correcting or deleting incomplete data, 856 data records were used for statistical analysis. The results showed that the highest likelihood of overloaded transport was obtained for inner-city trips carrying construction material loads. Overloading of this type of load often happens for two main reasons. Firstly, in Tehran these trips are currently shifted to the night because of city traffic, when traffic police agents are less present on the roadways and therefore have less control over the transportation of cargo. Secondly, these loads do not have a recorded loading document and are not weighed at the origin. As elaborated, transporting such loads, in addition to reducing traffic safety, may cause damage to urban infrastructure. It seems that if weighing devices are used at construction waste dumping sites in the city of Tehran to deter offending drivers, the volume of overloaded waste and construction material traffic on inner-city roadways can be decreased.
Also, the results of modeling for the origins and destinations indicated that the highest likelihood of overloading was for inner-city loads (origin and destination inside Tehran), and the lowest probability of overloading was for trips with origin and destination outside of Tehran. One of the most important reasons for the lower overloading of passing trips is the weighing of freight vehicles at police checkpoints using off-road weighing scales in suburban areas. Since it is not possible to use fixed weighing scales on the inner-city axes, the use of weighing-in-motion (WIM) scales is especially important on the urban highways that have high freight traffic. This can lead to a reduction in overloading on these routes.
Finally, it was concluded that truck drivers are 18 times more likely to commit the excessive tonnage offense on weekends than on weekdays. In this regard, it seems that the presence of law enforcement agents on weekends is less than on working days. In addition to a greater presence of traffic police agents on weekends on the heaviest freight routes, the use of weighing-in-motion (WIM) scales on urban highways, as well as a requirement for freight companies to implement rigorous freight measurements, is suggested to reduce overloading, especially on urban roads.

Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.  Journal of Advanced Transportation