Analysis Model of Risk Factors of Urban Bus Operation Based on FTA-CLR

In order to comprehensively analyze the risk factors and accurately find the high risk factors related to accidents, an analysis model of risk factors of urban bus operation is proposed, which combines the structural analysis of Fault Tree Analysis (FTA) with the correlation analysis of Cumulative Logistic Regression (CLR). Firstly, based on accident data from Northeast China, FTA is used to compile the urban bus operation fault tree. In the fault tree, 16 bus operation risk factors are classified, and the risk factors are sorted and compared from three aspects: structural importance, probability importance, and critical importance. Then, the 11 higher risk factors are selected according to the discriminant principle. Secondly, bus operation accidents are divided into fatal accidents, injury accidents, and major economic loss accidents. The CLR model is fitted to identify, from the above 11 higher risk factors, those most strongly related to urban bus operation accidents. Finally, the scientific rationality and applicability of the model are verified through the goodness of fit test and a comparison with the actual probability of occurrence.


Introduction
Because buses carry large numbers of passengers, casualties are extremely likely in the event of an operation accident. Furthermore, the characteristics of urban bus operation accidents differ from those of other forms of road traffic, so the risk factors of bus operation accidents have become a major concern. Some scholars study the degree to which road infrastructure influences public transportation safety, the characteristics and weights of basic indicators of urban public transportation, the factors affecting the safety of bus operations, and preventive and improvement measures [1][2][3]. Some scholars explore the relationship between the personality characteristics of bus drivers, driving behaviors, and accident risk [4,5]. Other scholars examine bus accidents in the United States and find that the severity of bus accidents is related to the age, gender, and dangerous driving behavior of bus drivers [6]. For the analysis of traffic safety and the causes of traffic accidents, Fuzzy Fault Tree Analysis has been used to evaluate the safety of the public transportation system [7].
A considerable number of authors have studied bus crash severity in recent years. In order to cluster bus crash data into homogeneous categories, some scholars analyse data on bus-involved crashes in the State of Victoria, Australia, and extract the factors affecting fatality in these crashes by applying association rule discovery to the clusters [8]. Some of the factors involved in a crash are often not reported, and the factors affecting crashes are not uniformly distributed, which leads to inherent heterogeneity in crash data. To address this problem, the Latent Class Cluster (LCC) regression approach is used to reveal important, formerly hidden relationships in traffic safety analyses [9,10]. Furthermore, Fault Tree Analysis (FTA) and Bayesian networks are widely used in accident analysis, and many successful applications are found in the fields of transportation, fuel cell degradation, biogas systems, chemical safety, and power distribution [11][12][13][14]. In particular, some scholars compile a road traffic fault tree using the Fault Tree Analysis method, find the influencing factors of road traffic accidents, and analyse the key influencing factors of accident severity with the Logistic model [15]. Because our focus is to uncover and rank all the accident risk factors for urban bus operations, and because data on the type and severity of injuries in urban bus operations are insufficient, FTA, with its advantages in structural analysis, is a suitable method.
Then, other scholars believe that it is necessary to classify traffic accidents in order to enhance the homogeneity of accident data [16]. Some scholars use the Probit model, the Binomial Logistic model, and the Multinomial Logistic model to analyse the factors affecting the severity of accidents at three-way and four-way unsignalized intersections [17]. Because the Logistic Regression model accommodates both discrete and continuous variables, it is well suited to assessing the risk factors of bus-involved accident severity. Feng et al. divide drivers in US bus accidents into three categories, no traffic violation records, traffic violation records (middle-aged), and traffic violation records (young and old), and analyse the factors affecting the severity of accidents in each category of drivers [18]. Nasri and Aghabayk investigate the underlying risk factors associated with the severity of urban transit bus accidents in Mashhad, Iran, by estimating a binary logit model, in which the accident severity is divided into PDO (Property Damage Only) and injury or fatal categories, and the effect of several risk factors on the probability of bus accidents with higher severity is examined [19]. Some scholars take real bus operation accident data from Guangdong Province, China, as an example and use the Binomial Logistic model to analyse the factors affecting the severity of bus accidents in four scenarios: overall accidents, intervehicle accidents, vehicle-pedestrian accidents, and single-vehicle accidents [20].
FTA can uncover all the risk factors of accidents based on urban bus operation characteristics, classify them, and present them clearly in a structured manner. In addition, CLR yields desirable analysis results when it is given a suitable set of risk factors. In this research, it is necessary not only to uncover all the accident risk factors of urban bus operation but also to sort all risk factors and find the higher risk factors, with the model checked by the Pearson $\chi^2$ statistic and Deviance statistic goodness of fit test. Therefore, this paper combines FTA and CLR to find the higher risk factors of urban bus operation accidents.

Materials and Methods
Because the operating environment of urban buses is complex and involves many influencing factors, FTA can classify the risk factors that lead to accidents of urban bus operation. The determination of the minimum cut sets presents the set to which each risk factor belongs and thereby clarifies the structural importance of each risk factor. However, structural importance only reflects the impact of each risk factor on bus operation accidents according to the structure of the fault tree. What we are concerned about is the impact of the occurrence probability of each risk factor on the probability of bus operation accidents, and whether it is easier to reduce risk factors with a high probability or those with a low probability. Therefore, the probability importance and critical importance of each risk factor have practical significance for the extraction of high risk factors.
There are a number of requirements and assumptions in using Logistic Regression models. The most important of them are IIA (Independence of Irrelevant Alternatives), linearity between the independent variables and the log odds, little or no multicollinearity among the independent variables, and the independence of the observations. Even so, the model is flexible in application and simple in calculation, which can make the results of the analysis more objective. The data used in this article meet these requirements and assumptions.
The existing classification of the severity of traffic accidents is mostly based on the most severe injury imposed on the passengers. In fact, heavy economic losses should also be an important consideration in the severity of the accident. Since the dependent variable takes multiple values, and the independent variables include both discrete and continuous variables, the Cumulative Logistic model among the multivariate Logistic models is the most appropriate.
Based on the above reasons, an analytical model of the risk factors of urban bus operation is constructed. The steps of the FTA-CLR analysis model are as follows. The first step is to compile the fault tree. Based on the bus operation accident data in a certain area, the accident is regarded as the top event. According to accident types and causes, risk factors are classified and connected with "OR" or "AND" gates. Then, the subtop events are analysed until the basic events can no longer be separated. The second step is the calculation of importance. Three kinds of importance are calculated: structural importance, probability importance, and critical importance.

Structural Importance.
Boolean algebra [21] is used to solve the minimum cut sets of the fault tree of bus operation accidents. Formula (1) [22] is used to calculate the coefficient of structural importance of risk factors. The greater the value of $I_\varphi(i)$, the greater the structural importance of the basic event. According to the criterion of structural importance, if a minimum cut set contains only one basic event, the structural importance of the risk factor represented by that basic event is the largest. If the minimum cut sets contain different numbers of basic events, the risk factors represented by the basic events in the cut sets with fewer basic events have the greater structural importance. In formula (1), $I_\varphi(i)$ represents the structural importance coefficient of the basic event $X(i)$, $n_i$ represents the capacity of the minimum cut set of the basic event $X(i)$, and $G_r$ represents the minimum cut set containing the basic event $X(i)$.

Probability Importance.
Formula (2) [23] is used to find the degree to which a change in the probability of the i-th basic event changes the probability of the top event. It gives the coefficient of probability importance of risk factors. The larger the value of $I_g(i)$, the greater the influence of the basic event's probability on the probability of the top event. In formula (2), $I_g(i)$ is the probability importance coefficient of the basic event $X(i)$, $P(T)$ represents the occurrence probability function of the top event, and $q_i$ represents the probability of occurrence of the i-th basic event $X(i)$, which is calculated based on the data of bus operation accidents.
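The cut-set and probability importance calculations described above can be illustrated with a minimal sketch. The toy tree `TREE` (T = X1 OR (X2 AND X3)), the probabilities in `q`, and all function names are hypothetical and are not taken from the paper's fault tree or data. The sketch also assumes independent basic events, under which the partial derivative of P(T) with respect to q_i equals P(T | q_i = 1) - P(T | q_i = 0).

```python
from itertools import product

# Hypothetical toy fault tree (NOT the paper's tree): T = X1 OR (X2 AND X3)
TREE = {
    "T": ("OR", ["X1", "G1"]),
    "G1": ("AND", ["X2", "X3"]),
}


def cut_sets(node, tree):
    """Return the cut sets of `node` as a set of frozensets of basic events."""
    if node not in tree:  # a name without a gate is a basic event
        return {frozenset([node])}
    gate, children = tree[node]
    child_sets = [cut_sets(child, tree) for child in children]
    if gate == "OR":  # any child's cut set triggers the gate
        return set().union(*child_sets)
    # AND gate: take one cut set from every child and merge them
    return {frozenset().union(*combo) for combo in product(*child_sets)}


def minimal_cut_sets(node, tree):
    """Apply the Boolean absorption law: drop cut sets that contain another."""
    sets = cut_sets(node, tree)
    return {s for s in sets if not any(other < s for other in sets)}


def top_probability(mcs, q):
    """Exact P(T) for independent basic events, by enumerating all states."""
    events = sorted(q)
    total = 0.0
    for state in product([0, 1], repeat=len(events)):
        occurred = {e for e, s in zip(events, state) if s}
        if any(cut <= occurred for cut in mcs):
            p = 1.0
            for e, s in zip(events, state):
                p *= q[e] if s else 1.0 - q[e]
            total += p
    return total


def probability_importance(mcs, q, event):
    """I_g(i) = P(T | q_i = 1) - P(T | q_i = 0) (assumes independence)."""
    return (top_probability(mcs, {**q, event: 1.0})
            - top_probability(mcs, {**q, event: 0.0}))


mcs = minimal_cut_sets("T", TREE)
q = {"X1": 0.05, "X2": 0.2, "X3": 0.5}  # hypothetical probabilities
for event in sorted(q):
    print(event, round(probability_importance(mcs, q, event), 4))
```

The state enumeration is exponential in the number of basic events, which is acceptable for a tree of this size but would need a cut-set-based approximation for larger trees.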

Critical Importance.
Formula (3) [23] is used to measure the importance of each basic event in terms of both its sensitivity and its own probability of occurrence. It gives the coefficient of critical importance of risk factors. The larger the value of $I_g^c(i)$, the greater the impact of the basic event on the probability of occurrence of the top event. In formula (3), $I_g^c(i)$ represents the critical importance coefficient of the i-th basic event, and $P(T)$ represents the probability of occurrence of the top event.
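For orientation, the probability and critical importance coefficients are commonly written in the textbook forms below. These forms are an assumption consistent with the symbol definitions in the text and are not copied from the paper's own formulas (2) and (3). The worked numbers use a hypothetical toy tree, T = X_1 OR (X_2 AND X_3), with independent basic events and made-up probabilities.

```latex
% Standard textbook forms (assumed, not reproduced from the source)
\begin{align*}
I_g(i) &= \frac{\partial P(T)}{\partial q_i}, &
I_g^c(i) &= \frac{q_i}{P(T)}\, I_g(i)
\end{align*}

% Hypothetical toy tree T = X_1 \lor (X_2 \land X_3), independent basic events
\begin{align*}
P(T) &= 1 - (1 - q_1)(1 - q_2 q_3) \\
I_g(1) &= 1 - q_2 q_3, \quad I_g(2) = (1 - q_1)\, q_3, \quad I_g(3) = (1 - q_1)\, q_2
\end{align*}

% With the made-up values q_1 = 0.05, q_2 = 0.2, q_3 = 0.5
\begin{align*}
P(T) &= 1 - 0.95 \times 0.9 = 0.145 \\
I_g(1) &= 0.9, \quad I_g(2) = 0.475, \quad I_g(3) = 0.19 \\
I_g^c(1) &= \frac{0.05 \times 0.9}{0.145} \approx 0.310, \quad
I_g^c(2) = I_g^c(3) = \frac{0.095}{0.145} \approx 0.655
\end{align*}
```

In this toy case X_1 has the largest probability importance but the smallest critical importance, because its own probability of occurrence is low. This is the kind of difference that motivates examining both coefficients.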
The third step is sorting and analysis. The risk factors are sorted by structural importance coefficient, probability importance coefficient, and critical importance coefficient. After the coefficients are normalized, higher risk factors are selected according to the discriminant principle. Discriminant principle: ① the order of the structural importance coefficient and the probability importance coefficient is the main basis, and the critical importance coefficient is the auxiliary basis, and ② in the order of the main basis, the cumulative sum of the normalized index of each importance is above 90%.
The fourth step is to analyse the severity of bus operation accidents based on the CLR. The probability calculation model is shown in formula (4) [15]. The mixed stepwise selection method [15] is used to gradually eliminate independent variables with insignificant coefficients. That is, independent variables with a significance level of Sig. < 0.05 are retained, and the significantly related risk factors are determined through correlation analysis. In formula (4), $\sum_{j=1}^{J} P(y = j \mid X) = 1$, $j = 1, 2, \ldots, J - 1$, and $P(Y \le j \mid X)$ is the cumulative probability. $X$ contains $i$ independent variables, $X_1, X_2, \ldots, X_i$, where $i$ is the number of independent variables. $\alpha_i$ is a constant term, and $\beta_{ij}$ is the regression coefficient.
The fifth step is the model test. According to the Pearson $\chi^2$ statistic and Deviance statistic goodness of fit test [24], if the significance level of the Pearson $\chi^2$ statistic and the Deviance statistic is more than 0.05, the model fits well. Then, the predicted probability of each accident type calculated by the model is compared with the actual probability to determine its accuracy.
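How a cumulative logistic model turns cumulative probabilities into category probabilities can be sketched as follows. The sketch assumes the common proportional-odds form, logit P(Y <= j | X) = alpha_j + sum_i beta_i x_i, with slopes shared across categories. The paper's formula (4) may use a different sign or index convention, and every numeric value and name below is hypothetical.

```python
import math


def category_probabilities(x, alphas, betas):
    """Return [P(Y=1|X), ..., P(Y=J|X)] under an assumed cumulative logit model.

    alphas: J-1 increasing thresholds (hypothetical values in the example).
    betas:  one slope per independent variable, shared across categories.
    """
    eta = sum(b * xi for b, xi in zip(betas, x))
    # Cumulative probabilities P(Y <= j | X) for j = 1..J-1, then 1 for j = J
    cumulative = [1.0 / (1.0 + math.exp(-(a + eta))) for a in alphas]
    cumulative.append(1.0)
    probs, previous = [], 0.0
    for c in cumulative:
        probs.append(c - previous)  # P(Y = j) = P(Y <= j) - P(Y <= j-1)
        previous = c
    return probs


# Hypothetical thresholds, coefficients, and one coded observation
alphas = [-3.0, 0.5]           # Y = 1 fatal, Y = 2 injury, Y = 3 economic loss
betas = [0.4, -0.6, 0.3, 0.2]  # four made-up slopes
x = [1, 0, 1, 1]               # made-up 0/1 codes for four variables
probs = category_probabilities(x, alphas, betas)
print([round(p, 4) for p in probs])
```

Because the thresholds are increasing, each difference of consecutive cumulative probabilities is positive, and the three category probabilities sum to 1.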
The FTA-CLR model analysis process is shown in Figure 1.
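The two goodness-of-fit statistics named in the fifth step can be computed from observed and model-expected counts, as in the sketch below. The counts are hypothetical and the function name is an assumption. The significance level itself would come from a chi-square distribution with the appropriate degrees of freedom, usually taken from a statistics package, and is not computed here.

```python
import math


def pearson_and_deviance(observed, expected):
    """Return (Pearson chi-square, deviance) for matched lists of counts.

    Assumes observed and expected counts have the same total, as they do
    when the expected counts come from a fitted multinomial model.
    """
    pearson = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Cells with zero observed count contribute nothing to the deviance
    deviance = 2.0 * sum(o * math.log(o / e)
                         for o, e in zip(observed, expected) if o > 0)
    return pearson, deviance


# Hypothetical counts of fatal, injury, and economic-loss accidents
observed = [12, 230, 158]
expected = [10.0, 236.0, 154.0]
pearson, deviance = pearson_and_deviance(observed, expected)
print(round(pearson, 4), round(deviance, 4))
```

Small values of both statistics relative to their degrees of freedom correspond to a large significance level, which is the condition the text uses to judge that the model fits well.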

Compilation of Fault Tree.
Taking the data of bus operation accidents in Northeast China as an example, the basic events of urban bus operation accidents are summarized, and the fault tree of urban bus operation is compiled.
Because the cause of urban bus operation accidents is related to many factors, accidents of urban bus operation are divided into two types: traffic accidents and service accidents.
Therefore, the accident is regarded as the top event, and the fault tree of traffic accidents and the fault tree of service accidents are compiled separately. There are three main forms of bus operation accidents: vehicle-fixed object, vehicle-pedestrian, and vehicle-vehicle. Any one of them will cause an accident, so an OR gate is used to connect them. The same method is used to analyse the subtop events until the basic events can no longer be separated. At this point, the construction of the fault tree of traffic accidents is completed. The fault tree of service accidents is constructed in the same way. The fault tree of traffic accidents and the fault tree of service accidents are shown in Figures 2 and 3, respectively.

Calculation of Importance.
Because service accidents of the urban bus generally do not cause serious accidents, this paper only analyses the risk factors that may cause serious accidents in urban bus operation. The minimum cut sets of the fault tree of traffic accidents in urban bus operation are obtained. A total of 15 minimum cut sets of risk factors of traffic accidents in urban bus operation are obtained, of which 9 are minimum cut sets of a single factor. This shows that if any one of these influencing factors exists in the actual operation of the urban bus, it is very likely to cause a traffic accident. It is therefore of practical significance to find the higher risk factors leading to bus operation accidents. The structural importance coefficients of the risk factors affecting accidents of urban bus operation are obtained and ranked, as shown in the second column of Table 1.

Advances in Civil Engineering
The probability importance coefficients of the risk factors affecting accidents of urban bus operation are obtained. As shown in the third column of Table 1, the order is as follows: $I_g(13) > I_g(2) > I_g(10) > I_g(1) > I_g(4) > I_g(14) > I_g(5) > I_g(3) = I_g(8) > I_g(9) = I_g(12) > I_g(15) > I_g(11) > I_g(6) = I_g(7) = I_g(16)$. The critical importance coefficients of the risk factors and their order are likewise given in Table 1.

Sorting and Analysis.
16 risk factors are obtained by classifying the basic events determined by the bus traffic fault tree. The analysis of structural importance, probability importance, and critical importance is carried out. According to the sorting order of the main basis, the cumulative sum of the normalized index of each importance is shown in Table 2.
It can be seen that the cumulative sum of the normalized index of each importance of the first 11 risk factors is above 90%. At the same time, from the first 11 to the first 12 risk factors, the increase in the cumulative sum of the normalized index based on the critical importance is not obvious. Therefore, the first 11 risk factors are selected as higher risk factors. They are 16-25 years old or older, less than 5 years of driving experience, female driver, nonsunny and low visibility, slippery road, no street lighting at night, no signal control, no protective facilities on the roadside, damaged road surface, driving straight, and driving at intersections.
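The 90% cumulative criterion used in this selection can be sketched as below. This is a simplification that ranks by a single importance coefficient, whereas the discriminant principle in the text combines a main and an auxiliary basis. The coefficient values and the function name are hypothetical and are not the values of Table 1 or Table 2.

```python
def select_higher_risk(coeffs, threshold=0.90):
    """Rank factors by coefficient and keep them until the cumulative
    normalized share first reaches `threshold`."""
    total = sum(coeffs.values())
    ranked = sorted(coeffs.items(), key=lambda kv: kv[1], reverse=True)
    selected, cumulative = [], 0.0
    for name, value in ranked:
        selected.append(name)
        cumulative += value / total  # normalized index of this factor
        if cumulative >= threshold:
            break
    return selected, cumulative


# Hypothetical importance coefficients for six factors
coeffs = {"X1": 30, "X2": 25, "X3": 20, "X4": 16, "X5": 5, "X6": 4}
selected, share = select_higher_risk(coeffs)
print(selected, round(share, 2))
```

With these made-up values the first four factors already account for more than 90% of the total, so the remaining two would be treated as lower risk.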

Dependent Variable.
When studying the significantly related risk factors of urban bus accidents, this paper also includes accidents that cause only major economic losses in the classification of urban bus accidents. The severity of bus operation accidents is thus divided into three levels. The code is $Y = 1$ for fatal accidents, $Y = 2$ for injury accidents, and $Y = 3$ for accidents with only major economic loss.

Independent Variable.
Based on the analysis of the importance of basic events in the bus operation fault tree compiled in this paper, 11 independent variable factors are determined from the perspective of people, vehicles, roads, and environment in the transportation system. The independent variable coding is shown in Table 3. For example, driving state ($X_{13}$) is coded nonstraight = 0 and straight = 1, and intersection or section ($X_{14}$) is coded road section = 0 and intersection = 1.

Fitting.
Independent variables whose coefficients are not significant (Sig. ≥ 0.05) are gradually eliminated. Through correlation analysis, 4 significantly related risk factors are determined. They are driving experience ($X_2$), lighting conditions ($X_8$), traffic signal mode ($X_9$), and driving state ($X_{13}$). The model calibration results are shown in Table 4.

Model Test.
In this way, the Cumulative Logistic probability prediction model of urban bus fatal accidents and injury accidents is obtained. The results of the goodness of fit test show that the Pearson $\chi^2$ statistic has a significance level of 0.998 > 0.05, and the Deviance statistic has a significance level of 0.969 > 0.05, which shows that the model fits well. Then, 1390 cases of urban bus accidents with complete information in Northeast China are selected, and the Cumulative Logistic probability prediction model constructed in this paper is used to calculate predictions for them. The predicted probabilities of fatal accidents, injury accidents, and only major economic loss accidents are compared with the actual probabilities of occurrence, as shown in Table 5. The comparison shows that the errors are less than 5%, which indicates that the accuracy of the constructed model for predicting the severity of urban bus operation accidents is relatively high.

Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest
The authors declare that they have no conflicts of interest.