Traffic Incident Clearance Time and Arrival Time Prediction Based on Hazard Models

Research Article

Yang beibei Ji (1), Rui Jiang (2), Ming Qu (2), and Edward Chung (2)

(1) School of Management, Shanghai University, Shangda Road 99, Shanghai, China (shu.edu.cn)
(2) Smart Transport Research Centre, Queensland University of Technology, Level 8, P Block, Brisbane, QLD 4001, Australia (qut.edu.au)

ORCID iDs: http://orcid.org/0000-0002-6340-7030, http://orcid.org/0000-0002-9969-2320, http://orcid.org/0000-0001-6969-7764

Academic Editor: X. Zhang

Mathematical Problems in Engineering (MPE), Hindawi Publishing Corporation, ISSN 1563-5147, 1024-123X, Volume 2014, Article ID 508039, doi: 10.1155/2014/508039

Received 25 February 2014; Accepted 31 March 2014; Published 17 April 2014

Copyright © 2014 Yang beibei Ji et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Accurate prediction of incident duration is not only important information for a Traffic Incident Management System but also an effective input for travel time prediction. In this paper, hazard-based prediction models are developed for both incident clearance time and arrival time. The data were obtained from the Queensland Department of Transport and Main Roads’ STREAMS Incident Management System (SIMS) for the year ending in November 2010. The best-fitting distributions are identified for both clearance time and arrival time for 3 types of incident: crash, stationary vehicle, and hazard. The results show that the Gamma, Log-logistic, and Weibull distributions are the best fit for crash, stationary vehicle, and hazard incidents, respectively. The significant impact factors are given for crash clearance time and arrival time. The quantitative influences for crash and hazard incidents are presented for both clearance and arrival time. The model accuracy is analyzed at the end.

1. Introduction

Traffic incidents are considered one of the major causes of traffic congestion and delay. They also have a wide negative impact on both the traffic system and related social activities. Many studies have sought to predict, estimate, or reduce incident duration, because accurate prediction of incident time can (1) reduce incident duration, (2) help Traffic Incident Management (TIM) respond quickly to incidents and mitigate their impact (Chou [1]), and (3) improve travel time reliability by predicting travel time when an incident occurs (Tsubota et al. [2]).

A traffic incident is a nonrecurrent event that causes a capacity reduction or an abnormal increase in traffic demand, such as a crash, a stalled vehicle, debris, fire, construction, or a sporting event. A general incident timeline, shown in Figure 1, reveals that incident duration can be divided into verification, response, clearance, and recovery periods by recording timestamps at the various stages of an incident.

Figure 1: Traffic incident timeline.

However, most prediction models do not include all four parts or do not give an exact definition of incident time. Each part of the incident time has its own statistical distribution and its own influencing factors. An example is given in Figure 2.

Figure 2: Contour plot on the incident scene.

Figure 2 is the occupancy contour plot from 3:00 a.m. to 11:00 a.m., in which the y-axis denotes the distance from the start point of the freeway and the x-axis indicates the time. Green indicates low occupancy (free-flow conditions), yellow indicates increasing congestion, and red represents heavy congestion on the links. The incident is marked as the red area, where the time the incident occurred (7:00 a.m.), the time it was cleared (7:53 a.m.), and the time traffic recovered (8:20 a.m.) are clearly marked. According to the incident database (SIMS), a multiple-vehicle crash on 22 July 2011 was verified at 7:39 a.m. and cleared at 7:53 a.m., and the tow assistance arrived at 7:47 a.m. The total duration calculated from real traffic flow data was 80 minutes, while the duration according to the incident database was only 14 minutes. The real duration was almost 6 times that recorded in the incident database, which results in errors in the prediction model.
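The two durations quoted for this example can be checked with a few lines of Python. This is only an arithmetic sketch; the helper name `minutes_between` is hypothetical, and the clock times are the ones given in the text.

```python
from datetime import datetime

def minutes_between(start: str, end: str) -> float:
    """Elapsed minutes between two same-day clock times written as HH:MM."""
    fmt = "%H:%M"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return delta.total_seconds() / 60.0

# Times quoted in the text for the 22 July 2011 crash.
flow_based = minutes_between("07:00", "08:20")  # occurrence -> traffic recovered
sims_based = minutes_between("07:39", "07:53")  # verified -> cleared in SIMS

ratio = flow_based / sims_based
print(flow_based, sims_based, round(ratio, 2))
```

The ratio of about 5.7 is what the text rounds to "almost 6 times".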

The accuracy of incident prediction models is therefore limited, partly because of how incident duration is defined. It is difficult to obtain the real incident duration shown in Figure 1, because the times at which the incident occurred and ended cannot be collected exactly. In this paper, the clearance time and arrival time are investigated individually. The major impact factors for the clearance time and arrival time are studied, and two prediction models are then developed. The impact factors of the two models are compared.

The rest of this paper is structured as follows. Section 2 reviews the literature. Section 3 introduces the Incident Management System (SIMS) used on the Pacific Motorway, Brisbane, Australia, and the properties of the incident parameters. Section 4 introduces the hazard-based model and the candidate distributions. Section 5 presents the results: first, the best-fitting statistical distribution for the 3 types of incident is given for both clearance time and arrival time; second, the significant impact factors of the hazard-based model are given separately after data filtering; finally, the quantitative impact evaluation and precision are obtained. The conclusions are summarized in Section 6.

2. Literature Review

Over the past several decades, various models have been developed to predict traffic incident duration. Most of these studies are based on statistical theory. The representative prediction models can be classified into the following categories: probabilistic distribution, linear regression analysis, time sequential methods, decision tree models, neural network models, Bayesian classifiers, hazard analysis models, and so forth.

Golob et al. [3] provided the first probabilistic distribution of incident duration. They found that each part of the duration was related to the previous part and demonstrated that the duration fits a lognormal distribution. Giuliano [4] extended Golob’s work by applying a lognormal distribution to incidents of similar type and reported that truck involvement was the key impact factor of incident duration. Garib et al. [5] and Sullivan [6] did similar research. Jones et al. [7] developed a double logarithmic distribution of incident duration. Nam and Mannering [8] found that the Weibull distribution can also be used to describe incident data.

Linear regression models have been widely used due to their simplicity and effectiveness. The relationship between impact factors and incident duration is directly reflected in regression models [5, 9]. Ozbay and Kachroo [10] developed an incident clearance time model based on the regression method using 121 traffic incident records collected from Chicago, USA. Nine statistically significant variables were put forward, but the accuracy and effectiveness of the model were not given. Kau [11] estimated freeway incident duration using the multiple linear regression method, with a confidence interval of 95%. He defined the incident clearance time as the duration from the time that a police vehicle or freeway service patrol vehicle arrives at the scene until the vehicles are cleared from the scene.

The advantage of the time sequential model is that it can make a regression prediction using a few incident property variables at the early stage of the incident and can update the prediction when more information is collected. Khattak et al. [12] found that incident type and severity were the most significant factors. They applied a time sequential model to a small incident sample, but its practicability was not demonstrated because of poor accuracy and limited incident data.

The decision tree model is independent of the distribution of the dependent variables. Sethi et al. [13] indicated that the average duration was 21 minutes based on 801 incident records, including data on traffic interruption, disabled vehicles, and severity. The results showed that the incident type was the most significant factor for duration. Ji et al. [14] set up a Bayesian-based decision tree model using 1853 incident records from Utrecht, the Netherlands. The precision was improved to 73.39% compared with the decision tree model. Stephen et al. [15] developed an incident duration model based on a naïve Bayesian classifier. They emphasized that incident duration is a highly variable quantity and that, although the model performed better than a linear regression, its classification was still correct only half of the time.

Artificial neural networks (ANN) have been widely used for prediction and pattern classification problems. Lopes et al. [16] presented an adaptive model to forecast the clearance time of real-time traffic incidents. The solution included four models, which were calibrated and tested with incident records from Portuguese highways. The model was able to estimate 72% of incidents with less than 10 minutes of error and about 92% with less than 20 minutes of error. Other examples can be found in [17, 18].

Hazard analysis, a common topic in fields such as the life sciences, biomedicine, and reliability engineering, has also been used in traffic engineering. It is well suited to time-related problems, that is, to the analysis of data in the form of times from a well-defined time origin until the occurrence of some particular event or end point [19]. Examples include the time between incident occurrence and its clearance [8, 20], the time between planning and execution of an activity [21], and the analysis of urban travel time [22].

Furthermore, hazard-based models have been used to model incident duration. Chung [23] presented an accident duration model using a 2-year accident dataset (2006 to 2007) from the Korean freeway system, in which the Log-logistic distribution was selected for the accelerated failure time metric model. Although the model had a large prediction error, statistical test results indicated that it was stable over time. Tavassoli et al. [24] developed parametric accelerated failure time survival models of incident duration. They found that the duration of each type of incident is uniquely different and responds to different factors.

One distinctive feature of the hazard-based model is that its precision improves if the best-fitting distribution of the time variable is chosen. In this study, hazard-based models, in particular the accelerated failure time (AFT) metric, are used to model both incident clearance time and arrival time.

3. Description of Incident Base

Incident data were collected by the Queensland Department of Transport and Main Roads’ STREAMS Incident Management System (SIMS) for South East Queensland urban networks from November 2009 to November 2010. SIMS is an incident management system used throughout Queensland, Australia, to capture traffic incidents that affect traffic flow on the road network. There are 35103 incident records in total for the year, classified into 9 types: alert, congestion, crash, fault, flood, hazard, planned incident, road works, and stationary vehicles. The SIMS incident database holds many detailed properties, but some of them, such as location, SIMS ID, and status, are not closely related to incident time prediction. Hence, only the major properties of each incident record are shown in Table 1. However, not all of these properties are recorded for every incident. For example, “number of vehicles involved,” “number of people injured,” and “number of fatalities” are applicable only to crash data.

Table 1: Incident property information.

| Parameter | Description |
|---|---|
| Priority | |
| Classification | The classification to further define the type of incident |
| Blockage type | The type of blockage caused by the traffic incident. Possible values are unknown, no blockage, partially blocked, blocked, and both directions blocked |
| Number of lanes blocked | The number of lanes blocked by the traffic incident |
| Fire hazard | Whether the hazard resulted in smoke/poor visibility and/or fire |
| Intersection | Whether the incident is near an intersection |
| Major incident | Whether the incident is major |
| Lane blocked | How many lanes are blocked |
| Heavy traffic | Whether heavy traffic is involved |
| Towing | Whether towing is needed |
| Number of vehicles involved | The number of vehicles involved in the crash (only applicable to crash incident data) |
| Number of people injured | The number of people injured in the crash (only applicable to crash incident data) |
| Number of fatalities | The number of fatalities from the crash |
| Medical emergency | Yes/no |
| Weather | Possible values are cloudy, electrical storm, fine, fog, hail, heavy rain, showers, snow, unknown, and wind. This column always displays “Unknown” for weather hazard incidents |
| Opp lanes blocked | The number of lanes blocked by the traffic incident on the opposite link |
| Diversion required | Indicates whether a diversion was required due to the traffic incident |
| Lateral position | Possible values are in bay, in lanes, in median, left shoulder, right shoulder, and unknown |
| Chemical spill | Only applicable to crash incidents and debris/obstruction/spill |

Although 9 types of incident are recorded in the SIMS database, only 3 types (crash, stationary vehicle, and hazard) are used for model development, because data for the other types are rarely recorded. Consequently, the clearance time and arrival time prediction models are developed only for these 3 types of incident.

4. Model Development

Hazard-based duration models are a class of statistical methods for studying the occurrence and timing of events; they were originally used for problems in the biomedical, engineering, and social sciences. More recently, they have been used to model time-related issues in transportation. Hensher and Mannering [25] review the application of hazard-based duration models in transportation up to the early 1990s.

In a hazard-based model, the incident time is a realization of a continuous random variable $T$ with cumulative distribution function $F(t)$, also called the failure function. The probability density function $f(t)$, survival function $S(t)$, and hazard function $h(t)$ are given in (2)–(4). The relationships between these four functions are formulated in (1)–(4), where $P(\cdot)$ denotes probability:

$$F(t) = \int_0^t f(u)\,du = P(T < t), \tag{1}$$

$$f(t) = \frac{dF(t)}{dt} = \lim_{\Delta t \to 0} \frac{P(t \le T < t + \Delta t)}{\Delta t}, \tag{2}$$

$$S(t) = P(T \ge t) = 1 - F(t), \tag{3}$$

$$h(t) = \frac{f(t)}{1 - F(t)} = \frac{f(t)}{S(t)} = \lim_{\Delta t \to 0} \frac{P(t \le T < t + \Delta t \mid T \ge t)}{\Delta t}. \tag{4}$$

With fully parametric models, three distributional alternatives, namely Gamma, Log-logistic, and Weibull, were considered for the hazard function in (4) and tested to find the best fit to the incident clearance time and arrival time. The functional form of the hazard function for each model can be derived by substituting the corresponding distribution into the general functions above.
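The relationships between F(t), S(t), and h(t) can be illustrated numerically. The sketch below is not part of the paper's method: it assumes an exponential density with a hypothetical rate of 0.05 per minute, integrates it with the trapezoidal rule, and recovers the constant hazard expected for that distribution. The function names are illustrative.

```python
import math

def cdf_from_pdf(pdf, t, steps=10_000):
    """F(t): integral of the density from 0 to t, by the trapezoidal rule."""
    width = t / steps
    total = 0.5 * (pdf(0.0) + pdf(t))
    for i in range(1, steps):
        total += pdf(i * width)
    return total * width

def survival(pdf, t):
    """S(t) = 1 - F(t)."""
    return 1.0 - cdf_from_pdf(pdf, t)

def hazard(pdf, t):
    """h(t) = f(t) / S(t)."""
    return pdf(t) / survival(pdf, t)

rate = 0.05  # hypothetical: on average one clearance every 20 minutes

def exp_pdf(t):
    return rate * math.exp(-rate * t)

for t in (5.0, 20.0, 60.0):
    print(t, round(survival(exp_pdf, t), 4), round(hazard(exp_pdf, t), 4))
```

For the exponential case the hazard stays at the chosen rate for every t, which is why distributions with more flexible hazards (Gamma, Weibull, Log-logistic) are considered for incident times.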

4.1. Gamma Distribution

The Gamma distribution is a two-parameter family of continuous probability distributions with scale parameter $\lambda$ and shape parameter $\tilde{k}$, where $\tilde{k} > 0$ and $\lambda > 0$. The Gamma function is mathematically defined as [26]

$$\Gamma(\tilde{k}) = \int_0^{\infty} t^{\tilde{k}-1} e^{-t}\,dt. \tag{5}$$

After an algebraic transformation, the p.d.f. (probability density function) of the Gamma distribution, generally written as $f[t; T \sim \Gamma(\lambda, \tilde{k})]$, is given by

$$f[t; T \sim \Gamma(\lambda, \tilde{k})] = \frac{\lambda(\lambda t)^{\tilde{k}-1} e^{-\lambda t}}{\Gamma(\tilde{k})}, \quad t > 0. \tag{6}$$

When $\tilde{k} = 1$, the Gamma density function reduces to the exponential density function, so the exponential distribution is a special case of the Gamma distribution.

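When $\lambda = 1$, (6) reduces to the one-parameter Gamma distribution, also referred to as the standard Gamma distribution of $T$, written as

$$f[t; T \sim \Gamma(\tilde{k})] = \frac{t^{\tilde{k}-1} e^{-t}}{\Gamma(\tilde{k})}, \quad t > 0, \tag{7}$$

with its c.d.f. (cumulative distribution function) defined as

$$F[t; T \sim \Gamma(\tilde{k})] = \int_0^t \frac{u^{\tilde{k}-1} e^{-u}}{\Gamma(\tilde{k})}\,du, \quad t > 0. \tag{8}$$

The survival and hazard functions for the Gamma distribution are based on (8), which is called the incomplete Gamma function. The survival function $S(t)$ is given by

$$S[t; T \sim \Gamma(\tilde{k})] = 1 - F[t; T \sim \Gamma(\tilde{k})] = 1 - \int_0^t \frac{u^{\tilde{k}-1} e^{-u}}{\Gamma(\tilde{k})}\,du. \tag{9}$$

The Gamma distribution hazard function can be expressed as

$$h[t; T \sim \Gamma(\tilde{k})] = \frac{f[t; T \sim \Gamma(\tilde{k})]}{S[t; T \sim \Gamma(\tilde{k})]} = \frac{t^{\tilde{k}-1} e^{-t}}{\Gamma(\tilde{k})\{1 - F[t; T \sim \Gamma(\tilde{k})]\}}. \tag{10}$$

For a general $\lambda$, the numerator of (10) is replaced by the density (6) and the c.d.f. in the denominator is evaluated at $\lambda t$.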
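As a rough illustration of how the Gamma survival and hazard functions can be evaluated, the following sketch computes the regularized incomplete Gamma function from its power series using only the Python standard library. The function names and the parameter values are assumptions made for the example; a statistics library would normally be used instead.

```python
import math

def reg_lower_gamma(k, x, terms=500):
    """Regularized lower incomplete Gamma function P(k, x) via its power series."""
    if x <= 0.0:
        return 0.0
    term = 1.0 / k
    total = term
    for n in range(1, terms):
        term *= x / (k + n)
        total += term
        if term < 1e-15 * total:
            break
    return total * math.exp(-x + k * math.log(x) - math.lgamma(k))

def gamma_pdf(t, lam, k):
    """Density lam * (lam*t)**(k-1) * exp(-lam*t) / Gamma(k), for t > 0."""
    return lam * math.exp((k - 1.0) * math.log(lam * t) - lam * t - math.lgamma(k))

def gamma_survival(t, lam, k):
    """S(t) = 1 - F(t), with F the incomplete Gamma function evaluated at lam*t."""
    return 1.0 - reg_lower_gamma(k, lam * t)

def gamma_hazard(t, lam, k):
    """h(t) = f(t) / S(t)."""
    return gamma_pdf(t, lam, k) / gamma_survival(t, lam, k)

lam, k = 0.05, 2.0  # hypothetical scale and shape, time in minutes
for t in (10.0, 40.0, 120.0):
    print(t, round(gamma_survival(t, lam, k), 4), round(gamma_hazard(t, lam, k), 4))
```

Two closed-form cases give a quick sanity check: with shape 1 the hazard equals λ (the exponential case), and with shape 2 and λ = 1 the hazard is t/(1 + t).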

4.2. Weibull Distribution

The Weibull distribution is perhaps the most widely applied parametric function in survival analysis because of its flexibility and simplicity among the families of parametric time distributions [26].

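The Weibull distribution of the event time $T$, a continuous function, is characterized by two parameters: a scale parameter $\lambda$ and a shape parameter $\tilde{p}$. The survival function of the Weibull distribution is given by

$$S(t; \lambda, \tilde{p}) = \exp[-(\lambda t)^{\tilde{p}}], \quad t > 0. \tag{11}$$

Given the close associations among the various lifetime measures, the hazard function of the Weibull distribution can be readily derived from the above equation:

$$h(t; \lambda, \tilde{p}) = \frac{-(d/dt)\,e^{-(\lambda t)^{\tilde{p}}}}{e^{-(\lambda t)^{\tilde{p}}}} = \frac{\lambda\tilde{p}(\lambda t)^{\tilde{p}-1}\exp[-(\lambda t)^{\tilde{p}}]}{\exp[-(\lambda t)^{\tilde{p}}]} = \lambda\tilde{p}(\lambda t)^{\tilde{p}-1}. \tag{12}$$

The cumulative hazard function $H(t)$ can be expressed in terms of $S(t)$:

$$H(t) = -\log S(t) = -\log\left\{\exp\left[-\int_0^t h(u)\,du\right]\right\} = \int_0^t h(u)\,du. \tag{13}$$

Therefore, the cumulative hazard function $H(t; \lambda, \tilde{p})$ can be written as

$$H(t; \lambda, \tilde{p}) = -\log S(t; \lambda, \tilde{p}) = -\log\{\exp[-(\lambda t)^{\tilde{p}}]\} = (\lambda t)^{\tilde{p}}. \tag{14}$$

Taking natural logarithms on both sides of (14) gives

$$\log[-\log S(t; \lambda, \tilde{p})] = \tilde{p}\log\lambda + \tilde{p}\log t. \tag{15}$$

The specifications of $S(t)$ and $h(t)$ lead to the following equation for the Weibull p.d.f.:

$$f(t; \lambda, \tilde{p}) = h(t)S(t) = \lambda\tilde{p}(\lambda t)^{\tilde{p}-1}\exp[-(\lambda t)^{\tilde{p}}]. \tag{16}$$

Likewise, the c.d.f. at time $t$ is

$$F(t; \lambda, \tilde{p}) = 1 - S(t; \lambda, \tilde{p}) = 1 - \exp[-(\lambda t)^{\tilde{p}}]. \tag{17}$$

Given $\lambda^* = 1/\lambda$, the Weibull hazard function can be reexpressed as

$$h(t; \lambda^*, \tilde{p}) = \frac{\tilde{p}}{\lambda^*}\left(\frac{t}{\lambda^*}\right)^{\tilde{p}-1}. \tag{18}$$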
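The Weibull formulas are simple enough to check directly. Since H(t) = (λt)^p, plotting log[−log S(t)] against log t gives a straight line with slope p and intercept p·log λ, which is a common graphical check of the Weibull assumption. The sketch below verifies this numerically; the parameter values are hypothetical and the function names are illustrative.

```python
import math

def weibull_survival(t, lam, p):
    """S(t) = exp[-(lam*t)**p]."""
    return math.exp(-((lam * t) ** p))

def weibull_hazard(t, lam, p):
    """h(t) = lam * p * (lam*t)**(p-1)."""
    return lam * p * (lam * t) ** (p - 1.0)

def weibull_pdf(t, lam, p):
    """f(t) = h(t) * S(t)."""
    return weibull_hazard(t, lam, p) * weibull_survival(t, lam, p)

lam, p = 1.0 / 70.0, 0.9  # hypothetical scale and shape, time in minutes
times = [10.0, 30.0, 90.0, 270.0]

# Points (log t, log[-log S(t)]) should fall on one straight line.
points = [(math.log(t), math.log(-math.log(weibull_survival(t, lam, p)))) for t in times]
slope = (points[-1][1] - points[0][1]) / (points[-1][0] - points[0][0])
intercept = points[0][1] - slope * points[0][0]

print(round(slope, 6), round(intercept, 6), round(p * math.log(lam), 6))
```

With a shape below 1, as in this example, the hazard decreases over time; a shape above 1 gives an increasing hazard.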

4.3. Log-Logistic Distribution

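The lognormal distribution is widely used to describe events whose rate increases initially and decreases consistently afterwards. The Log-logistic distribution can describe the same pattern (when its shape parameter exceeds 1) and, unlike the lognormal, has closed-form survival and hazard functions. The Log-logistic distribution of $T$ is the antilogarithm of the familiar logistic distribution. Let $Y = \log T$. The density function of $Y$ is defined as the familiar logistic distribution [26]:

$$f(y) = \frac{\hat{b}^{-1}\exp[(y - \tilde{\mu})/\hat{b}]}{\{1 + \exp[(y - \tilde{\mu})/\hat{b}]\}^2}, \quad y \in (-\infty, \infty), \tag{19}$$

where $\tilde{\mu}$ and $\hat{b}$ are parameters of the logistic function of $Y$, described as $Y \sim \mathrm{Logist}(\tilde{\mu}, \hat{b})$. Let $\lambda = \exp(\tilde{\mu})$ and $\hat{p} = \hat{b}^{-1}$. Transforming (19) back to the time scale gives the density function of $T$:

$$f(t) = \frac{(\hat{p}/\lambda)(t/\lambda)^{\hat{p}-1}}{[1 + (t/\lambda)^{\hat{p}}]^2}, \quad t > 0, \tag{20}$$

where $\lambda$ and $\hat{p}$ are parameters of the Log-logistic distribution, written as $T \sim \mathrm{LLogist}(\lambda, \hat{p})$. The c.d.f. of $T$ is then given as

$$F[t; T \sim \mathrm{LLogist}(\lambda, \hat{p})] = \frac{1}{1 + (t/\lambda)^{-\hat{p}}}. \tag{21}$$

Therefore, the survival and hazard rate functions of $T$ can be readily derived as follows:

$$\begin{aligned} S[t; T \sim \mathrm{LLogist}(\lambda, \hat{p})] &= 1 - F[t; T \sim \mathrm{LLogist}(\lambda, \hat{p})] = \frac{1}{1 + (t/\lambda)^{\hat{p}}}, \\ h[t; T \sim \mathrm{LLogist}(\lambda, \hat{p})] &= \frac{f[t; T \sim \mathrm{LLogist}(\lambda, \hat{p})]}{S[t; T \sim \mathrm{LLogist}(\lambda, \hat{p})]} = \frac{(\hat{p}/\lambda)(t/\lambda)^{\hat{p}-1}}{1 + (t/\lambda)^{\hat{p}}}. \end{aligned} \tag{22}$$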
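A short sketch of the Log-logistic functions shows the rise-then-fall hazard that motivates this distribution. The parameter values are hypothetical, and the peak location λ(p − 1)^(1/p) is obtained here by setting the derivative of the hazard to zero for p > 1; it is not a formula from the paper.

```python
import math

def llogis_pdf(t, lam, p):
    """f(t) = (p/lam) * (t/lam)**(p-1) / [1 + (t/lam)**p]**2."""
    return (p / lam) * (t / lam) ** (p - 1.0) / (1.0 + (t / lam) ** p) ** 2

def llogis_cdf(t, lam, p):
    """F(t) = 1 / [1 + (t/lam)**(-p)]."""
    return 1.0 / (1.0 + (t / lam) ** (-p))

def llogis_survival(t, lam, p):
    """S(t) = 1 / [1 + (t/lam)**p]."""
    return 1.0 / (1.0 + (t / lam) ** p)

def llogis_hazard(t, lam, p):
    """h(t) = f(t) / S(t) in closed form."""
    return (p / lam) * (t / lam) ** (p - 1.0) / (1.0 + (t / lam) ** p)

lam, p = 30.0, 2.0  # hypothetical: median of 30 minutes, shape above 1
peak = lam * (p - 1.0) ** (1.0 / p)  # time at which the hazard is largest (p > 1)

for t in (peak / 2.0, peak, 2.0 * peak):
    print(t, round(llogis_survival(t, lam, p), 4), round(llogis_hazard(t, lam, p), 4))
```

Because F(λ) = 1/2, the parameter λ is the median of the distribution, which makes it easy to interpret in minutes.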

5. Model Result

5.1. The Fitness of Distribution

Understanding incident characteristics and patterns is essential for establishing an appropriate prediction model; therefore, a statistical analysis is carried out first. A total of 4966 crash records, 15791 stationary vehicle records, and 3847 hazard records with clearance time are used in the distribution fitting analysis. Four probability density functions (Gamma, Log-logistic, Weibull, and lognormal) are fitted to the clearance time for crash, stationary vehicle, and hazard incidents, respectively (see Figures 3(a), 3(b), and 3(c)). Thick solid lines indicate the best-fitting distribution. The figures indicate that each incident classification has its own best-fitting distribution function. Four quantities for each fitted clearance time distribution (log likelihood, domain, mean, and variance) are listed in Table 2. The log likelihood and variance statistics indicate the goodness of fit of the distribution.
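The paper does not state which software produced the fits. As a minimal sketch of how a log likelihood of the kind reported for each candidate distribution can be obtained, the code below fits a lognormal (closed-form maximum likelihood) and a Weibull (bisection on the shape equation) to a small hypothetical sample. The sample values and all function names are assumptions for illustration, not SIMS data or the authors' procedure.

```python
import math

def lognormal_fit(times):
    """Closed-form lognormal MLE; returns (mu, sigma, log_likelihood)."""
    logs = [math.log(t) for t in times]
    n = len(logs)
    mu = sum(logs) / n
    sigma = math.sqrt(sum((y - mu) ** 2 for y in logs) / n)
    loglik = sum(
        -y - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)
        - (y - mu) ** 2 / (2.0 * sigma ** 2)
        for y in logs
    )
    return mu, sigma, loglik

def weibull_loglik(times, lam, p):
    """Log likelihood under S(t) = exp[-(lam*t)**p]."""
    return sum(
        math.log(lam * p) + (p - 1.0) * math.log(lam * t) - (lam * t) ** p
        for t in times
    )

def weibull_fit(times, lo=0.05, hi=20.0, iterations=100):
    """Weibull MLE: bisection on the profile score equation for the shape p."""
    logs = [math.log(t) for t in times]
    mean_log = sum(logs) / len(logs)

    def score(p):
        weights = [t ** p for t in times]
        weighted = sum(w * y for w, y in zip(weights, logs)) / sum(weights)
        return weighted - 1.0 / p - mean_log

    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if score(mid) > 0.0:
            hi = mid  # score increases with p, so the root lies below mid
        else:
            lo = mid
    p = 0.5 * (lo + hi)
    lam = (sum(t ** p for t in times) / len(times)) ** (-1.0 / p)
    return lam, p, weibull_loglik(times, lam, p)

# Hypothetical clearance times in minutes (illustrative only).
sample = [12.0, 18.0, 25.0, 31.0, 40.0, 44.0, 52.0, 67.0, 85.0, 120.0]

mu, sigma, ll_lognormal = lognormal_fit(sample)
lam, p, ll_weibull = weibull_fit(sample)
print(f"lognormal: mu={mu:.3f} sigma={sigma:.3f} loglik={ll_lognormal:.2f}")
print(f"Weibull:   lam={lam:.4f} p={p:.3f} loglik={ll_weibull:.2f}")
```

A higher (less negative) log likelihood indicates that the fitted distribution assigns more probability to the observed times.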

Table 2: Parameter estimates of the clearance time probability density distribution fits for each incident type.

| Incident type | Parameter | Gamma | Log-logistic | Weibull | Lognormal |
|---|---|---|---|---|---|
| Clearance time for crash | Log likelihood | −23032.7 | −23007.3 | −23128.9 | −23202.9 |
| | Domain | 0 < y < Inf | 0 < y < Inf | 0 < y < Inf | 0 < y < Inf |
| | Mean | 40.3562 | 45.586 | 40.5906 | 42.8821 |
| | Variance | 967.94 | 11380.3 | 1020.94 | 2143.83 |
| Clearance time for stationary vehicle | Log likelihood | −71707.57 | −70991.8 | −71724.7 | −71306.9 |
| | Domain | 0 < y < Inf | 0 < y < Inf | 0 < y < Inf | 0 < y < Inf |
| | Mean | 33.2276 | 39.749 | 33.189 | 35.3951 |
| | Variance | 1032.58 | 975.23 | 1142.8 | 2897.44 |
| Clearance time for hazard | Log likelihood | −20244.6 | −20437.6 | −20245.9 | −20527.5 |
| | Domain | 0 < y < Inf | 0 < y < Inf | 0 < y < Inf | 0 < y < Inf |
| | Mean | 68.5289 | 103.329 | 68.5441 | 83.078 |
| | Variance | 4382.43 | 4478.4 | 4357.8 | 22634.2 |

Figure 3: Probability density distribution of clearance time for crash (a), stationary vehicle (b), and hazard (c).

The smaller the variance, the better the distribution fits. For example, the Gamma distribution variance for crash clearance time is 967.94, the smallest among the four distributions. On this criterion, the Gamma distribution is the best fit for crash clearance time, Log-logistic for stationary vehicle, and Weibull for hazard.
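The selection rule can be checked against the values in Table 2. The sketch below transcribes the log likelihood and variance columns and reports, for each incident type, the distribution with the smallest variance and, for comparison, the one with the largest log likelihood. The helper name `best_by` is hypothetical.

```python
# Values transcribed from Table 2 (clearance time): (log likelihood, variance).
table2 = {
    "crash": {
        "Gamma": (-23032.7, 967.94),
        "Log-logistic": (-23007.3, 11380.3),
        "Weibull": (-23128.9, 1020.94),
        "Lognormal": (-23202.9, 2143.83),
    },
    "stationary vehicle": {
        "Gamma": (-71707.57, 1032.58),
        "Log-logistic": (-70991.8, 975.23),
        "Weibull": (-71724.7, 1142.8),
        "Lognormal": (-71306.9, 2897.44),
    },
    "hazard": {
        "Gamma": (-20244.6, 4382.43),
        "Log-logistic": (-20437.6, 4478.4),
        "Weibull": (-20245.9, 4357.8),
        "Lognormal": (-20527.5, 22634.2),
    },
}

def best_by(fits, index, pick):
    """Name of the distribution chosen by `pick` (min or max) on column `index`."""
    return pick(fits, key=lambda name: fits[name][index])

chosen = {}
for incident, fits in table2.items():
    chosen[incident] = (best_by(fits, 1, min), best_by(fits, 0, max))
    print(f"{incident}: min variance -> {chosen[incident][0]}, "
          f"max log likelihood -> {chosen[incident][1]}")
```

The minimum-variance rule reproduces the choices stated in the text (Gamma, Log-logistic, Weibull). The two criteria agree for stationary vehicles but not for crash (Log-logistic has the largest log likelihood) or hazard (Gamma is marginally larger than Weibull), so the reported selection rests on the variance criterion.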

A total of 4569 crash records, 14665 stationary vehicle records, and 3382 hazard records with arrival time are used in this distribution fitting. There are fewer arrival time records than clearance time records because SIMS contains a large number of invalid arrival time records; all invalid and defective data are filtered out. Figures 4(a), 4(b), and 4(c) show the probability density distributions of arrival time for crash, stationary vehicle, and hazard, respectively. Table 3 lists the parameter estimates of the arrival time probability density distribution for each incident type. Both the estimated parameters and the figures indicate that the Gamma distribution is the best fit for crash arrival time, Log-logistic for stationary vehicle, and Weibull for hazard.

Table 3: Parameter estimates of the arrival time probability density distribution fits for each incident type.

| Incident type | Parameter | Gamma | Log-logistic | Weibull | Lognormal |
|---|---|---|---|---|---|
| Arrival time for crash | Log likelihood | −18182.7 | −18211 | −18215.4 | −18207.7 |
| | Domain | 0 < y < Inf | 0 < y < Inf | 0 < y < Inf | 0 < y < Inf |
| | Mean | 19.9232 | 25.3389 | 19.979 | 21.4866 |
| | Variance | 323.564 | Inf | 343.475 | 885.809 |
| Arrival time for stationary vehicle | Log likelihood | −58947.1 | −58415.1 | −58769.5 | −58097.4 |
| | Domain | 0 < y < Inf | 0 < y < Inf | 0 < y < Inf | 1 < y < Inf |
| | Mean | 21.8141 | 32.3138 | 21.5822 | 23.2076 |
| | Variance | 559.408 | Inf | 628.02 | 1895.03 |
| Arrival time for hazard | Log likelihood | −17132.8 | −17269.8 | −17127.8 | −17250 |
| | Domain | 0 < y < Inf | 0 < y < Inf | 0 < y < Inf | 0 < y < Inf |
| | Mean | 58.4637 | 102.204 | 58.4066 | 70.5117 |
| | Variance | 3728.08 | Inf | 3914.51 | 20842.2 |

Figure 4: Probability density distribution of arrival time for crash (a), stationary vehicle (b), and hazard (c).

In summary, the clearance time and arrival time for the same incident classification follow the same probability distribution but with different estimated parameters. Different incident classifications have different probability distribution characteristics for both clearance time and arrival time. Based on the results above, the best-fitting distribution was selected for each incident type to develop the hazard-based prediction model.

5.2. Hazard Based Model for Crash Clearance Time and Arrival Time

The crash clearance time and arrival time models are developed based on the Gamma distribution survival model. Data filtering is carried out before model development in order to improve model precision. For example, clearance times longer than 3 hours, which account for less than 5% of the crash dataset, are filtered out, as shown by the green bars in Figure 5.

Figure 5: Distribution of crash clearance time.

Tables 4 and 5 list the parameter estimates of the models for crash clearance time and arrival time. A positive coefficient indicates that an increase in that property variable increases the incident time and decreases the hazard function.
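The sign rule above follows from the usual log-linear form of an AFT model. The notation below ($\beta_j$, $x_j$, $\sigma$, $\varepsilon$, $S_0$, $h_0$) is the standard textbook one and is assumed here, since the paper does not print its regression equation; $S_0$ and $h_0$ denote the baseline survival and hazard functions.

```latex
\log T = \beta_0 + \sum_j \beta_j x_j + \sigma\varepsilon
\quad\Longrightarrow\quad
\frac{T(x_j + 1)}{T(x_j)} = e^{\beta_j},
\qquad
S(t \mid x) = S_0\!\left(t\,e^{-\sum_j \beta_j x_j}\right),
\qquad
h(t \mid x) = e^{-\sum_j \beta_j x_j}\, h_0\!\left(t\,e^{-\sum_j \beta_j x_j}\right).
```

With $\beta_j > 0$, a one-unit increase in $x_j$ stretches the time scale by $e^{\beta_j} > 1$, so the probability that the incident is still ongoing at any time $t$ is higher. For example, the coefficient 0.2775 for "diversion required" in the clearance time model corresponds to a factor of $e^{0.2775} \approx 1.32$.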

Table 4: Estimation results of survival AFT model for clearance time for crash.

| Variable | Estimate coefficient | Wald Chi-square | Pr > ChiSq |
|---|---|---|---|
| Intercept | 4.7907 | 8.74 | 0.0031 |
| Priority | −0.1495 | 27.7239 | <0.0001 |
| Intersection | 0.1737 | 5.3656 | 0.0205 |
| Blockage type | 0.1878 | 13.7469 | 0.0081 |
| Weather | 0.1103 | 15.1529 | 0.0341 |
| Total opp lanes block | 0.0753 | 11.5657 | 0.0412 |
| Traffic disrupted | 0.0628 | 4.6412 | 0.0312 |
| Heavy traffic | 0.1356 | 10.4797 | 0.0012 |
| Major incident | 0.2158 | 17.7047 | <0.0001 |
| Diversion required | 0.2775 | 33.0936 | <0.0001 |
| Towing required | −0.1518 | 49.2159 | <0.0001 |
| Number of vehicles involved | 0.2862 | 70.2871 | <0.0001 |
| Number of people injured | 0.7411 | 26.1993 | 0.0101 |
| Number of fatalities | 1.0412 | 30.9592 | <0.0001 |
| Chemical spill | 0.3045 | 17.4566 | <0.0001 |
| Log likelihood | −5509.6196 | | |

Table 5: Estimation results of survival AFT model for arrival time for crash.

| Variable | Estimate coefficient | Wald Chi-square | Pr > ChiSq |
|---|---|---|---|
| Intercept | −0.5248 | 0.06 | 0.81 |
| Intersection | 0.246 | 3.6168 | 0.0472 |
| Lateral position | 0.2479 | 14.2799 | 0.0039 |
| Total lanes blocked | 0.3002 | 10.6649 | 0.0493 |
| Traffic disrupted | 0.0732 | 2.8418 | 0.0218 |
| Towing required | 0.2474 | 48.519 | <0.0001 |
| Number of vehicles involved | 0.617 | 57.2253 | <0.0001 |
| Number of people injured | 0.4512 | 18.3643 | 0.0351 |
| Number of fatalities | 0.5321 | 13.7241 | 0.0082 |
| Log likelihood | −6416.712 | | |

All variables are statistically significant at the 95% confidence level. Table 4 lists all 14 significant property variables for crash clearance time. However, there are only 8 significant variables for arrival time, noticeably fewer than for clearance time. For example, priority, blockage type, and weather have a significant effect on the clearance time but not on the arrival time, because the two incident times have different characteristics. Another reason is that less information is recorded in SIMS for arrival time than for clearance time.

5.3. Influence of Property Parameters on the Prediction Time

Table 6 presents the percentage change in clearance time and arrival time for crash and hazard incidents. A negative percentage indicates a decrease in the incident time with an increase in that property variable. A dash means that the variable has no significant effect on the incident time. For instance, when priority increases by one, the crash clearance time is 16.13% shorter and the hazard clearance time is 7.03% shorter, but there is no significant influence on arrival time. When the number of injured people increases by one, the crash clearance time is 109.83% longer and the crash arrival time is 57.02% longer, but there is no significant influence on hazard clearance time or arrival time.

Table 6: Percentage change in clearance time and arrival time for crash and hazard.

Variable Clearance time for crash Clearance time for hazard Arrival time for crash Arrival time for hazard
Priority −16.13% −7.03%
Intersection 18.97% 73.13% 27.89% 71.77%
Lateral position 25.54% 28.13%
Blockage type 20.65% 47.38% 106.49%
Weather 11.66% 11.17% 6.56%
Total lanes blocked 35.01%
Total opp lanes blocked 7.82% 33.99%
Traffic disrupted 6.48% 7.59%
Heavy traffic 14.52%
Major incident 24.08% 81.56%
Diversion required 31.98% 130.23%
Towing required −16.40% 28.07%
Number of vehicles involved 33.14% 85.34% 377.41%
Number of people injured 109.83% 57.02%
Number of fatalities 183.26% 70.25%
Chemical spill 35.59%
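The crash entries of Table 6 can be reproduced from the coefficients in Tables 4 and 5, assuming the usual AFT conversion 100·(exp(β) − 1) for a one-unit increase in a variable. The paper does not state its conversion formula, so this is a sketch under that assumption; the function name `percent_change` is hypothetical and only a subset of the coefficients is shown.

```python
import math

def percent_change(beta: float) -> float:
    """Percentage change in time for a one-unit increase: 100 * (exp(beta) - 1)."""
    return 100.0 * (math.exp(beta) - 1.0)

# Coefficients transcribed from Table 4 (crash clearance) and Table 5 (crash arrival).
clearance = {
    "Priority": -0.1495,
    "Intersection": 0.1737,
    "Towing required": -0.1518,
    "Number of people injured": 0.7411,
    "Number of fatalities": 1.0412,
}
arrival = {
    "Intersection": 0.246,
    "Towing required": 0.2474,
    "Number of people injured": 0.4512,
}

for label, coefficients in (("clearance", clearance), ("arrival", arrival)):
    for name, beta in coefficients.items():
        print(f"{label:9s} {name:26s} {percent_change(beta):8.2f}%")
```

For the positive coefficients the results agree with Table 6 to within rounding. For the two negative coefficients (priority and towing required) this conversion gives about −13.89% and −14.08%, whereas Table 6 lists −16.13% and −16.40%; those values coincide with −100·(exp(|β|) − 1). Which convention the authors intended is not stated.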

The evaluation of the prediction accuracy for crash clearance time is given in Table 7 as an example. There are 4966 crash clearance time records used to set up the hazard-based model, and 30% of them are used to evaluate the prediction accuracy. Table 7 reports the absolute difference between the predicted and the measured clearance time. For example, for 543 incidents the difference between the predicted and measured time is less than 10 minutes, which accounts for 39.68% of the evaluation data. The accuracy of the model is similar to that of Chung [23].
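The accuracy measure described above, the share of incidents whose absolute prediction error falls below a tolerance, can be sketched as follows. The observed and predicted values are hypothetical hold-out data chosen for illustration, not SIMS records, and the function name `share_within` is an assumption.

```python
def share_within(predicted, observed, tolerance_min):
    """Fraction of incidents whose absolute prediction error is below tolerance_min."""
    errors = [abs(p - o) for p, o in zip(predicted, observed)]
    return sum(e < tolerance_min for e in errors) / len(errors)

# Hypothetical hold-out data in minutes (illustrative only).
observed = [22, 35, 48, 15, 90, 61, 30, 41]
predicted = [28, 33, 70, 31, 52, 58, 44, 40]

for tolerance in (10, 15, 20, 30, 50):
    print(f"<{tolerance} min: {share_within(predicted, observed, tolerance):.1%}")
```

Because the tolerance bands are nested, the shares are cumulative and can only increase from one band to the next.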

Table 7: Summary of evaluation of the prediction accuracy.

| Performance measure | Value | Percentage |
|---|---|---|
| <10 min | 543 incidents | 39.7% |
| <15 min | 768 incidents | 56.1% |
| <20 min | 963 incidents | 70.4% |
| <30 min | 1158 incidents | 84.6% |
| <50 min | 1280 incidents | 93.5% |

6. Conclusions

Hazard-based prediction models for both incident clearance time and arrival time are developed. Three types of incident are investigated based on data collected from SIMS. Before model development, the best-fitting distribution was selected for each incident type based on the log likelihood and variance. The results show the following.

(1) Clearance time and arrival time follow the same distribution, with different estimated parameters, for each incident type.

(2) The Gamma, Log-logistic, and Weibull distributions are the best fit for crash, stationary vehicle, and hazard incidents, respectively. After data filtering, the hazard-based prediction model is developed for crash incidents as an example.

(3) There are 14 significant incident property variables for clearance time, while there are only 8 significant variables for arrival time. This clearly indicates that clearance time and arrival time have different impact factors.

(4) The percentage changes in clearance time and arrival time for crash and hazard incidents are given, so that the impact of each property variable on clearance time and arrival time is quantified. The model accuracy is given at the end of the paper.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This study has been substantially supported by the National Natural Science Foundation Council of China (no. 71201103). The first author finished the work during postdoctoral research at Tongji University, and the support of a Project of the Shanghai Shuguang Program (13SG23) is acknowledged. The first author thanks the Smart Transport Research Center at the Queensland University of Technology for its support in providing real traffic incident data.

References

[1] Chou C.-S., Understanding the impact of incidents and incident management programs on freeway mobility and safety [Dissertation], University of Maryland, College Park, Md, USA, 2010.
[2] Tsubota T., Kikuchi H., Uchiumi K., Warita H., and Kurauchi F., "Benefit of accident reduction considering the improvement of travel time reliability," International Journal of Intelligent Transportation Systems Research, vol. 9, no. 2, pp. 64-70, 2011. doi: 10.1007/s13177-011-0027-z
[3] Golob T. F., Recker W. W., and Leonard J. D., "An analysis of the severity and incident duration of truck-involved freeway accidents," Accident Analysis & Prevention, vol. 19, no. 5, pp. 375-395, 1987.
[4] Giuliano G., "Incident characteristics, frequency, and duration on a high volume urban freeway," Transportation Research A: General, vol. 23, no. 5, pp. 387-396, 1989.
[5] Garib A., Radwan A. E., and Deek H. A., "Estimating magnitude and duration of incident delays," Journal of Transportation Engineering, vol. 123, no. 6, pp. 459-466, 1997.
[6] Sullivan E. C., "New model for predicting freeway incidents and incident delays," Journal of Transportation Engineering, vol. 123, no. 4, pp. 267-275, 1997.
[7] Jones B., Janssen L., and Mannering F., "Analysis of the frequency and duration of freeway accidents in Seattle," Accident Analysis & Prevention, vol. 23, no. 4, pp. 239-255, 1991. doi: 10.1016/0001-4575(91)90003-N
[8] Nam D. and Mannering F., "An exploratory hazard-based analysis of highway incident duration," Transportation Research A: Policy and Practice, vol. 34, no. 2, pp. 85-102, 2000. doi: 10.1016/S0965-8564(98)00065-2
[9] Cohen S. and Nouveliere C., "Modelling incident duration on an Urban expressway," in Proceedings of the 8th International Federation of Automatic Control Symposium on Transportation Systems (IFAC '97), Chania, Greece, 1997.
[10] Ozbay K. and Kachroo P., Incident Management in Intelligent Transportation Systems, Artech House, Boston, Mass, USA, 1999.
[11] Kau V. H. L., Estimating Freeway Incident Clearance Duration Using Multiple Linear Regression, The University of Texas at Arlington, Arlington, Tex, USA, 2007.
[12] Khattak A. J., Schofer J. L., and Wang M. H., "A simple time sequential procedure for predicting freeway incident duration," Journal of Intelligent Transportation Systems, vol. 2, no. 2, pp. 113-138, 1995.
[13] Sethi V., Koppelman F. S., Flannery C. P., Bhandari N., and Schofer J. L., "Duration and travel time impacts of incidents-advance project," TRF-ID-202, Northwestern University, Evanston, Ill, USA, 1994.
[14] Ji Y. B. B., Zhang X. N., and Sun L. J., "Traffic incident duration prediction grounded on Bayesian decision method-based tree algorithm," Journal of Tongji University: Natural Science, vol. 36, no. 3, pp. 319-324, 2008.
[15] Stephen B., Fajardo D., and Waller S. T., "A naïve bayesian classifier for incident duration prediction," in Proceedings of the 86th TRB Annual Meeting Compendium of Papers, Washington, DC, USA, 2007.
[16] Lopes J., Bento J., Pereira F. C., and Akiva M. B., "Dynamic forecast of incident clearance time using adaptive artificial neural network models," in Proceedings of the 92nd TRB Annual Meeting Compendium of Papers, Washington, DC, USA, 2013.
[17] Wei C.-H. and Lee Y., "Sequential forecast of incident duration using Artificial Neural Network models," Accident Analysis & Prevention, vol. 39, no. 5, pp. 944-954, 2007. doi: 10.1016/j.aap.2006.12.017
[18] Gaetano V., Lelli M., and Cucina D., "A comparative study of models for the incident duration prediction," European Transport Research Review, vol. 2, no. 2, pp. 103-111, 2010. doi: 10.1007/s12544-010-0031-4
[19] Collett D., Modelling Survival Data in Medical Research, 2nd edition, Chapman & Hall/CRC, Boca Raton, Fla, USA, 2003.
[20] Stathopoulos A. and Karlaftis M. G., "Modeling duration of urban traffic congestion," Journal of Transportation Engineering, vol. 128, no. 6, pp. 587-590, 2002. doi: 10.1061/(ASCE)0733-947X(2002)128:6(587)
[21] Bhat C. R. and Pinjar A. R., "Duration modeling," in Handbook of Transport Modeling, Hensher D. A. and Button K. J., Eds., 2nd edition, Elsevier, Amsterdam, The Netherlands, 2008.
[22] Anastasopoulos P. C., Haddock J. E., Karlaftis M. G., and Mannering F. L., "An analysis of urban travel times: a random parameters hazard-based approach," in Proceedings of the 91st TRB Annual Meeting Compendium of Papers, Washington, DC, USA, 2012.
[23] Chung Y., "Development of an accident duration prediction model on the Korean Freeway Systems," Accident Analysis & Prevention, vol. 42, no. 1, pp. 282-289, 2010. doi: 10.1016/j.aap.2009.08.005
[24] Tavassoli H. A., Ferreira L., Washington S., and Charles P., "Hazard based models for freeway traffic incident duration," Accident Analysis & Prevention, vol. 52, pp. 171-181, 2013. doi: 10.1016/j.aap.2012.12.037
[25] Hensher D. A. and Mannering F. L., "Hazard-based duration models and their application to transport analysis," Transport Reviews, vol. 14, no. 1, pp. 63-82, 1994.
[26] Xian L., Survival Analysis: Models and Applications, Higher Education Press, Beijing, China, 2012.