A Data-Driven Approach to Estimate Incident-Induced Delays Using Incomplete Probe Vehicle Data: Application to Safety Service Patrol Program Evaluation

This paper presents a data-driven approach to estimate incident-induced delays (IIDs) using probe vehicle data while accounting for missing data. The proposed approach is applied to evaluate the effectiveness of a safety service patrol (SSP) program. Existing data-driven methods for IID estimation usually rely on complete data sets. The proposed approach employs a random forest-based classification model and an interpolation method to estimate IIDs when real-time data are completely or partially missing during the incident-impacted time period. It also identifies reference profiles from the closest spatial-temporal road segments to improve data availability. The case study shows that the SSP program in the Quad Cities area of Iowa reduces IIDs associated with various incidents by 15%–91%. This data-driven evaluation framework can be applied to other traffic incident management programs, allowing more accurate and objective evaluations of their effectiveness.


Introduction
Incidents, such as collisions and stalled vehicles, can significantly affect traffic flow and cause substantial travel delays, especially during peak hours and in urban areas with high traffic volumes. These disruptions are stochastic in nature, as they occur infrequently and are not part of the expected traffic flow patterns. The delays caused by incidents can lead to secondary crashes, additional fuel consumption, and increased air pollution. Therefore, rapid incident detection, response, and cleanup are crucial to mitigate the negative impact of incidents. Consequently, transportation agencies have implemented various traffic incident management programs to better manage road incidents [1,2].
One of the key performance measures to evaluate the effectiveness of traffic incident management programs is the reduction in incident-induced delay (IID). IID is the additional travel time caused by incidents and is a means of quantifying the impact of incidents on traffic flow. Therefore, accurately quantifying IIDs is critical to evaluating traffic incident management programs. Various methods have been proposed in the literature to estimate IID, including deterministic queueing theory [3–6], simulation [7,8], and statistical analysis [1,9–14]. The deterministic queueing method estimates the IID by assuming constant traffic demand and reduced capacity caused by the incident. This method is widely used because of its simplicity and minimal data requirements. However, the deterministic queueing model does not account for traffic dynamics [3–6]. Simulation-based methods use macroscopic simulation [7] or microscopic simulation [8] runs with and without the incident to estimate IID. Developing and calibrating simulation models is usually costly and time consuming [9]. To address the shortcomings of these methods, data-driven approaches using travel-time data have been proposed [1,9–14]. Data-driven IID calculation methods use traffic data from various sources to estimate the impact of incidents on traffic flow. For example, Habtemichael et al. calculated the IID based on travel time differences between an incident profile and a reference profile representing the normal traffic condition [1]. The core of this method is to identify the reference profiles for each incident. When field data are available and accurate, this method provides reliable IID estimates. Furthermore, Park et al. quantified the impact of nonrecurring congestion using probe vehicle data. Their approach captures the dynamics of traffic evolution and can detect the incident-impacted area accurately using real-time speed data collected from probe vehicles [14]. These data-driven approaches provide a more accurate and objective assessment of IID, which can help transportation agencies better understand the impact of incidents and make informed decisions to minimize delays. In addition, the IID calculation can be used to assess the performance of traffic incident management systems and identify areas for improvement. However, existing data-driven approaches rely on real-time traffic data, which can be partially or entirely missing at the time of an incident, making it impossible for them to determine the IID.
Consequently, this paper proposes a data-driven approach to estimate incident-induced delays using incomplete probe vehicle data. The proposed method avoids the following weaknesses identified in the literature. First, the typical IID calculation is based on a reference profile from the same location on a different day, which requires continuous field data collection during the incident duration and the reference period. However, because data are frequently missing, an accurate IID calculation is problematic for some incidents. Therefore, this study proposes a methodology to find a reference profile from the road segments that are the most proximate to the incident in spatial-temporal range but not affected by it. The method mitigates the impact of missing data by finding a reference profile from road segments with similar traffic patterns. Second, this study accounts for both short and long periods of missing data. In particular, simple interpolation is used to fill in missing data for a short period. For incidents with a long period of missing data, an IID occurrence classification model is developed using incident information and applied to incidents without real-time speed data. The proposed data-driven IID estimation method is applied to evaluate the benefits of implementing a safety service patrol (SSP) program in the Quad Cities area of Iowa. The new evaluation framework addresses the deficiencies in existing evaluation methods and provides a reliable IID estimation. Existing studies focused on finding the reference profile and calculating IIDs but usually ignored missing data issues. Therefore, this study proposes an integrated IID estimation procedure that accounts for missing probe vehicle data. The data-driven IID savings estimation approach is applicable to evaluating other incident management programs, such as removal laws and dispatch collection.

Data Description
In this study, three types of data are used for the calculation, estimation, and evaluation of the IID of the SSP program, namely, speed data from probe vehicles, estimated traffic volume from annual average daily traffic (AADT), and incident data from advanced traffic management system (ATMS) event logs. Other data sources can be used for the analysis as long as speed (or travel time), volume, and incident information are collected.

Speed Data.
The speed data used in this study were collected by INRIX, a real-time traffic information platform that provides traffic speed and travel time data. INRIX is a crowd-sourced traffic data set that uses connected vehicles and smartphones to collect real-time traffic data. In addition, INRIX provides historical speed data derived from multiple sources, including GPS probes and physical sensors. The GPS probe vehicles include trucks, taxis, buses, and passenger cars equipped with onboard GPS devices and transmitting capability. The data set includes the travel time and average speed on each road segment at a data collection frequency of one minute, as well as one of the following confidence scores: 10 (historical), 20 (combination of real-time and historical), and 30 (real-time). In this study, only real-time speed data (i.e., data with a confidence score of 30) are used to calculate IID because, when an incident occurs, the traffic condition is likely to differ from the normal condition represented in the historical data. However, there were cases where real-time speed data were missing due to communication failures or because no probe vehicle was traveling through the incident location at that time [15].

Traffic Volume.
Traffic volume is needed to estimate the number of vehicles impacted by the incident. To provide an accurate IID calculation, volume data collected on the impacted roadway segments during the incident are preferred. However, due to the limited coverage of roadway sensors and the random occurrence of incidents in the road network, continuous traffic counts were not available at many incident locations. Therefore, in this study, the adjusted hourly traffic volume based on AADT is used. AADT is obtained from the Iowa DOT roadway asset management system (RAMS). The adjusted traffic volume applies factors based on the month, day of week, and time of day, as shown in the following equation [16]:

$$\text{modified traffic flow (vph)} = \text{AADT} \times \text{monthly factor} \times \text{hourly factor}. \qquad (1)$$
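As an illustration of equation (1), the following minimal Python sketch computes an adjusted hourly volume. The function name and the numeric inputs are hypothetical and are not taken from the paper or from Iowa DOT factor tables.

```python
def adjusted_hourly_volume(aadt: float, monthly_factor: float, hourly_factor: float) -> float:
    """Equation (1): modified traffic flow (vph) = AADT x monthly factor x hourly factor."""
    if aadt < 0 or monthly_factor < 0 or hourly_factor < 0:
        raise ValueError("AADT and adjustment factors must be non-negative")
    return aadt * monthly_factor * hourly_factor


# Hypothetical inputs for illustration only (not values from the study).
volume_vph = adjusted_hourly_volume(aadt=40_000, monthly_factor=1.05, hourly_factor=0.07)
print(f"Adjusted hourly volume: {volume_vph:.0f} vph")
```

In practice, the factors would be looked up by month, day of week, and hour for the roadway class of the incident segment.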

Methodology
As illustrated in Figure 1, the IID estimation framework consists of four modules, namely, data preprocessing, incident classification, IID calculation, and IID estimation. The data preprocessing module includes data cleaning and the spatial and temporal alignment of speed, traffic volume, and incident data. The incident classification module classifies incidents based on the availability of real-time speed data. The IID calculation module computes incident-induced delays from the travel time difference between normal traffic conditions and incident conditions, provided that real-time speed data are available or can be interpolated. Lastly, the IID occurrence classification model uses the random forest method to estimate the occurrence of IID in cases where real-time speed data are missing.

Spatial and Temporal Alignments.
The INRIX speed data and the RAMS traffic volume data use different segmentation systems. INRIX utilizes the extreme definition (XD) segmentation system, which includes functional road classes (FRCs) 1 (i.e., highways and major intersections) to 3 (i.e., major roads), and usually breaks at intersections and interchanges [17]. The RAMS segmentation system comprises FRCs 1 to 4 (i.e., neighborhood streets) and has different breakpoints from the XD segments. RAMS collects traffic, roadway geometrics, pavement condition, and business data associated with public roads in the state [18]. In addition, incident locations are recorded by coordinates, road name, and direction. Therefore, to calculate the IID, incident information, speed, and traffic volume data were linked using a geographic information system (GIS). Since the traffic impact of an incident can propagate upstream and downstream of the incident location, speed data from five upstream segments, one downstream segment, and the incident segment were included in the analysis. The spatial range was determined based on the most severe incident within the scope of this study. The upstream roadway segments within 3 miles (4.8 km) and the downstream segments within 1 mile (1.6 km) of the incident segment were taken as the maximum range affected by an incident. Furthermore, to account for the latency in the reported incident time and to monitor traffic conditions before and after an incident, the data collection period starts 30 minutes before the incident reported time and ends 30 minutes after the incident cleared time.
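The spatial and temporal ranges described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' GIS workflow: the function names are hypothetical, segments are assumed to be listed in the direction of travel (so upstream segments have lower indices), and the timestamps are those of the I-280 crash used as the example in this section.

```python
from datetime import datetime, timedelta

BUFFER = timedelta(minutes=30)  # 30 min before the reported time and after the cleared time


def analysis_window(reported: datetime, cleared: datetime) -> tuple[datetime, datetime]:
    """Return the data collection period for one incident."""
    if cleared < reported:
        raise ValueError("cleared time precedes reported time")
    return reported - BUFFER, cleared + BUFFER


def minute_bins(start: datetime, end: datetime) -> list[datetime]:
    """One-minute time steps covering [start, end], matching the INRIX frequency."""
    n_minutes = int((end - start).total_seconds() // 60)
    return [start + timedelta(minutes=k) for k in range(n_minutes + 1)]


def segments_in_range(ordered_segments: list, incident_index: int,
                      n_upstream: int = 5, n_downstream: int = 1) -> list:
    """Incident segment plus up to five upstream and one downstream segment.

    Assumes ordered_segments follows the direction of travel.
    """
    lo = max(0, incident_index - n_upstream)
    hi = min(len(ordered_segments), incident_index + n_downstream + 1)
    return ordered_segments[lo:hi]


start, end = analysis_window(datetime(2019, 12, 11, 8, 40), datetime(2019, 12, 11, 12, 40))
bins = minute_bins(start, end)
selected = segments_in_range(list("ABCDEFGHIJ"), incident_index=6)  # hypothetical segment IDs
```

Each (segment, minute) pair produced this way corresponds to one cell of the time-space diagram.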
In addition, some data cleaning was conducted to prepare the data set for subsequent analysis. For example, cases in which the incident clearance time exceeds one day (i.e., 24 hours) were excluded from the analysis. These incidents are mainly stalled vehicles left at the roadside for a long period of time, which usually have minimal impact on delay. Speeds based on historical data or partial real-time data are excluded, as only real-time speed data can reflect the impact of an incident on traffic conditions. As a result, a combined data set is created that includes speed, volume, and incident information for analysis. The spatially and temporally aligned data set provides a basis for quantifying incident-induced delay. Figure 2 shows the traffic impact of an incident in a time-space diagram. The spatial unit is one roadway segment. The temporal unit is one minute (i.e., the INRIX data collection frequency). The incident was a two-vehicle crash that occurred near Exit 4 on I-280 (MM 10) in the Quad Cities around 8:40 am on December 11, 2019. The incident was cleared at 12:40 pm. Therefore, the incident clearance time was 240 minutes, including a two-lane blockage for about 2 minutes. The incident impacted five upstream roadway segments (approximately 3 miles). It took 255 minutes for the traffic to recover. This example also shows that some real-time speed data are missing.

Incident Classification Based on the Availability of Real-Time Data.
Based on the availability of real-time speed data, incidents are classified into three categories: (1) speed data are available throughout the period and on all highway segments, (2) speed data are missing for a short period, and (3) speed data are missing completely or for a long period. In this study, about 63% of the incidents have real-time speed data available for the entire duration. A short period of missing speed data is defined as a case where real-time speed data are missing in one or more segments for less than 15 minutes. For a short period of missing data, the speeds were filled using a moving average interpolation method. The interpolation method calculates the average of the speeds in the previous two time intervals and the next two available time intervals on the same roadway segment. The 15-minute threshold was determined on the basis of the accuracy of the speed estimation. When the missing data span less than 15 minutes, the mean absolute percentage error (MAPE) of the interpolation method is within 10%, as shown in Figure 3. For cases with long periods of missing data (i.e., exceeding 15 minutes), the average IID calculated from similar cases is used. In the incident data set, 15.5% of the cases have missing speed data for a short period and 21.5% have missing speed data for a long period.
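The three availability categories and the moving average fill can be sketched as follows. This is a minimal sketch under stated assumptions rather than the authors' implementation: missing minutes are represented as `None`, only observed (not previously filled) speeds are averaged, a gap of exactly 15 minutes is treated as long, and the function names and sample speeds are hypothetical.

```python
from typing import Optional

SHORT_GAP_LIMIT = 15  # minutes; gaps shorter than this are interpolated


def longest_gap(speeds: list[Optional[float]]) -> int:
    """Length (in minutes) of the longest run of missing values on one segment."""
    longest = run = 0
    for s in speeds:
        run = run + 1 if s is None else 0
        longest = max(longest, run)
    return longest


def classify_availability(speeds: list[Optional[float]]) -> str:
    """Return 'complete', 'short_gap', or 'long_gap' for one segment's speed series."""
    gap = longest_gap(speeds)
    if gap == 0:
        return "complete"
    return "short_gap" if gap < SHORT_GAP_LIMIT else "long_gap"


def fill_short_gaps(speeds: list[Optional[float]]) -> list[Optional[float]]:
    """Fill each missing minute with the mean of the two nearest observed speeds
    before it and the two nearest observed speeds after it (fewer if unavailable)."""
    observed = [(i, s) for i, s in enumerate(speeds) if s is not None]
    filled = list(speeds)
    for i, s in enumerate(speeds):
        if s is not None:
            continue
        before = [v for j, v in observed if j < i][-2:]
        after = [v for j, v in observed if j > i][:2]
        neighbours = before + after
        if neighbours:
            filled[i] = sum(neighbours) / len(neighbours)
    return filled


series = [60.0, 58.0, None, None, 50.0, 52.0]  # hypothetical one-minute speeds (mph)
category = classify_availability(series)
filled_series = fill_short_gaps(series)
```

An incident-level category could then be taken as the worst category across its segments, which is also an assumption of this sketch.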

Calculation of Incident-Induced Delay.
The IID calculation module determines the delay caused by an incident based on real-time or partially interpolated speed data. The delay determination threshold was set at 80% of the normal speed, which is defined as the 85th percentile speed. The procedure for determining the threshold was derived from the Federal Highway Administration (FHWA) method of calculating congested hours [19].
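The threshold rule can be illustrated with a short Python sketch. The nearest-rank percentile, the choice of reference sample, and all speed values are assumptions made for illustration; the paper does not specify how the 85th percentile is computed.

```python
def percentile_nearest_rank(values: list[float], pct: int) -> float:
    """Nearest-rank percentile (pct is an integer between 1 and 100)."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("no speed observations")
    rank = max(1, -(-pct * len(ordered) // 100))  # integer ceiling of pct * n / 100
    return ordered[rank - 1]


def delay_threshold(reference_speeds: list[float]) -> float:
    """80% of the normal speed, where normal speed is the 85th percentile speed."""
    return 0.8 * percentile_nearest_rank(reference_speeds, 85)


# Hypothetical reference speeds (mph) for one segment.
reference = [52, 55, 58, 60, 61, 62, 63, 64, 65, 66]
threshold = delay_threshold(reference)

# Hypothetical cell speeds during an incident; True marks a delayed cell.
cell_speeds = [64, 51, 40, 53]
delayed = [speed < threshold for speed in cell_speeds]
```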
If a delay has occurred due to an incident, the IID of the incident is calculated using equation (2). The travel time of one cell is calculated based on the length of the road segment and the speed. The average speed under normal traffic conditions is then used to determine the normal travel time for each segment and time interval. The normal traffic condition is found from the road segments that are the most proximate to the incident in spatial-temporal range but outside the incident-impacted range. The proposed IID calculation process accounts for nonrecurrent delays, as it can detect segments with slower travel speeds compared to the normal speed of road segments with recurring delays during peak hours. The difference between the travel time affected by the incident and the average travel time under normal traffic conditions is considered the IID per vehicle. Finally, the total delay (in vehicle-hours) for the different vehicle classes is calculated by applying the AADT-based traffic volume.
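The calculation described in this paragraph can be sketched as follows. This is one plausible reading of the text, not a reproduction of the paper's equation (2): the `Cell` structure, the per-interval vehicle count (hourly volume scaled to a one-minute interval), and the numeric values are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Cell:
    """One (segment, one-minute interval) cell of the time-space diagram."""
    length_mi: float         # segment length (miles)
    speed_mph: float         # observed or interpolated speed
    normal_speed_mph: float  # speed under normal traffic conditions
    volume_vph: float        # adjusted hourly volume for one vehicle class
    delayed: bool            # result of the threshold test


def cell_delay_veh_h(cell: Cell, interval_min: float = 1.0) -> float:
    """Delay contributed by one cell, in vehicle-hours."""
    if not cell.delayed:
        return 0.0
    if cell.speed_mph <= 0 or cell.normal_speed_mph <= 0:
        raise ValueError("speeds must be positive")
    travel_time_h = cell.length_mi / cell.speed_mph
    normal_time_h = cell.length_mi / cell.normal_speed_mph
    per_vehicle_h = max(0.0, travel_time_h - normal_time_h)
    vehicles = cell.volume_vph * interval_min / 60.0  # vehicles passing in the interval
    return per_vehicle_h * vehicles


def incident_delay_veh_h(cells: list[Cell]) -> float:
    """Sum of cell delays over the incident-impacted range."""
    return sum(cell_delay_veh_h(c) for c in cells)


# Hypothetical cells: one delayed, one at normal speed.
cells = [
    Cell(length_mi=0.5, speed_mph=30.0, normal_speed_mph=60.0, volume_vph=1200.0, delayed=True),
    Cell(length_mi=0.5, speed_mph=58.0, normal_speed_mph=60.0, volume_vph=1200.0, delayed=False),
]
total_delay = incident_delay_veh_h(cells)
```

Running the function once with passenger car volumes and once with truck volumes would give class-specific delays, assuming the class split of the volume is known.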
An example of determining whether IID occurs within an incident impact range is shown in Figure 4. Figure 4(a) shows the speed profile around the location and time of the accident. In this case, the accident affected one downstream segment and up to five upstream segments. Figure 4(b) shows the occurrence of delays on the speed profile obtained by applying the proposed IID determination method. Based on the normal-condition speed threshold, each cell is classified as normal or delayed. The normal speed profile in the analysis area is calculated using the speeds in the normal-condition cells at each time interval. Figure 4(c) shows both the normal speed profile and the incident-impacted speed profile. The results confirmed that the proposed method can distinguish the spatiotemporal range of delays caused by the accident. The average travel time calculated in the surrounding area, defined as normal conditions, can effectively reflect the traffic characteristics of the corresponding time of day. During peak hours, this approach can calculate the additional delay caused by incidents by comparing the incident-affected travel time with the normal travel time observed during peak hours.

[Equation (2): IID for passenger cars (PC) and trucks]
where i is the segment number and LSN is the last delayed segment number.

An artificial neural network (ANN) is a machine learning algorithm inspired by the structure and function of the human brain. ANN models are made up of layers of interconnected nodes that perform mathematical operations on input data to generate output predictions. ANNs have been widely used in various fields, including anomaly detection [20], incident detection [21], and incident duration prediction [22]. The advantages of ANNs include their ability to handle nonlinear relationships in data, their flexibility in modeling complex systems, and their ability to learn from large data sets. However, ANNs can be computationally expensive and require a large amount of data to train [20–22]. Support vector classification (SVC) is a supervised learning algorithm that is commonly used in classification tasks. SVC finds a hyperplane that optimally separates the different classes of data points in a high-dimensional space. SVC has been used in applications such as classification of crash severity [23,24] and detection of transport modes [25]. The strengths of SVC include its ability to handle high-dimensional data, its effectiveness in dealing with nonlinearly separable data, and its relatively low computational cost. However, SVC can be sensitive to the choice of kernel function and hyperparameters [26].
Naive Bayes (NB) is a probabilistic classifier based on Bayes' theorem. By assuming conditional independence among the features, NB calculates the probability of belonging to each class given the input features. NB has been used in applications such as text categorization [27], traffic risk management [28], and incident detection [29]. The advantages of NB include simplicity, the ability to handle high-dimensional data, and fast training and prediction. However, NB can be sensitive to the assumption of independence among features, which may not hold in some data sets [20,28].
K-nearest neighbors (KNN) is an instance-based learning algorithm commonly used for classification and regression. KNN finds the k-nearest neighbors of a new data point in a feature space and then assigns the data point to the class that is the most common among those neighbors. KNN has been used in applications such as vehicle classification [30], incident classification [31], and anomaly detection [32]. The advantages of KNN include simplicity, the ability to handle nonlinear relationships in data, and the ability to deal with noisy data. However, KNN can be sensitive to the choice of distance metric and the number of classes [32].
Lastly, random forest (RF) is an ensemble model built from many decision trees, each of which separates the data based on specific features. Under majority voting, the class predicted most often by the individual decision trees becomes the final prediction of the ensemble. The advantages of RF are threefold. First, RF consists of multiple decision trees, which can inherently manage missing values without requiring extensive preprocessing. Second, each decision tree in the forest is trained independently on a random subset of the data. This parallelization leads to a reduced training time, especially when dealing with large data sets. Third, RF can reduce the risk of overfitting by averaging the output of multiple decision trees [33,34].
Four metrics, namely, precision, recall, F1 score, and accuracy, were compared across the classification models described above. Precision is the proportion of cases that the model classifies as positive that are actually positive. Recall is the proportion of actually positive cases that the model classifies as positive. Precision and recall are complementary, and higher values of both indicate a better model. The F1 score is the harmonic mean of precision and recall [35]. In addition, the false alarm rate (FAR), the detection rate (DR), and the overall accuracy of the model (classification rate, CR) are used to assess model performance. FAR is the proportion of false positive cases among the cases without delays. DR is the proportion of IID cases that are correctly detected, which is the same as the recall for the delay class. Lastly, CR is the proportion of correctly classified cases out of the total number of cases evaluated for IID occurrence and is also used for model selection [36].
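The metrics defined above follow directly from the four confusion-matrix counts, with "delay" as the positive class. The sketch below uses the standard definitions; the counts are hypothetical and do not come from Table 1.

```python
def _ratio(numerator: float, denominator: float) -> float:
    """Safe division that returns 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def classification_metrics(tp: int, fp: int, fn: int, tn: int) -> dict[str, float]:
    """Precision, recall (DR), F1, false alarm rate (FAR), and classification rate (CR)."""
    precision = _ratio(tp, tp + fp)
    recall = _ratio(tp, tp + fn)  # identical to the detection rate (DR)
    return {
        "precision": precision,
        "recall": recall,
        "f1": _ratio(2 * precision * recall, precision + recall),
        "far": _ratio(fp, fp + tn),  # false positives among no-delay cases
        "dr": recall,
        "cr": _ratio(tp + tn, tp + fp + fn + tn),
    }


# Hypothetical confusion-matrix counts for illustration.
metrics = classification_metrics(tp=60, fp=20, fn=40, tn=80)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```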
For incidents with adequate real-time speed data, the IID calculation method is applied, as discussed in Section 3.3. Each incident is classified as "delay" or "no delay." A total of 5,217 incidents are included in the data set, of which 2,025 (38.8%) cause additional delay and 3,192 (61.2%) cause no delay. The data set is divided into training and testing sets, with 3,901 (75%) and 1,316 (25%) incidents, respectively; both sets have the same proportion of delay and no-delay cases. The training set is used to train each classification model using supervised learning, with the aim of maximizing the accuracy of each model by comparing various classification factors. The number of hidden layers and the learning rate are adjusted, with a maximum of 1,000 iterations, to find the optimized ANN model. For the SVC model, four kernels (linear, polynomial, sigmoid, and radial basis function) are considered. For the KNN model, the optimal number of neighbors is sought, and for RF, the number of estimators that achieves the best performance is determined through iterations. Table 1 compares the performance metrics of the different classification models, and Figure 5 shows the classification performance of each model. On this data set, the RF classifier had the highest CR of 0.758, indicating a relatively high proportion of correctly classified instances. Furthermore, RF had a relatively low FAR of 0.165 and a moderate DR of 0.640, indicating that it minimized the number of false positive predictions while maintaining a reasonable proportion of true positive predictions. However, the ANN and NB classifiers showed high DRs of 0.706 and 0.640, respectively, but also high FARs of 0.338 and 0.426, indicating that they may have been too sensitive to positive instances and produced too many false positive predictions. The SVC and KNN classifiers showed the opposite trend, with low FARs but also low DRs.
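A split that keeps the same share of delay and no-delay cases in both subsets can be sketched with the standard library alone. This is a generic stratified split, not the authors' procedure; the function name, the seed, and the toy label counts are assumptions.

```python
import random


def stratified_split(labels: list[str], test_fraction: float = 0.25,
                     seed: int = 0) -> tuple[list[int], list[int]]:
    """Return (train_indices, test_indices) with class shares preserved in each subset."""
    rng = random.Random(seed)
    by_class: dict[str, list[int]] = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)

    train, test = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test.extend(indices[:n_test])
        train.extend(indices[n_test:])
    return sorted(train), sorted(test)


# Toy labels (hypothetical): 40 delay and 60 no-delay incidents.
labels = ["delay"] * 40 + ["no_delay"] * 60
train_idx, test_idx = stratified_split(labels)
test_delay_share = sum(labels[i] == "delay" for i in test_idx) / len(test_idx)
```

With the toy labels, both subsets contain 40% delay cases, mirroring the equal-proportion requirement stated in the text.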
[Figure legend: (1) true negative (TN), (2) false positive (FP), (3) false negative (FN), and (4) true positive (TP).]

Journal of Advanced Transportation

Two key elements of a random forest classification model are the selection of the classification features and the number of estimators. First, the impurity-based importance of each explanatory variable in the incident data set was calculated, and features with an importance of 0.01 or greater were used in the classification model. This measure, called Gini importance, prioritizes the features that most affect the performance of the classifier. The method of calculating importance is described by Menze et al. [34]. Ten characteristics were selected and are presented in Table 2. The importance of traffic volume and incident clearance hour was found to be greater than that of the other features. Second, setting an appropriate number of estimators helps increase the accuracy of the RF classification model. Therefore, a sensitivity analysis was performed with respect to the number of estimators. Finally, the IID occurrence classification model is applied to incidents without real-time speed data to determine whether a delay occurred. If a delay occurs, the average IID associated with the same type of incident is used.
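The idea behind impurity-based feature selection can be sketched as follows. The code shows the Gini impurity decrease of a single split and the 0.01 importance cutoff; it is not a full random forest, and the feature names and importance values are hypothetical rather than the entries of Table 2.

```python
def gini_impurity(labels: list) -> float:
    """Gini impurity: 1 minus the sum of squared class shares."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts: dict = {}
    for label in labels:
        counts[label] = counts.get(label, 0) + 1
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def impurity_decrease(parent: list, left: list, right: list) -> float:
    """Weighted reduction in Gini impurity achieved by splitting parent into left and right."""
    n = len(parent)
    weighted_children = (len(left) / n) * gini_impurity(left) + (len(right) / n) * gini_impurity(right)
    return gini_impurity(parent) - weighted_children


def select_features(importances: dict[str, float], threshold: float = 0.01) -> list[str]:
    """Keep features whose importance is at least the threshold, most important first."""
    ranked = sorted(importances.items(), key=lambda item: item[1], reverse=True)
    return [name for name, value in ranked if value >= threshold]


# A perfectly separating split on toy labels (1 = delay, 0 = no delay).
decrease = impurity_decrease([1, 1, 0, 0], left=[1, 1], right=[0, 0])

# Hypothetical importance values for illustration.
kept = select_features({"traffic_volume": 0.30, "clearance_hour": 0.25, "lane_count": 0.005})
```

In a random forest, the importance of a feature is obtained by accumulating such decreases over all splits that use the feature and averaging across trees.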

Case Study: Evaluation of the Safety Service Patrol Program
The proposed IID estimation approach is applied to assess the benefits of deploying a safety service patrol program in the Quad Cities area, Iowa. SSP programs have been implemented in many states to reduce incident clearance times and mitigate the impact of incidents on highways [37,38]. This area is classified as municipal interstates in Iowa, so the expansion factors and the hourly distribution of daily traffic for municipal interstates were used to calculate the adjusted AADT in each segment [16].

Safety Service Patrol Program.
In Iowa, the Iowa Department of Transportation operates a safety service patrol program, called Highway Helper, in several metropolitan areas. Through the program, Highway Helper trucks patrol roads, assist vehicles involved in accidents or in inoperable condition, and remove debris [39]. The SSP program was introduced to the Quad Cities area in September 2019. The roads covered by the program include I-74, I-80, I-280, and US 61 (see Figure 6), with patrol services provided from 5 am to 9 pm on weekdays. To assess the benefits of the program and determine the delay savings, the data set is divided into two subsets, i.e., before and after the SSP began operation. The period before the SSP program is from 01/01/2019 to 09/08/2019. Since the operation of the SSP program was impacted by COVID-19 from March 16, 2020, the after period is set from 09/09/2019 to 03/15/2020. The benefit of the SSP program is evaluated based on savings in IID.

IID Comparison.
The average IIDs for each type of event were calculated and are summarized in Table 3. IIDs are averaged over the entire incident data set, before and after the SSP program was deployed. Note that the after-SSP period extends only to March 15, 2020, to exclude the impact of COVID-19. In general, the IID occurrence rate of vehicle crash-related incidents was higher than that of debris or stalled vehicles. In particular, the average IID for one-vehicle crashes was 54.83 veh-h, about five times the average IID for stalled vehicles (11.15 veh-h). Among all types of incidents, the IID occurrence rate for debris was the lowest at 19.2%, and its average IID was also the lowest at 2.25 veh-h. Based on the before-and-after comparison, the SSP program significantly reduces both the IID occurrence rate and the average IID. The number of incidents collected during the pre-SSP period was 589, of which 103 (17%) had insufficient real-time speed data. During the after-SSP period, a total of 653 incidents were collected, of which 164 (25%) had insufficient speed data. In addition, two secondary crashes were detected in the data set. Through the missing-speed data processing approach, 267 incidents (21% of the total) can be incorporated into the quantification of program savings. When comparing crash-type events before and after the SSP program, the average IIDs after the program were lower than those before it by between 21.9% (2 vehicle crashes) and 91.1% (earlier crashes). It was also confirmed that the difference in the delay occurrence rates of incidents was statistically significant. Furthermore, although the average decrease in IIDs for the stalled vehicle type in the program application period was only 14.8%, the probability of the occurrence of IIDs was at a level of 60% before the program application period.
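The missing-data shares quoted above can be checked with a few lines of arithmetic. The counts are taken from the text; the variable names are illustrative only.

```python
# Incident counts reported in the text.
before_total, before_missing = 589, 103
after_total, after_missing = 653, 164

share_before = before_missing / before_total  # about 17%
share_after = after_missing / after_total     # about 25%

recovered = before_missing + after_missing    # incidents recovered by the missing-data approach
share_all = recovered / (before_total + after_total)  # about 21%

print(f"before: {share_before:.1%}, after: {share_after:.1%}, overall: {share_all:.1%}")
```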

SSP Benefit Evaluation.
To estimate the benefit of the SSP program, incidents are classified into shoulder blockage and lane(s) blockage. Among the 653 incidents that occurred during the "after SSP" period, there are 60 cases of lane(s) blockage and 593 cases of shoulder blockage. For all vehicle and blockage types, the program produced delay savings, and the greatest delay savings were obtained for passenger cars in shoulder blockage incidents. The delay savings calculated for the period after the introduction of the program were converted into annual delay savings of 20,306 veh-h for passenger cars and 5,962 veh-h for trucks (Table 4).

Conclusion
This paper presents a data-driven approach to estimate incident-induced delays using probe vehicle data while accounting for missing data. When speed data are missing for a period of less than 15 minutes, they can be reliably interpolated on the basis of the moving average. In the case of a long period of missing data, a classification model is developed to estimate the occurrence of a delay, and the average IID for each type of incident is used when a delay is expected to occur. The proposed IID estimation method is applied to evaluate the benefits of a safety service patrol program deployed in the Quad Cities area in Iowa. The results of the analysis showed that when the SSP helped in the incident clearance process, the average IIDs were lower than those of the same types of incidents before the introduction of the SSP. This shows that the SSP is an effective method for traffic incident management. The significance of this study can be summarized as follows. First, the proposed IID estimation approach takes advantage of real-time traffic data while mitigating the impact of missing data. Through this data-driven approach, it is possible to measure the performance of the SSP program with greater accuracy. Second, the IID was calculated based on speed data collected in segments around an incident location and in nearby time intervals that are not affected by the incident, which can reduce the likelihood of missing speed data when finding a reference profile. Third, the proposed IID estimation approach can be applied to evaluate the performance of other traffic incident management programs in addition to SSP.
However, the present study has limitations. First, this study did not include weather information, which can have a significant effect on the occurrence and impact of incidents. If weather information is included, the accuracy of the IID classification model may improve. Second, this study used the average IID according to incident characteristics after determining the occurrence of the IID with the classification model. In future research, an IID estimation model could be developed from a larger data set, which would enable a more accurate program benefit analysis.

Data Availability
The data used are provided by the third-party data provider INRIX.

Conflicts of Interest
The authors declare that they have no conflicts of interest.