In Beijing, Shanghai, Hangzhou, and other Chinese cities, traffic incidents account for 50% to 75% of total congestion on expressways. An accurate and timely automatic traffic incident detection algorithm is therefore of great significance for keeping expressways operating efficiently and improving road safety. Many effective automatic incident detection algorithms have been proposed, but they usually take the raw traffic flow parameters as input variables and ignore both the construction of feature variable sets and the screening of important feature variables. This paper presents an automatic incident detection algorithm based on a deep cycle extreme learning machine. The traffic flow, speed, and occupancy of the downstream urban expressway are extracted as input values of the deep-loop neural network. The initial connection weights and output thresholds of the deep-loop neural network are optimized by a global search with an improved particle swarm optimization (PSO) algorithm, which raises the classification accuracy and improves the generalization performance of the extreme learning machine. In addition, the extreme learning machine is used as a learning unit for layer-by-layer unsupervised learning. Finally, microwave detector data from the Tangqiao viaduct in Hangzhou are used for experimental verification, and the algorithm is compared with LSTM, CNN, gradient boosting decision tree, SVM, BPNN, and other methods. The results show that the algorithm transfers low-level features layer by layer to form a more complete feature representation that retains more of the original input information, which saves computing resources and reduces model complexity. Moreover, the detection accuracy is high: the detection rate is higher than 98% and the false alarm rate is lower than 3%, which is better than LSTM, CNN, gradient boosting decision tree, and other algorithms. The algorithm is suitable for urban expressway traffic incident detection.

China’s road traffic situation is severe. With the rapid development of urban road traffic, car ownership and the number of motor vehicle drivers have grown rapidly, and both the number and the rate of road traffic incidents have remained high for many years. More than 50% of urban road traffic congestion in Shanghai, China, is caused by expressway traffic incidents. Congestion caused by incidents reduces the capacity and operational efficiency of expressways and, in serious cases, causes further accidents that threaten lives and property. It is therefore necessary to study detection algorithms for expressway traffic incidents so that incidents can be detected and handled in time, incident management can be improved, and the impact of incidents on urban expressway traffic can be reduced.

Many experts and scholars have studied traffic incident detection, and some of their results have been applied in practice. In [

The extreme learning machine (ELM) is a learning algorithm for single-hidden-layer feedforward neural networks. ELM selects the input weights at random, leaves them unadjusted, and computes the output weights through the Moore-Penrose generalized inverse. ELM trains quickly, generalizes well, and avoids falling into local minima. In order to reduce the time consumed in determining the number of hidden neurons and randomly assigning weights between the extreme learning machines proposed by Huang et al. [

Deep learning, based on deep neural networks, is a recent research direction in the field of machine learning. It can mine the distribution characteristics of large-scale traffic data in depth, and applying it to traffic incident detection can greatly improve detection accuracy. However, as the number of network layers increases, training efficiency drops sharply and the probability of the model falling into a local optimum rises. Regarding the difficulty of training, in [

To solve the above problems, Huang Guangbin, the founder of extreme learning machine, proposed a deep multilayer extreme learning machine algorithm. The multilayer neural network structure enables extracting high-level abstract information from data. At the same time, it can effectively solve the problems of high data dimension, difficult sample labeling, difficult feature construction, and difficult training in the era of big data. The literature in [

The document in [

The acquisition of traffic flow data and traffic incident data is of great significance to the study of traffic incident detection. The traffic flow data used in this paper come mainly from microwave detectors on a viaduct section of the Hangzhou urban expressway, collected by the Hangzhou urban expressway monitoring center over 5 months (from June 11, 2015, to November 11, 2015). Among them, the sampling [

Some data used for traffic incident detection.

Time | Upstream flow (pcu/min) | Upstream speed (km/h) | Upstream occupancy (%) | Downstream flow (pcu/min) | Downstream speed (km/h) | Downstream occupancy (%) | Label |
---|---|---|---|---|---|---|---|
10:03 | 40 | 68.43 | 24.51 | 42 | 67.65 | 23.7 | 0 |
10:04 | 44 | 69.54 | 19.8 | 48 | 72.03 | 20.6 | 0 |
10:05 | 38 | 76.75 | 23.6 | 41 | 74.12 | 21.2 | 0 |
10:06 | 25 | 24.12 | 75 | 56 | 68.24 | 12.6 | 1 |
10:07 | 22 | 16.45 | 72.4 | 45 | 73.45 | 11.3 | 1 |
… | … | … | … | … | … | … | … |

The basis of traffic incident detection is the disturbance of normal traffic flow caused by traffic incident. Therefore, before constructing the incident detection algorithm, we must first analyze the traffic flow characteristics in the event state to determine the characteristic parameters of the model. Based on the theory of vehicle flow fluctuation, this paper analyzes the impact of traffic events on the [

To analyze the traffic flow characteristics in the event state, 50 of the 107 groups of traffic event data are randomly selected for cross validation. This makes it possible to analyze the impact of traffic events on traffic flow characteristics while eliminating the interference of time of day, detector location, event type and location, the downstream signal of the off-ramp, and other factors.

Cross validation on a large amount of detector data and event data shows that a traffic event directly changes the traffic flow parameters (such as flow, speed, density, and occupancy) at the event point and at the upstream and downstream sections. The significant change of traffic flow parameters during the event is therefore the fundamental basis for designing an automatic traffic event detection algorithm. The change trends of the upstream and downstream traffic flow parameters at the event location are shown in Figures

Change in traffic flow parameters for upstream detectors.

Change of traffic flow parameters for downstream detector.

In Figure

In Figure

In this paper, a complete set of initial variables is constructed from the measured, predicted, and combined values of the traffic flow parameters in the upstream and downstream areas. The set consists of seven parts: (1) the basic traffic parameters measured by the upstream detector; (2) the basic traffic parameters measured by the downstream detector; (3) the combination ratios of the measured traffic parameters of the upstream detector; (4) the combination ratios of the measured traffic parameters of the downstream detector; (5) the ratios of the measured to the predicted traffic flow parameters of the upstream detector; (6) the ratios of the measured to the predicted traffic flow parameters of the downstream detector; (7) the ratios of the measured traffic flow parameters of adjacent detectors. The initial variable set is shown in Table

Initial variables set of traffic incident detection.

Variable serial number | Initial variable |
---|---|
1 | Measured flow rate of upstream detector |
2 | Measured speed of upstream detector |
3 | Measured occupancy of upstream detector |
4 | Measured flow rate of downstream detector |
5 | Measured speed of downstream detector |
6 | Measured occupancy of downstream detector |
7 | Ratio of occupancy to flow measured at the same time by the same detector |
8 | Ratio of occupancy to speed measured at the same time by the same detector |
9 | Ratio of flow to speed measured at the same time by the same detector |
10 | Ratio of measured flow to predicted flow of upstream detector |
11 | Ratio of measured speed to predicted speed of upstream detector |
12 | Ratio of measured occupancy to predicted occupancy of upstream detector |
13 | Ratio of measured flow to predicted flow of downstream detector |
14 | Ratio of measured speed to predicted speed of downstream detector |
15 | Ratio of measured occupancy to predicted occupancy of downstream detector |
16 | Flow ratio of adjacent upstream and downstream detectors at the same time |
17 | Speed ratio of adjacent upstream and downstream detectors at the same time |
18 | Occupancy ratio of adjacent upstream and downstream detectors at the same time |
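As a rough illustration of how the 18 variables in the table could be assembled for one time step, the sketch below builds them from measured and predicted (flow, speed, occupancy) triples. The function name `build_initial_variables`, the choice of the upstream detector for variables 7 to 9, and the predicted values are assumptions made for this example; the prediction model itself is not shown.

```python
def build_initial_variables(up, down, up_pred, down_pred):
    """Return the 18 initial variables in the order of the table.

    Each argument is a (flow, speed, occupancy) tuple. `up_pred` and
    `down_pred` are hypothetical predicted values; how they are obtained
    is outside the scope of this sketch. Zero values would need guarding
    before the divisions below.
    """
    q_u, v_u, o_u = up
    q_d, v_d, o_d = down
    measured = [q_u, v_u, o_u, q_d, v_d, o_d]               # variables 1-6
    # Variables 7-9: assumed here to come from the upstream detector.
    combined = [o_u / q_u, o_u / v_u, q_u / v_u]
    up_ratio = [m / p for m, p in zip(up, up_pred)]         # variables 10-12
    down_ratio = [m / p for m, p in zip(down, down_pred)]   # variables 13-15
    adjacent = [u / d for u, d in zip(up, down)]            # variables 16-18
    return measured + combined + up_ratio + down_ratio + adjacent


# Measured values: the 10:06 record of the sample data.
# Predicted values: made up for illustration only.
features = build_initial_variables(
    up=(25, 24.12, 75.0),
    down=(56, 68.24, 12.6),
    up_pred=(40, 70.0, 23.0),
    down_pred=(44, 71.0, 21.0),
)
print(len(features))
```

Ratios are used so that a sudden drop in upstream speed or a jump in upstream occupancy shows up as a large relative change, independent of the absolute traffic level.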

There are 18 initial variables in Table

Bootstrap random sampling and random node splitting are used to draw new sample sets from the training set and build the decision tree models. Each Bootstrap sample leaves out about 36.8% of the training samples; these are the out-of-bag (OOB) data. Using the OOB data as a test set to evaluate the predictive performance of the random forest (RF) is called OOB estimation. The OOB estimate is unbiased when the number of trees in the RF is large enough.
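The 36.8% figure follows from sampling with replacement: a given sample is missed by all n draws with probability (1 - 1/n)^n, which tends to 1/e ≈ 0.368. The short simulation below checks this empirically; the function name `oob_fraction` and the sample sizes are illustrative choices, not values from the paper.

```python
import random


def oob_fraction(n_samples, n_trees, seed=0):
    """Average share of samples left out of a bootstrap sample."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_trees):
        # Draw n_samples indices with replacement; the set keeps the distinct ones.
        in_bag = {rng.randrange(n_samples) for _ in range(n_samples)}
        total += (n_samples - len(in_bag)) / n_samples
    return total / n_trees


frac = oob_fraction(n_samples=1000, n_trees=200)
print(round(frac, 3))  # close to 1/e, i.e. about 0.368
```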

For random forests that have been generated, we assume that the total number of OOB samples is

Feature importance measurement is a key capability of RF and can be used as a feature selection tool for high-dimensional data. Mean Decrease in Accuracy (MDA) is an important index for measuring feature importance. Suppose the Bootstrap sample is
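The idea behind MDA can be sketched in a model-agnostic way: permute one feature column, measure how much the accuracy drops, and average over several permutations. In a random forest this is done per tree on its OOB samples; the simplified version below instead applies it to a single stand-in classifier on toy data. The names `accuracy`, `mean_decrease_accuracy`, and `threshold_model`, and all data values, are assumptions made for illustration.

```python
import random


def accuracy(predict, X, y):
    """Share of rows for which the classifier output matches the label."""
    return sum(predict(row) == label for row, label in zip(X, y)) / len(y)


def mean_decrease_accuracy(predict, X, y, n_repeats=20, seed=0):
    """Average accuracy drop when each feature column is permuted."""
    rng = random.Random(seed)
    baseline = accuracy(predict, X, y)
    importances = []
    for j in range(len(X[0])):
        drop = 0.0
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)
            # Rebuild the data with only column j permuted.
            X_perm = [row[:j] + [value] + row[j + 1:]
                      for row, value in zip(X, column)]
            drop += baseline - accuracy(predict, X_perm, y)
        importances.append(drop / n_repeats)
    return importances


def threshold_model(row):
    """Stand-in classifier: flags an event when speed falls below 40 km/h."""
    return 1 if row[0] < 40 else 0


# Toy data: [speed in km/h, irrelevant noise feature]; label 1 = event.
X = [[68.4, 0.3], [69.5, 0.9], [76.8, 0.1], [72.0, 0.7], [70.1, 0.5],
     [24.1, 0.8], [16.5, 0.2], [20.3, 0.6], [18.9, 0.4], [22.7, 0.0]]
y = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
imp = mean_decrease_accuracy(threshold_model, X, y)
print(imp)
```

A feature the classifier never uses gets an importance of zero, while permuting an informative feature lowers the accuracy, which is the basis for ranking variables.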

The cyclic convolution neural network consists of an input layer, cyclic convolution layers, and an output layer. It extracts classification features layer by layer through the cyclic convolution layers and sampling layers, and the last layer is a softmax nonlinear classifier. The weights and biases of the network are trained by the Newton algorithm, with the cross-entropy cost function as the objective function, iteratively searching for the minimum of that objective. Feature extraction is thus part of the whole classifier design, so it is advantageous to use the features extracted by the cyclic convolution neural network as the input of the extreme learning machine. The extreme learning machine is a single-hidden-layer neural network whose input-layer weights and bias values are randomly generated during initialization. Then, the weight biases from the hidden layer to the output [

In Figure

The cyclic convolution neural network is composed of an input layer, cyclic convolution layers, pooling layers, and an output layer. The quasi-Newton method is used to train the network to extract features of the image target.

Cyclic convolution neural network is used to extract the features of image objects as the input layer of the extreme learning machine to calculate the parameters of the extreme learning machine.

Use extreme learning machine to classify image objects.

Deep cycle extreme learning machine training.

The combined algorithm of the convolution neural network and the extreme learning machine is as follows: the inputs are the training sample data and the class label information of the target, and the output is the trained model of the cyclic convolution neural network.

Because the initial goal of extreme learning machine is to solve single-hidden-layer feedforward neural network, assuming that a hidden single-layer neural network has

In formula (

Then the objective function of the minimum learning output error of the single-hidden-layer neural network can be expressed as

Therefore, the objective function of the extreme learning machine can be expressed as
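A minimal sketch of ELM training for a single-hidden-layer network is given below, under stated assumptions: sigmoid activations, input weights and biases drawn uniformly from [-1, 1], and a small ridge term so that the output weights can be obtained by solving the linear system (HᵀH + λI)β = Hᵀt rather than forming an explicit Moore-Penrose inverse. The function names (`solve`, `hidden_output`, `train_elm`, `predict_elm`), the hyperparameters, and the toy data are illustrative and are not taken from the paper.

```python
import math
import random


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def hidden_output(X, W, bias):
    """Hidden-layer output matrix H with sigmoid activations."""
    return [[1.0 / (1.0 + math.exp(-(sum(w * x for w, x in zip(W[j], row)) + bias[j])))
             for j in range(len(W))] for row in X]


def train_elm(X, t, n_hidden=20, ridge=1e-6, seed=0):
    """Random hidden layer, then output weights from regularized least squares."""
    rng = random.Random(seed)
    d = len(X[0])
    # Input weights and biases are random and never adjusted (core ELM idea).
    W = [[rng.uniform(-1, 1) for _ in range(d)] for _ in range(n_hidden)]
    bias = [rng.uniform(-1, 1) for _ in range(n_hidden)]
    H = hidden_output(X, W, bias)
    # Normal equations (H^T H + ridge * I) beta = H^T t; ridge > 0 is an
    # assumption that keeps the system well posed.
    A = [[sum(h[i] * h[j] for h in H) + (ridge if i == j else 0.0)
          for j in range(n_hidden)] for i in range(n_hidden)]
    rhs = [sum(h[i] * ti for h, ti in zip(H, t)) for i in range(n_hidden)]
    beta = solve(A, rhs)
    return W, bias, beta


def predict_elm(model, X):
    """Network output: hidden-layer outputs weighted by beta."""
    W, bias, beta = model
    return [sum(b * h for b, h in zip(beta, row))
            for row in hidden_output(X, W, bias)]


# Toy data: upstream speed and occupancy of the sample records, divided by 100.
X = [[0.6843, 0.2451], [0.6954, 0.198], [0.7675, 0.236],
     [0.2412, 0.75], [0.1645, 0.724]]
t = [0, 0, 0, 1, 1]
model = train_elm(X, t)
scores = predict_elm(model, X)
labels = [1 if s >= 0.5 else 0 for s in scores]
print(labels)
```

Only the output weights are learned, and they come from a single linear solve, which is why ELM training is fast compared with iterative gradient-based training.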

During training, Newton’s method makes slow progress near saddle points and easily falls into a local optimum, which makes the model difficult to train. Newton’s method must also compute the Hessian matrix of the objective function, and it cannot guarantee that this Hessian is always positive definite.

Computing the second-order partial derivatives of the objective function is expensive, and the resulting matrix is too large to store easily. Moreover, when the Hessian is not positive definite, the search direction is not always a descent direction, and Newton’s method fails. To solve this problem, an improved BFGS algorithm based on the quasi-Newton method is adopted.
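For reference, the textbook Newton and BFGS updates can be written as follows. This is the standard form and not necessarily the improved variant adopted here; the symbols $\theta_k$ (model parameters), $J$ (objective function), $\alpha_k$ (step size), and $B_k$ (approximation of the Hessian) are notation assumed for this sketch.

$$
\begin{aligned}
\text{Newton step:}\quad & \theta_{k+1} = \theta_k - \left[\nabla^2 J(\theta_k)\right]^{-1} \nabla J(\theta_k), \\
\text{quasi-Newton step:}\quad & \theta_{k+1} = \theta_k - \alpha_k B_k^{-1} \nabla J(\theta_k), \\
\text{secant condition:}\quad & B_{k+1} s_k = y_k, \qquad s_k = \theta_{k+1} - \theta_k, \quad y_k = \nabla J(\theta_{k+1}) - \nabla J(\theta_k), \\
\text{BFGS update:}\quad & B_{k+1} = B_k - \frac{B_k s_k s_k^{\top} B_k}{s_k^{\top} B_k s_k} + \frac{y_k y_k^{\top}}{y_k^{\top} s_k}.
\end{aligned}
$$

If $B_k$ is positive definite and $y_k^{\top} s_k > 0$, then $B_{k+1}$ is also positive definite, so the quasi-Newton direction remains a descent direction without the exact Hessian ever being computed or stored.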

Newton’s algorithm updates the parameters of the deep-loop neural network model in the optimization process as follows:

In order to overcome the shortcomings of Newton’s method, quasi-Newton equation is adopted, and the initial value of second derivative is replaced by

In order to better optimize the nonconvex objective function, the improved Newton algorithm is adopted as follows:

Generally speaking, the weights

To verify the performance of the proposed expressway traffic event detection algorithm, experiments were carried out in the environment shown in Table

Experimental environment configuration.

Item | Configuration |
---|---|
Server | Dell PowerEdge T640 |
Processor | Intel® Xeon® Bronze 3106 CPU @ 1.70 GHz |
Memory | 128 GB |
Video card | 4 |
Operating system | Ubuntu 16.04 |
Development environment | Python 3.6, TensorFlow, Keras |

The database includes 107 traffic events (10294 samples in total). Data from 55 randomly selected traffic events are used for training, and data from the remaining 52 events are used for testing. Because the amount of nonevent data is very large, nonevent samples are randomly selected to build the training set and the test set. To retain as much nonevent information as possible, the proportion of event samples in both the training set and the test set is set to 20%. The composition of the training set and test set is shown in Table

Composition of training set and test set sample.

Data set | Total sample size | Event sample number | Event number |
---|---|---|---|
Training set | 26190 | 5240 | 55 |
Test set | 22978 | 4595 | 52 |

In practice, traffic event samples are far fewer than nonevent samples, so the two classes are unbalanced. Traffic incident detection can therefore be regarded as a binary classification problem on unbalanced data. The synthetic minority oversampling technique (SMOTE) is a commonly used oversampling technique. SMOTE generates new samples that do not exist in the original sample set, which to a certain extent avoids overfitting of the classification algorithm. The standard SMOTE is used to balance the traffic incident detection samples. The specific steps are as follows:

For each sample in the event sample set

The sampling rate

According to formula (

In the equation, rand(0, 1) represents a random number in the interval (0, 1).

Merge the newly generated event sample with the original sample set to obtain a relatively balanced training sample set.
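The steps above can be sketched as follows, assuming the standard SMOTE interpolation: each new sample lies on the line segment between an event sample and one of its k nearest event-class neighbours. The function name `smote`, the parameter `n_per_sample` (new samples generated per original sample), and the toy event data are assumptions made for illustration; only the first two rows come from the sample data shown earlier.

```python
import random


def smote(minority, n_per_sample, k=5, seed=0):
    """Generate synthetic minority samples by neighbour interpolation."""
    rng = random.Random(seed)
    synthetic = []
    for i, x in enumerate(minority):
        # k nearest neighbours of x among the other minority samples
        # (squared Euclidean distance gives the same ordering).
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(x, p)))
        neighbours = others[:k]
        for _ in range(n_per_sample):
            nb = rng.choice(neighbours)
            gap = rng.random()  # plays the role of rand(0, 1)
            synthetic.append([a + gap * (b - a) for a, b in zip(x, nb)])
    return synthetic


# Toy event samples: [upstream speed (km/h), upstream occupancy (%)].
events = [[24.12, 75.0], [16.45, 72.4], [20.3, 70.1],
          [18.9, 78.2], [22.7, 69.5], [15.2, 74.8]]
new_events = smote(events, n_per_sample=2, k=3)
balanced_events = events + new_events  # merge with the original event samples
print(len(balanced_events))
```

Because every synthetic point is a convex combination of two real event samples, it stays inside the region already occupied by the event class instead of duplicating existing records.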

To make the two types of samples relatively balanced, SMOTE is used to increase the number of traffic event samples in the training set. The specific parameters of SMOTE are set as follows: the number of adjacent sample points is 5, and the oversampling rate is 30,000. The number of samples in the balanced training set is 20360. To eliminate the effects of different dimensions and to improve training speed and classification performance, the data are normalized to the interval [0, 1]. The normalization formula is

In the formula,
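Assuming the normalization is the usual min-max scaling, which is the standard way to map a variable onto [0, 1], each value x is replaced by (x - x_min) / (x_max - x_min), where x_min and x_max are the minimum and maximum of that variable. The helper name `min_max_normalize` and the handling of a constant column are choices made for this sketch.

```python
def min_max_normalize(column):
    """Scale a list of values linearly onto the interval [0, 1]."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # A constant column carries no information; map it to 0 (assumption).
        return [0.0 for _ in column]
    return [(x - lo) / (hi - lo) for x in column]


# Upstream speeds (km/h) of the sample records.
speeds = [68.43, 69.54, 76.75, 24.12, 16.45]
scaled = min_max_normalize(speeds)
print([round(s, 3) for s in scaled])
```

In practice the minimum and maximum would be computed on the training set and reused for the test set, so that both are scaled consistently.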

Three basic traffic flow parameters, namely, traffic flow, speed, and occupancy, are obtained from remote microwave detectors. The sampling interval of the data is 5 minutes. By analyzing the changing trend of the traffic flow parameters, 123 main road traffic incidents were screened out manually, of which 71 were on the east side and 52 were on the west side. The traffic incident data are grouped by the east main line, the west main line, and the whole road section, and three sample data sets are formed together with the corresponding normal-state data. Two-thirds of each data set are used as training samples, and the rest are used as test samples.

Random forest algorithm is used to measure the importance of initial variables, and then key variables that are more sensitive to traffic incidents are selected. Among them,

Importance of variable.

In order to reflect the role of key variables screening, we should not only select as few variables as possible but also ensure the correct rate of traffic incident detection. Through comparative analysis, the four [

Particle swarm optimization (PSO) is used to obtain the optimal parameters of the combined kernel function. For general problems, 20–50 particles are typical; for specific problems, 100–200 particles can be used. The larger the region of the search space that is covered, the easier it is to find the global optimal solution, but the longer the algorithm runs. For the event detection problem, 20 particles are enough to solve the problem while keeping the training of the algorithm efficient. The specific parameters of PSO are as follows: the number of particles is 20, the dimension of particles is 3, and the acceleration factor
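A minimal sketch of standard global-best PSO with 20 particles in 3 dimensions is shown below. It is not the improved variant used in the paper: the inertia weight `w`, the acceleration factors `c1` and `c2`, the search bounds, the iteration count, and the stand-in objective `sphere` are all assumptions made so that the example is self-contained.

```python
import random


def pso(objective, dim=3, n_particles=20, n_iter=50, w=0.7, c1=1.5, c2=1.5,
        bounds=(-5.0, 5.0), seed=0):
    """Minimize `objective` with a basic global-best particle swarm."""
    rng = random.Random(seed)
    lo, hi = bounds
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]
    history = [gbest_fit]  # best fitness per iteration (the fitness curve)
    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Inertia + pull towards personal best + pull towards global best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            fit = objective(pos[i])
            if fit < pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], fit
                if fit < gbest_fit:
                    gbest, gbest_fit = pos[i][:], fit
        history.append(gbest_fit)
    return gbest, gbest_fit, history


def sphere(p):
    """Stand-in objective with its minimum at (1, 1, 1)."""
    return sum((x - 1.0) ** 2 for x in p)


best, best_fit, history = pso(sphere)
print(best_fit)
```

In the detection setting, `objective` would be replaced by a function that trains the classifier with the candidate parameters and returns its validation error, so each fitness evaluation is far more expensive than in this toy example.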

Fitness curves of PSO optimization.

As can be seen from Figure

To better evaluate the detection performance of the deep cycle extreme learning machine (DELM) algorithm established in this paper, the long short-term memory (LSTM), deep belief network (DBN), convolutional neural network (CNN), and gradient boosting decision tree (GBDT) algorithms are selected for comparison.

The performance evaluation indexes of traffic incident detection algorithm include detection rate

In the above formula,

Generally speaking, the three indicators of
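Assuming the three indicators are the ones commonly used for automatic incident detection, namely detection rate (DR), false alarm rate (FAR), and mean time to detect (MTTD), they can be computed as in the sketch below. The definitions are the usual ones and may differ in detail from the paper's formulas; the function name `detection_metrics` and all input numbers are hypothetical.

```python
def detection_metrics(n_incidents, n_detected, n_false_alarms, n_decisions,
                      delays_min):
    """Return (DR in %, FAR in %, MTTD in minutes).

    DR   = detected incidents / actual incidents
    FAR  = false alarm decisions / all decisions made by the algorithm
    MTTD = mean delay between incident occurrence and its detection
    """
    dr = 100.0 * n_detected / n_incidents
    far = 100.0 * n_false_alarms / n_decisions
    mttd = sum(delays_min) / len(delays_min)
    return dr, far, mttd


# Hypothetical counts, chosen only to exercise the function.
dr, far, mttd = detection_metrics(
    n_incidents=52, n_detected=51,
    n_false_alarms=100, n_decisions=20000,
    delays_min=[1, 0, 2, 1],
)
print(round(dr, 2), far, mttd)
```

The three indicators pull against each other: a more sensitive detector raises DR and lowers MTTD but usually raises FAR as well, which is why they are reported together.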

The initial variables and important variables were used to construct the training set to test the performance of these five algorithms. The results are shown in Table

Comparison of detection effects of different detection algorithms.

Detection algorithm | Variable selection | | | Initial variable | | |
---|---|---|---|---|---|---|
DELM | 98.42 | 0.56 | 0.68 | 92.20 | 0.63 | 1.09 |
LSTM | 94.64 | 0.48 | 0.90 | 86.52 | 0.65 | 1.24 |
DBN | 96.41 | 0.50 | 0.88 | 88.86 | 0.70 | 1.12 |
CNN | 96.24 | 0.54 | 0.94 | 86.42 | 0.66 | 1.14 |
GBDT | 95.81 | 0.50 | 0.92 | 87.86 | 0.76 | 1.21 |

Persistence test (PT) is an effective way to reduce

When there is no continuous test, PT = 0.
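One common way to implement a persistence test is to confirm an alarm only after the raw incident signal has been present for PT + 1 consecutive intervals, so that PT = 0 leaves the raw output unchanged. The sketch below follows that convention, which is an assumption about the exact rule used here; the function name `apply_persistence_test` and the example sequence are illustrative.

```python
def apply_persistence_test(raw_alarms, pt):
    """Confirm an alarm only after pt + 1 consecutive raw alarms."""
    confirmed = []
    run = 0  # length of the current run of consecutive raw alarms
    for flag in raw_alarms:
        run = run + 1 if flag else 0
        confirmed.append(1 if run >= pt + 1 else 0)
    return confirmed


raw = [0, 1, 0, 1, 1, 1, 0]
print(apply_persistence_test(raw, pt=0))  # [0, 1, 0, 1, 1, 1, 0]
print(apply_persistence_test(raw, pt=1))  # [0, 0, 0, 0, 1, 1, 0]
print(apply_persistence_test(raw, pt=2))  # [0, 0, 0, 0, 0, 1, 0]
```

The isolated alarm in the second interval is suppressed as soon as PT ≥ 1, which illustrates how persistence removes short false alarms, while the sustained alarm is confirmed PT intervals later, which is the cost in detection time.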

As can be seen from Figure

As can be seen from Figure

In summary, the event detection results of the DELM algorithm continuous [

(1) The random forest algorithm used in the paper can effectively select important variables for traffic incident detection, reduce the input dimension of traffic incident detection algorithm, and improve the performance of traffic incident detection algorithm. (2) The performance of DELM algorithm is better than that of LSTM, DBN, CNN, and GBDT algorithms. (3) When PT = 1, each algorithm can better balance the three indicators of

To reach a more general conclusion, future research needs to apply the DELM algorithm to other traffic incident data sets and to analyze and demonstrate theoretically the superiority of DELM for traffic incident detection. In addition, how to construct a more comprehensive initial variable set for traffic event detection needs further discussion.

The data supporting the conclusions of this study are presented in the figures and tables of the article. The code and details involved in this paper are available upon request from the corresponding author.

The authors declare that there are no conflicts of interest regarding the publication of this paper.