Detection and Optimization of Traffic Networks Based on Voronoi Diagram

,


Introduction
Accurate information about traffic peaks is important in planning, maintenance, and control of modern transport systems. Traffic congestion has become widespread in large metropolitan areas. Peak-hour congestion is also a problem in many suburban areas. Traffic peaks can be used to compute congestion indices, helping individuals and traffic management units to avoid traffic jam and minimize travel time [1][2][3][4]. By monitoring traffic peaks, transportation administrators can make better decisions to optimize the traffic networks and therefore enhance the performance of transportation systems [5][6][7]. Automatic traffic event detection is important in intelligent transportation systems (ITSs). e detection can be performed with streaming data clustering [8], marking points clustering [9], and deep learning methods [10].
Effective peak detection remains a challenging problem in many fields, including asteroseismology [11,12], analytical chemistry [13,14], disease diagnosis [15,16], and traffic analysis [7,17]. Mathematically, a peak of a function is a local maximum point where the height of the function is the greatest in an interval around the point. Billauer developed an algorithm that detects peaks using the local maxima (peak) and minima (valley) values [18,19]. In [20], the first-order derivative of chromatographic signal is computed to automatically extract peaks in the data. In [21], overlapped peaks are detected by examination of the second derivative of the raw mass spectrum. In existing algorithms, smoothing techniques are needed to eliminate or reduce noise [7,[13][14][15]. e major problems with the existing peak detection methods are related to the smoothing step: (i) e smoothing often removes the fine details and makes the peaks vanish or degrade. e effect is unrecoverable; that is to say, if a real peak is removed/degraded during the smoothing step, it is not recovered in the subsequent analysis.
(ii) e goal of smoothing is to eliminate or reduce the fluctuation and noise to avoid weak and false peaks, and it is crucial to select an appropriate smoothing scale. Choosing such a scale for a given traffic data requires prior knowledge about the data, which is often not available before the data analysis is finished.
For traffic flow data, most fluctuation and noise have smaller amplitude than the peaks we are interested in. Based on this property, we introduce a Voronoi diagram-based traffic peak detection method.
e new method has the following advantages. Firstly, it works on noisy data directly and avoids the difficulty in choosing an appropriate smoothing scale. Secondly, it computes the prominence of candidate peak points and offers the subsequent analysis step the flexibility to choose peaks at any scale. e algorithm takes O(n log n) time and linear space, where n is the size of the input time series. e rest of this paper is organized as follows. We introduce the related work in Section 2 and describe in detail the proposed method in Section 3. Following that in Section 4, we give the experiment results. We conclude the paper with a brief discussion in Section 5.

Related Work
Traffic network optimization improves a traffic network, maximizing its performance. e problem is formulated as an NP-hard integer program, and there exists no exact polynomial-time algorithm; heuristic methods such as genetic algorithms (GAs) are common ways to solve the problem [22]. An enhanced GA was introduced in [22], which performs the mutation in a goal-oriented manner by local optimization: changing an individual into the best one in its neighborhood. e enhanced GA performs better than a standard GA with respect to the trip duration.
Mao et al. [23] proposed a traffic signal control optimization method using GAs and machine learning models, minimizing the total travel time in urban networks. ey assume that the O-D demands are predefined and use the traffic assignment model to get the deterministic link traffic flows. In case study, they applied a static traffic assignment modelling scenario to obtain the initial link flows assigned to each road section during morning peak. Comparison shows that the proposed method is faster than the original GAs and can be successfully applied under nonrecurrent incident conditions. Paricio et al. proposed evolutionary algorithms for the traffic weighted multimap (TWM) technique to solve the traffic assignment problem, minimizing total travel time and providing the best-cost routes to vehicles [5]. e complexity of optimal TWM distributions depends on several factors, like the O/D directionality and traffic flow volumes. Obtained results show that TWM performs good in optimal routing at the system level, avoiding the need for continuous route calculus based on traffic status data streaming.
Bianchin and Pasqualetti [6] studied the problem of controlling intersections with the goal of optimizing network-wide congestion. ey cast an optimization problem to describe the goal of optimally controlling automated intersections and relate congestion objectives with the problem of optimizing a metric of controllability of the associated dynamical system. e system performance is characterized in relation to optimal configurations, and conditions are identified to guarantee network stability. With respect to established models, the framework relies on a continuous time and discrete space characterization of the traffic flows and allows for a more tractable network analysis.
When assessing and optimizing traffic networks, traffic flow peaks are an important parameter. However, peak detection is still a challenge since peaks in real applications are sensitive to local variation and noise. Simple maxima or derivative methods yield poor results when the signal contains noise. Smoothing techniques are applied to eliminate or reduce noise. Common smoothing methods used in peak detection include average [7,24], Gaussian [13,25], Savitzky-Golay [15,16], and wavelet smoothing [11,14].
A rise/fall analysis was used to detect air traffic peaks [7]. A peak exists where there is a change from rising to falling. To smooth out the noises, a 10th order rolling average is applied before the peaks are identified. Zhao et al. [24] presented a stripe peak detection method for structured light-based 3D reconstruction. In this approach, a weighted average method is applied to determine accurate peak position in stripe areas after they are segmented.
Lu et al. [25] introduced an adaptive ion-connecting network-based strategy for ion chromatogram (IC) extraction based on which a multiscale Gaussian smoothing-based peak detection method is applied after bad ICs are removed. Fu et al. [13] proposed a multiscale Gaussian smoothingbased approach for chromatographic peak extraction, where peaks appear as local maximum values under various smoothing window widths and can be detected through the ridge lines of maximum values under these window scales and of signal-to-noise ratios greater than a threshold. e work in [15] proposed a technique for T-wave peak detection using simple root mean square-based decision rule. A Savitzky-Golay smoothing filter is applied for preprocessing. e filter essentially performs a local polynomial regression on the signal to determine the smoothed value for each point. e approach in [16] detects R peaks in ECG signals, where the Shannon energy envelope with the Savitzky-Golay filter was designed to suppress noise and enhance the QRS-complex. e algorithm in [11] uses a continuous wavelet transform (CWT)-based pattern-matching approach. e CWT serves as a pattern-matching function where the signal is compared with a wavelet function specifically chosen to have similar features as the most common peaks that contain signal. Zheng et al. [14] proposed a peak detection algorithm 2 Discrete Dynamics in Nature and Society based on continuous wavelet transform, which identifies peaks by drawing ridges based on the movement of particles in the CWT coefficient matrix.

Proposed Method
Daily traffic volume data are a time series or mathematically a discrete function, consisting of a list of 2D points, and each point is a time-volume pair. With the point list as input, our method takes 4 main steps, as shown in Figure 1. Given a collection of geometric primitives, its Voronoi diagram is a subdivision of space into cells such that all points in a cell are closer to one primitive than to any other [26,27]. For a generalized Voronoi diagram, the geometric primitives can be points, line segments, or circular arcs. In our algorithm, we compute the Voronoi diagram of a polyline, where the geometric primitives are either line segments or points (endpoints of line segments). Based on the Voronoi diagram, we give our definitions (Section 3.1) and report the algorithm (Section 3.2).

Definitions.
We first define function curve. Definition 1. Given a discrete function, we connect the adjacent points by line segments and get a curve, called the function curve.
A function curve is a polyline. From the example in Figure 2, we can see that the Voronoi diagram divides the plane into cells, each of which is associated with a line segment or an endpoint of a line segment.
We define Voronoi tree.

Definition 2.
e Voronoi diagram of a function curve is partitioned by the function curve into two parts, upper and lower parts; for each Voronoi edge e, (i) If e crosses the function curve, then e is partitioned into two parts, e u and e l , which are, respectively, above and below the function curve. We put e u and e l , respectively, to the upper and lower parts of the Voronoi diagram. (ii) Otherwise, e is put to the upper/lower part of the Voronoi diagram if it is above and below the function curve.
e edges in the lower part of the Voronoi diagram make a forest; we add a virtual root node to make it rooted. e virtual root represents an infinite point. e rooted tree is called the Voronoi tree of the function (see Figure 3).
To define the prominence of a Voronoi edge and of a function point, we use as an example a time series (Figure 4) of traffic flow data in vehicles per 15 minutes over 24 hours.
Definition 3. In a Voronoi tree, the subtree led by a Voronoi edge represents a hill in the function curve ( Figure 5(b)). We find a top point in the hill and record it as the peak site of the Voronoi edge; we compute the height difference between the peak site and the higher of the left and right bottom points of the hill and record it as the prominence of the Voronoi edge ( Figure 5(b)).

Definition 4.
For a function f, we define the prominence of a function point p as the greatest prominence of a Voronoi edge with p as its peak site. e function points of positive prominence are the candidate peak points.
In computing the Voronoi diagram, since the x and y values of time series data are of different features (time and traffic volume, respectively), we need to unify them by a feature normalization process. In our algorithm, we adopt a commonly used normalization approach: scaling the time and traffic volume values of a time series to the range [0, 1]. (Figure 1), and among them, the first 2 steps are straightforward by definition. We address the 3 rd and 4 th steps.

Algorithm. Our algorithm consists of 4 main steps
We first compute the prominences of the Voronoi edges. For any Voronoi edge, if its left/right Voronoi cell is associated with an endpoint of a line segment, we let its left/ right input site be the endpoint; otherwise (it is associated with a line segment), we let its left/right input site be the right/left endpoint of the line segment.
For each Voronoi edge e in a Voronoi tree, we denote its input sites as p i and p j with i ≤ j. We compute the following properties of e. e algorithm to compute these properties is given in Algorithm 1. e prominences of the points are obtained straightforwardly by definition. We collect the points of positive prominences as the candidate peak points (Figure 6(a)) and sort their prominences ( Figure 6(b)), making the sorted prominence curve. We then connect the first and last prominence dots by a line (the green dash line) and find in  Discrete Dynamics in Nature and Society the prominence curve the dot (the yellow dot) farthest from the line. is dot is called the prominence bending dot. e prominence dots before the bending dot represent the peak points detected (Figure 7(a)).
For a peak point, the hill represented by the Voronoi edge of the greatest prominence is called the supporting hill. e supporting hills of the peak points are not exclusive from each other; they are nested instead (Figure 7(b)).

Computation Time and Space.
In the following discussion, we assume that the input time series has n points. e total computation time is O(n log n), where (i) We first construct the Voronoi diagram, taking O(n log n) time [26]. e diagram has O(n) edges.
(ii) We then examine each Voronoi edge and split the diagram into 2 parts (partitioned by the function curve) and make the lower part the Voronoi tree, taking O(n) time.     (2) Let the two input sites be p i and p j with i ≤ j.

Experiments
We implemented our algorithm in C++ and ran experiments on a inkPad X270 laptop with 2.80 GHz i7-7500U CPU and 8 GB RAM. We performed experiments on the data provided by the Traffic Management Bureau of Hebei Provincial Public Security Department in China. e bureau maintains a road traffic congestion index system by integrating the real-time traffic data from taxi GPS, traffic patrols, and road detectors. e system dynamically monitors traffic volume peaks, which are fused with occupancy and speed data to compute the traffic congestion indices.
We report experiments on 6 selected series in Section 4.1 and on a set of series with quantitative performance in Section 4.2. e experiments in Section 4.1 show that noise in the traffic data has different scales, and the existing smoothing-based methods fail on some of the series since there does not exist a uniform smoothing scale for all of the series. On the other hand, the proposed method adapts itself to different noise scales by finding the bending dot in the sorted prominence curve. e statistical test in Section 4.2 uses performance indices of sensitivity, positive predictivity, and accuracy. For each existing smoothing-based method, we choose a best parameter such that the overall accuracy is maximized. In the comparison, our method outperformed the smoothing-based methods on all the performance indices. e experiments verify the advantages of the proposed algorithm, which avoids the difficulty in choosing an appropriate smoothing scale and offers the subsequent analysis step the flexibility to choose peaks at any scale.

Experiments on Examples.
We choose 6 time series of traffic volume data, namely, Road1 to Road6 data (Figure 8), to demonstrate the effectiveness of our method. e traffic volume was measured in vehicles per 5 minutes.
For each time series, we computed the prominences of the data points, got the candidate peak points (points with positive prominences), sorted their prominences, and selected the peak points by finding the prominence bending dot. e supporting hills of the peak points are nested, and their prominence dots form a steep section in the prominence curve. Variation and noise in the time series often have repeated patterns, and their prominence dots form a slowly declining section. e two sections are separated at the bending dot. We report the sorted prominences, the peak points detected, and their supporting hills of the Road1 to Road6 data, respectively, in Figures 9-14. In all tested time series data, our method successfully detected the salient peaks, even in Road5 and Road6 data, where the prominence of the last detected peak (the third one in both data) is not much greater than the greatest prominence of the noise. e results demonstrate the effectiveness of our method in traffic data with noise.
For comparison, the average, Gaussian, Savitzky-Golay, and Wavelet smoothing methods were applied to the data, followed by Billauer's peak picking method. Billauer's method has the ability to skip peaks of amplitude less than a threshold. We set the amplitude threshold as 2, which is obtained through trial and error. e average, Gaussian, and Savitzky-Golay methods use a smoothing filter of a given size. Savitzky-Golay filter requires the filter size to be odd. We chose 5 filter sizes, 9, 19, . . ., 49.
e results of the average, Gaussian, and Savitzky-Golay smoothing are shown in Figures 15-17, respectively. In the figures, the x-coordinates represent the time in unit of 5 minutes. We list the detected results of the 3 filters of the 5 sizes in Tables 1-3, respectively, and see that there does not exist a single size with which the 3 filters work successfully on all data. e best results are achieved by Savitzky-Golay filter of sizes 19 and 29, each of which only fails on only one series. Discrete Dynamics in Nature and Society e noise in the 6 series has different scales, and we cannot use one uniform smoothing scale for all of them. e proposed method adapts itself to different noise scales by finding the bending dot in the sorted prominence curve, which is then separated into rapidly and slowly declining sections.
Wavelet smoothing transforms the data into wavelet space and thresholds small coefficients and performs the inverse wavelet transform to get the smoothed data. We used 3 wavelets, db4, db6, and sym5 in the experiments and report the results in Figures 18-20. In the experiments, the wavelet db4 failed on Road3 and Road4 data, db6 failed on Road2 and  Road6, and sym5 failed on Road1, Road2, and Road6. Our method transforms the noisy data to sorted prominences, similar to a wavelet transform decomposing the signal spectrum into different frequency ranges. e experiments demonstrated the superiority of the sorted prominences to the wavelet transforms.

Quantitative Comparison.
To compare quantitatively, we use one week (2018) daily data, from 12 midnight to 12 midnight, of different roadways, including 10 national highways, 10 provincial highways, and 10 urban roads. e data were sampled every 10 minutes, and each data series contains 24 * 6 � 144 points. For the average, Gaussian, and Savitzky-Golay smoothing methods, we used 20 filter sizes, 5, 7, 9, . . ., 43. For wavelet smoothing, we used 3 wavelets, db4, db6, and sym5. Sensitivity (SE), positive predictivity (PP), and accuracy (AC) were used as performance indices. For a method m with a parameter p on a series data d, , where the true positive (TP) is the number of correctly detected peaks in the series, the false negative (FN) is the number of ground-truth peaks undetected, and false positive (FP) is the number of nonground-truth peaks detected as peaks.      Our method on Series2 Figure 21: e detected peaks on 3 series by our method and the smoothing-based methods with a best parameter (as listed in Table 4). For each series, the sorted prominences are also depicted. e peaks by our method are marked and labeled on the raw data, and the peaks by the smoothing methods are marked on the smoothed data.
We computed the overall performance of a method m with a parameter p on all series as the average performance of m with p on all series: For each existing smoothing-based method, we choose a best parameter p such that the overall accuracy AC(m, p) is maximized.
e overall performance of the 4 smoothing methods with a best parameter is compared with ours, as reported in Table 4. In the table, the best parameters are given in parentheses. In the comparison, our method outperformed the 4 others in sensitivity, positive predictivity, and accuracy. In Table 4 (last column), we also list the average computation time in milliseconds, which shows that the computation time of the proposed method is comparable to that of the smoothing-based methods.
To verify that the proposed method can effectively avoid the dilemma in choosing an appropriate smoothing scale, we extended the existing smoothing-based methods with an "oracle" which "knows" how to choose a best parameter for each series. We then compared the proposed method with the idealized smoothing-based methods. For a smoothing-based method on a series, we choose a parameter with the best accuracy, and let the SE and PP of the parameter be the SE and PP of the method on the series. We then compute the average of the SE and PP on all series, which are used as the overall SE and PP of the idealized method, respectively. Formally, for a method m and a series d i , let p(m, i) be a parameter p which maximizes the accuracy AC(m, p, d i ).
en, we define Notice that the parameter p in formulas (5) and (6) is series-dependent as is different in formulas (2) and (3). e overall performance of the idealized smoothing methods is reported in Table 5, revealing that our method is comparable in accuracy to the smoothing-based methods with an oracle, which of course does not exist in reality. is comparison verifies again the effectiveness of the proposed method.
We select 3 series to discuss the strength and the weakness of the proposed method ( Figure 21). e 3 series belong to one road on 3 different days. e road is a provincial 2-way 4-lane highway. In all 3 series, the first 2 peaks are visually salient and reported as detected peaks by all methods. In series 1, the 3 rd peak (the thorn shape between the first 2 peaks, not labeled) has a much lower prominence than the first 2 and is correctly classified as not detected by our method and the average smoothing. In series 2, the 3 rd peak has a prominence comparable to the 2 nd peak and is correctly detected by all methods except the average smoothing. In series 3, the 3 rd peak is a false positive though its prominence is also comparable to that of the 2 nd peak. If we compare the 3 rd peaks in series 2 and 3, we find that the peak in series 3 is sharper and covers a shorter time period, and it is more likely a noise or fluctuation. In this case, a simple shape-based outlier removal process can be applied to remove this false peak. In general cases, whether a salient sharp peak is a false peak cannot be determined solely by traffic volume. Other traffic parameters, such as speed, density, and headway, are needed. As a prestep, our method provides salient candidates for further comprehensive analysis where multiple traffic parameters are involved.

Conclusion
We introduced a traffic peak detection method based on Voronoi diagram. e method computes the prominence of candidate peak points and selects the salient peaks by finding the bending dot in the sorted prominence curve. It works directly on noisy data and avoids the difficulty in choosing an appropriate smoothing scale. e prominence of candidate peak points also offers the subsequent analysis step the flexibility to choose peaks at any scale. e introduced method relies on the assumption that noise has smaller amplitude than the salient peaks. For data not satisfying the assumption, an outlier removal process is needed. Other future work includes improving the traffic network model by computational intelligence approaches, such as monarch butterfly optimization [28], earthworm optimization [29], and elephant herding optimization algorithms [30].

Data Availability
e data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest
e authors declare that they have no conflicts of interest.