Intelligent Transportation System (ITS) technologies can be implemented to reduce both fuel consumption and the associated emission of greenhouse gases. However, such systems require intelligent and effective route planning solutions to reduce travel time and promote stable traveling speeds. To achieve this goal, these systems should account for both estimated and real-time traffic congestion states, but obtaining reliable traffic congestion estimations for all the streets/avenues in a city for the different times of the day, for every day in a year, is a complex task. Modeling such a tremendous amount of data can be time-consuming and, additionally, centralized computation of optimal routes based on such time dependencies has very high data processing requirements. In this paper we approach this problem through a heuristic that considerably reduces the modeling effort while maintaining the benefits of time-dependent traffic congestion modeling. In particular, we propose grouping streets by taking into account real traces describing the daily traffic pattern. The effectiveness of this heuristic is assessed for the city of Valencia, Spain, and the results obtained show that it is possible to reduce the required number of daily traffic flow patterns by a factor of 4210 while maintaining the essence of time-dependent modeling requirements.
In densely populated urban areas, traffic-related problems, such as air quality, noise, vibration, and accidents, are critical issues for management authorities. To make traffic flow more efficient or to reduce it, especially in downtown areas, authorities develop initiatives to promote the use of public transportation, forbid access to the most polluting vehicles, alternate the days of downtown access according to the vehicles’ plate number, charge drivers for access, and so forth. In addition to these initiatives, traffic engineers analyze the traffic flow in our cities taking into account important factors such as the adequate street directions to minimize travel times, the influence of traffic light synchronization and placement on traffic congestion, fuel consumption and CO2 emissions, traffic noise modeling [
Particularly, in the field of fuel consumption and exhaust pollutants, Intelligent Transportation Systems (ITS) have recently emerged as a powerful ally for improving traffic flows [
In this paper we present a novel platform for centralized traffic management in urban environments which attempts to avoid known problems associated with current route planning solutions based on fixed path costs. The proposed solution takes into account the historical data about traffic patterns in order to provide time-dependent route recommendations to drivers traveling through dense traffic areas. As a first approach to deploy this solution, we propose using existing traffic measurements based on induction loop detections [
The paper is organized as follows: in the next section we introduce some related works. In Section
After several decades of research, the existing traffic engineering literature is quite broad and extensive. Recently, some solutions have emerged that rely on mobile devices to monitor the traffic in real time, for example, the Mobile Millennium [
Among these proposals we can find TrafficView [
Moreover, when attempting to solve the vehicle route planning problem in the most accurate way, we must take into account the traffic variability throughout the day, as well as other situations that take place in real life when driving a vehicle [
To tackle this increase of complexity, we present in this paper an approach to significantly reduce the amount of data that our platform will need to find the time-dependent shortest routes. Specifically, we detail how to aggregate large amounts of historical traffic flow data into the most meaningful set of information to properly describe traffic flow variations throughout the day on the different streets and avenues of a city.
To this aim, we will use a clustering technique. Cluster analysis is an unsupervised learning technique used for the classification of data. Data elements are partitioned into groups called clusters that represent proximate collections of data elements based on a distance or dissimilarity function. There exist two main clustering methods. The hierarchical methods basically start with each member of the set in a cluster of its own and fuse nearest clusters until there are
Clustering techniques have already been used in recent years as part of ITS solutions in order to provide real insights into traffic management policies. For brevity, we only refer to some of these works. We recommend consulting Guardiola et al. [
For example, Wang et al. [
Current vehicle navigation systems are typically based on locally stored static information from which routes are calculated. Among such systems we can find commercial applications like TomTom (
More sophisticated route navigation solutions update route information in real time, based on reported traffic conditions. As an example, the TomTom navigation software has been enhanced to support client-server interaction in order to inform clients about alternative routes when atypical traffic delays are detected.
In this paper we will address the specific problem of traffic congestion in urban environments. Instead of accidents and other conditions causing atypical delays, we will focus on predicting daily traffic flow patterns for a specific urban environment, detailing how it is possible to reduce travel times based on historical information about the traffic density distribution throughout the day.
The proposed traffic management platform is named ABATIS:
ABATIS traffic management architecture.
Clients contribute to improving the route database information by providing real-time feedback about traffic congestion conditions, which allows maintaining both a real-time map of traffic fluidity in a city and accurate historical data of traffic behavior. This approach supports global traffic load balancing and event-based management (e.g., reducing traffic congestion in the route of an ambulance).
This strategy, although offering significantly better routes, has a higher cost since the estimated time for traversing each path segment will no longer be a fixed value based on segment length and speed limit; instead, it will vary dynamically throughout the day. In order to achieve time-dependent costs for the different streets and avenues in a city, ABATIS will use existing historical traffic logs to estimate travel times. Since such logs provide per-hour congestion measurements for all induction loop detectors in a city for a whole year, they must be properly summarized and synthesized by the traffic analysis server so that this information can be seamlessly integrated in the route server. Thus, in the remainder of the paper, we will focus on the traffic analysis component, proposing a heuristic able to reduce the complexity of the problem by converting huge amounts of historical data about traffic intensity into a small but representative set of daily patterns able to describe the expected traffic behavior in the city throughout the day.
Attempting to model the daily traffic flow pattern of hundreds of streets/avenues for every day of the year would lead to hundreds of thousands of interpolation functions able to provide a smooth description of per-street traffic flow variations throughout the day, based on several million input values (assuming a per-hour granularity). Such modeling effort for a single city can be considered excessive and, in addition, causes route recommendation tasks at the server to have an extremely high computational cost. Nevertheless, when attempting to provide an accurate characterization of path segment costs in a specific urban environment, it quickly becomes clear that (i), from a yearly perspective, seasonal differences are to be expected as, for example, more people use their vehicles during cold weather seasons than during the warm and hot seasons, when bicycles or public transport can become a more attractive alternative; (ii), from a weekly perspective, working days are characterized by mobility patterns and traffic congestion states that drastically differ from the behavior during weekends and holidays; (iii), from an hourly perspective, different hours of the day are associated with different congestion levels (e.g., day versus night); and finally (iv), from a spatial perspective, different streets/avenues have different traffic levels at any time of the day, requiring independent modeling.
Taking the aforementioned factors into consideration, in this section we will take an in-depth look into traffic behavior when focusing on a medium-size European city like Valencia, Spain, which is the third largest metropolitan area in Spain with about 1.77 million inhabitants. Detailed trace files containing the amount of traffic flowing in each of the streets/avenues each hour for a full year (2013) were provided to us by Valencia’s City Hall Traffic Department, in particular, data concerning the 421 most relevant streets/avenues (those monitored by traffic services through induction loop detectors).
Our goal is to obtain insight into the traffic flow, detecting traffic patterns according to the day of the week, hour, and type of street. Based on the traffic patterns detected, we will propose a heuristic in order to simplify the number of models required while maintaining most of the time-dependent modeling effectiveness. Although we use the city of Valencia as the target of our analysis, the modeling methodology followed is quite general, being applicable to other cities as well.
We start by analyzing the monthly traffic, assessing whether we can detect significant seasonal differences. As shown in Figure
Average traffic volume in Valencia per month.
For the analysis that follows we picked a month with an average overall traffic volume close to the mean; specifically, we selected November, which has no holiday periods. Focusing on the traffic pattern variation throughout the week, Figure
Average traffic volume in Valencia for the different days of the week.
In addition to the differences in terms of daily traffic volume, there are also clear differences in terms of the daily traffic pattern itself. For instance, Figure
Average daily behavior for different days of the week: (a) Monday; (b) Sunday.
A totally different pattern is detected, for example, on a Sunday. Compared to weekdays we find that (i) work-related traffic peaks are no longer present; (ii) the total traffic volume is significantly lower; and (iii) the peak hours differ. In particular, peak hours are now related to mobility towards food courts at lunch time (between 1 and 2 p.m.) and mobility from leisure areas to homes (between 6 and 8 p.m.).
When focusing on the traffic distribution throughout a city, it is well known that main streets and avenues will experience a much higher traffic load than secondary and isolated ones. Discriminating between them is a relevant issue since some streets barely experience any traffic load increase during peak hours, meaning that travel times are not affected by congestion in the same way as the main arteries of the city.
To be able to discriminate between the streets of Valencia based on traffic flow, we first obtained the peak traffic intensity per street during November, and we then obtained the cumulative distribution for these values (see Figure
Cumulative distribution for traffic intensity using the monthly peak hours.
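The discrimination step described in this section can be illustrated with a minimal Python sketch. It assumes that each street is represented by its list of hourly vehicle counts for the month, and that streets are kept when their monthly peak reaches a given threshold (690 vehicles/hour is the value discussed in this section). The function names and the toy data are hypothetical and are not taken from the original analysis.

```python
def empirical_cdf(values):
    """Return sorted (value, cumulative fraction) pairs for a list of values."""
    ordered = sorted(values)
    n = len(ordered)
    return [(v, (i + 1) / n) for i, v in enumerate(ordered)]


def filter_streets_by_peak(hourly_counts, threshold):
    """Keep streets whose monthly peak hourly intensity reaches the threshold.

    hourly_counts: dict mapping a street id to its hourly vehicle counts.
    threshold: minimum peak intensity (vehicles/hour) for a street to be kept.
    """
    return {
        street: counts
        for street, counts in hourly_counts.items()
        if max(counts) >= threshold
    }


# Hypothetical toy data: three streets with a handful of hourly counts each.
toy = {
    "avenue_a": [300, 1200, 1500, 900],
    "street_b": [50, 120, 200, 90],
    "street_c": [400, 700, 650, 500],
}
peaks = [max(counts) for counts in toy.values()]
cdf = empirical_cdf(peaks)                        # cumulative distribution of peaks
busy = filter_streets_by_peak(toy, threshold=690)  # streets kept for modeling
```

In this toy example only the two streets whose peak reaches the threshold remain candidates for time-dependent modeling.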
We observe that 30.3% of all streets have a traffic intensity lower than 690 vehicles/hour during peak hours, which according to [
Daily traffic intensity pattern for streets with different characteristics: (a) street following the expected pattern; (b) street not following the expected pattern.
Observing the daily traffic pattern in Figure
In this section we propose a heuristic to simplify traffic modeling for the city of Valencia by taking into consideration the results presented in the previous section.
The proposed heuristic aggregates into a single pattern all those daily traffic patterns having a common behavior. This is made possible by making the obtained time-dependent models independent of the actual number of vehicles in each street through normalization using the mean daily value.
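As a minimal sketch of this normalization, the Python snippet below divides each hourly count by the mean value of the day, so that two streets with the same daily shape but very different volumes end up with the same profile. The function name and the sample counts are hypothetical.

```python
def normalize_by_daily_mean(hourly_counts):
    """Scale a daily profile so that its mean value equals 1."""
    mean = sum(hourly_counts) / len(hourly_counts)
    if mean == 0:
        raise ValueError("cannot normalize a day without traffic")
    return [count / mean for count in hourly_counts]


# Hypothetical three-hour profiles: same shape, volumes differing by 10x.
main_avenue = normalize_by_daily_mean([100, 200, 300])
side_street = normalize_by_daily_mean([10, 20, 30])
```

Both calls yield the same normalized pattern, which is what allows a single model to describe streets with different absolute traffic levels.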
To this aim, we use Mathematica 9.0.1 [
We have chosen the partitioning method of
At this point, we want to stress the fact that while [
Finally, note that although we have not made use of them, function
Below we describe the five steps followed to reduce the number of independent daily patterns to be modeled: (i) select the appropriate clustering metric, (ii) find the optimal number of clusters per day of the week, (iii) determine how representative mean days are, (iv) group days of the week with similar characteristics, and (v) group clusters with similar daily patterns.
If for each street (or street segment) we have the number of cars that traverse it every hour, we can represent each street by a point
Streets
With respect to streets
However, if we classify the four streets using the Euclidean distance, the result is quite predictable:
Recall that
On the other hand, it is easy to see that the correlation distance is the same if we work with the coordinates
Using the correlation distance defined previously, in this section we will determine the optimal number of clusters for the 292 streets in Valencia considered by the City Hall as representative in terms of traffic flow for every day of the week. Subsequently, to reduce the overall number of clusters, we will attempt to join the different days in a week whenever the same number of clusters are detected.
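To illustrate why a correlation-based distance suits this grouping, the following Python sketch compares it with the Euclidean distance on toy hourly profiles. It assumes the common definition of correlation distance, namely one minus the Pearson correlation coefficient; the profiles and function names are hypothetical.

```python
import math


def euclidean_distance(u, v):
    """Straight-line distance between two hourly profiles."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def correlation_distance(u, v):
    """One minus the Pearson correlation of two profiles (assumed definition)."""
    mean_u = sum(u) / len(u)
    mean_v = sum(v) / len(v)
    du = [a - mean_u for a in u]
    dv = [b - mean_v for b in v]
    norm = math.sqrt(sum(a * a for a in du)) * math.sqrt(sum(b * b for b in dv))
    if norm == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return 1.0 - sum(a * b for a, b in zip(du, dv)) / norm


quiet = [10, 40, 30, 20]         # low-volume street
busy = [100, 400, 300, 200]      # same shape, ten times the volume
opposite = [40, 10, 20, 30]      # similar volume, mirrored shape

same_shape = correlation_distance(quiet, busy)       # close to 0
mirrored = correlation_distance(quiet, opposite)     # close to 2
```

Under the Euclidean distance the quiet street is much closer to the mirrored profile than to the busy one, whereas the correlation distance groups the two streets that share the same daily shape.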
Therefore, for our analysis, we apply the
In the analysis that follows we work with the percentage of vehicles traversing each street every hour with respect to the overall daily value. As mentioned in the previous section, the actual number of vehicles
Since our study period encompasses 4 weeks, we create an “average day” for each day of the week, which is calculated for each street by averaging the number of vehicles traversing it each hour. This “average day” attempts to filter out the peculiarities of a specific day, yielding a representative trend instead.
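A minimal Python sketch of the “average day” computation for one street and one day of the week is given below. It assumes that the input is a list with one hourly profile per week; the function name and the sample values are hypothetical.

```python
def average_day(weekly_profiles):
    """Average, hour by hour, the profiles of the same weekday over several weeks."""
    n_hours = len(weekly_profiles[0])
    if any(len(profile) != n_hours for profile in weekly_profiles):
        raise ValueError("all profiles must cover the same number of hours")
    n_weeks = len(weekly_profiles)
    return [
        sum(profile[hour] for profile in weekly_profiles) / n_weeks
        for hour in range(n_hours)
    ]


# Hypothetical counts for four Mondays, shortened to three hours for brevity.
mondays = [
    [100, 200, 300],
    [120, 180, 300],
    [80, 220, 300],
    [100, 200, 300],
]
average_monday = average_day(mondays)
```

Deviations that appear in a single week, such as the second and third Mondays above, cancel out in the averaged profile.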
Table
Number of clusters obtained and associated statistics.
Statistic | Mo | Tu | We | Th | Fr | Sa | Su |
---|---|---|---|---|---|---|---|
A: Week 1 | 3 | 3 | 1 | 2 | 1 | 2 | 3 |
B: Week 2 | 1 | 2 | 4 | 1 | 3 | 2 | 4 |
C: Week 3 | 5 | 3 | 1 | 4 | 2 | 2 | 3 |
D: Week 4 | 3 | 1 | 1 | 3 | 3 | 2 | 1 |
E: mean(A, B, C, D) | 3 | 2.25 | 1.75 | 2.5 | 2.25 | 2 | 2.75 |
F: median(A, B, C, D) | 3 | 2.5 | 1 | 2.5 | 2.5 | 2 | 3 |
G: average day | 4 | 2 | 2 | 4 | 3 | 2 | 2 |
H: round(E) == G | False | True | True | False | False | True | False |
I: mean(E, F, G) | 3.33 | — | — | 3 | 2.58 | — | 2.58 |
Number of clusters | 3 | 2 | 2 | 3 | 3 | 2 | 3 |
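The selection rule summarized in rows E to I of the table can be sketched in Python as follows. The weekly counts and the average-day counts are copied from rows A to D and G; the function names are hypothetical, and rounding half up is an assumption about how ties are resolved.

```python
import math
from statistics import mean, median

DAYS = ["Mo", "Tu", "We", "Th", "Fr", "Sa", "Su"]

# Rows A-D of the table: clusters found for each weekday in weeks 1 to 4.
weekly_clusters = {
    "Mo": [3, 1, 5, 3], "Tu": [3, 2, 3, 1], "We": [1, 4, 1, 1],
    "Th": [2, 1, 4, 3], "Fr": [1, 3, 2, 3], "Sa": [2, 2, 2, 2],
    "Su": [3, 4, 3, 1],
}
# Row G of the table: clusters found for the average day.
average_day_clusters = {"Mo": 4, "Tu": 2, "We": 2, "Th": 4, "Fr": 3, "Sa": 2, "Su": 2}


def round_half_up(x):
    """Round to the nearest integer, resolving .5 ties upwards (assumption)."""
    return math.floor(x + 0.5)


def choose_num_clusters(week_counts, avg_day_count):
    """Apply rows E-I: accept G when round(E) agrees, else round mean(E, F, G)."""
    e = mean(week_counts)
    f = median(week_counts)
    if round_half_up(e) == avg_day_count:
        return avg_day_count
    return round_half_up((e + f + avg_day_count) / 3)


num_clusters = {
    day: choose_num_clusters(weekly_clusters[day], average_day_clusters[day])
    for day in DAYS
}
total_clusters = sum(num_clusters.values())
```

With the values of the table, this rule yields 3, 2, 2, 3, 3, 2, and 3 clusters from Monday to Sunday, that is, 18 clusters in total.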
Once the number of clusters for each day of the week was defined, the next step was to validate that cluster elements for each day of the week resembled the cluster elements obtained for the average day. If a good degree of matching is obtained, then the conclusions associated with streets in that cluster are valid; otherwise, we could be considering that streets belong to a group with a specific behavior, when in fact their behavior significantly differs.
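One possible way to quantify this degree of matching is sketched below in Python. It is an assumed definition, not necessarily the one used in our analysis: for each cluster of the average day, it reports the percentage of its streets that appear together in the best-overlapping cluster of the specific day. The function name and the street identifiers are hypothetical.

```python
def matching_percentages(reference_clusters, candidate_clusters):
    """Percentage of each reference cluster found in its best-overlapping candidate.

    Both arguments are lists of sets of street identifiers.
    """
    percentages = []
    for reference in reference_clusters:
        best_overlap = max(len(reference & candidate) for candidate in candidate_clusters)
        percentages.append(100.0 * best_overlap / len(reference))
    return percentages


# Hypothetical clusters for an average Monday and for one specific Monday.
average_monday = [{"s1", "s2", "s3", "s4"}, {"s5", "s6"}]
monday_week_1 = [{"s1", "s2", "s3", "s5"}, {"s4", "s6"}]
matching = matching_percentages(average_monday, monday_week_1)
```

A low percentage for a cluster indicates that its streets are scattered over several clusters on that specific day, so conclusions drawn for the average day would not carry over.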
For our endeavor we apply the
Percentages of matching for the different clusters compared to the average day clusters.
Week | Cluster | Mo | Tu | We | Th | Fr | Sa | Su |
---|---|---|---|---|---|---|---|---|
Number of clusters | | 3 | 2 | 2 | 3 | 3 | 2 | 3 |
Week 1 | 1 | 83.11 | 92.31 | 84.42 | 30.86 | 70.15 | 91.98 | 73.72 |
Week 1 | 2 | 66.67 | 84.56 | 62.32 | 59.32 | 68.75 | 71.43 | 66.67 |
Week 1 | 3 | 81.33 | | | 56.99 | 90.43 | | 35.82 |
Week 2 | 1 | 60.14 | 89.74 | 80.52 | 81.48 | 70.15 | 96.26 | 74.36 |
Week 2 | 2 | 55.07 | 58.09 | 59.42 | 43.22 | 60.64 | 81.90 | 69.56 |
Week 2 | 3 | 80.00 | | | 58.06 | 51.56 | | 73.13 |
Week 3 | 1 | 62.84 | 84.62 | 80.52 | 58.02 | 70.15 | 88.77 | 51.28 |
Week 3 | 2 | 69.57 | 32.35 | 91.30 | 75.42 | 74.47 | 84.76 | 47.83 |
Week 3 | 3 | 84.00 | | | 31.18 | 35.94 | | 89.55 |
Week 4 | 1 | 81.76 | 96.15 | 74.68 | 62.96 | 86.57 | 97.87 | 82.05 |
Week 4 | 2 | 88.41 | 84.56 | 82.61 | 74.58 | 65.96 | 53.30 | 59.42 |
Week 4 | 3 | 76.00 | | | 65.59 | 56.25 | | 58.21 |
Average | | | | | | | | |
We find that the average degree of matching for all the days of the week is 72.71%. Overall, we consider this value acceptable; differences appearing on specific days are to be expected, since traffic patterns may change depending on weather, special events, or other conditions.
The next step of our clustering procedure was to assess the feasibility of grouping those days of the week having the same number of clusters. With this purpose we tested all combinations and calculated the percentage of cluster matching for each pair of mean days of the week. The results are shown in Table
Percentages of cluster matching for average days of the week with the same number of assigned clusters. Valid combinations are shown in boldface.
Combinations | Cluster | Degree of matching (%) | Average matching (%) |
---|---|---|---|
Monday-Thursday | 1 | 57.43 | 48.29 |
Monday-Thursday | 2 | 8.69 | |
Monday-Thursday | 3 | 66.67 | |
Monday-Friday | 1 | 77.70 | 68.84 |
Monday-Friday | 2 | 72.46 | |
Monday-Friday | 3 | 48.00 | |
Monday-Sunday | 1 | 58.78 | 43.49 |
Monday-Sunday | 2 | 27.54 | |
Monday-Sunday | 3 | 28.00 | |
Thursday-Friday | 1 | 37.04 | 59.25 |
Thursday-Friday | 2 | 71.19 | |
Thursday-Friday | 3 | 63.44 | |
Thursday-Sunday | 1 | 30.86 | 41.78 |
Thursday-Sunday | 2 | 55.93 | |
Thursday-Sunday | 3 | 33.33 | |
Friday-Sunday | 1 | 61.94 | 51.37 |
Friday-Sunday | 2 | 36.17 | |
Friday-Sunday | 3 | 51.56 | |
Tuesday-Wednesday | 1 | | |
Tuesday-Wednesday | 2 | | |
Tuesday-Saturday | 1 | 71.15 | 58.56 |
Tuesday-Saturday | 2 | 44.11 | |
Wednesday-Saturday | 1 | 69.48 | 56.51 |
Wednesday-Saturday | 2 | 42.03 | |
All combinations show an average degree of matching below 70%, except for the Tuesday-Wednesday combination, which is close to 92%. Thus, we conclude that these two weekdays can be treated as a single day, since they show similar patterns in terms of traffic variability throughout the day. Data shown earlier in Figure
To confirm that the grouping did not have a negative impact on the error associated with specific days, we now proceed to compare the degree of matching for the different clusters against the average day, the crossed average day, and the proposed union of both days. These results are shown in Table
Percentages of matching for the different clusters against the average day, the crossed average day, and the proposed union of both days.
Week | Cluster | Original average days: Tu | Original average days: We | Crossed average days: Tu | Crossed average days: We | Union of average days: Tu | Union of average days: We |
---|---|---|---|---|---|---|---|
Week 1 | 1 | 92.31 | 84.42 | 90.26 | 86.54 | 86.83 | 86.23 |
Week 1 | 2 | 84.56 | 62.32 | 81.16 | 65.44 | 84.00 | 69.60 |
Week 2 | 1 | 89.74 | 80.52 | 89.61 | 78.21 | 85.63 | 77.25 |
Week 2 | 2 | 58.09 | 59.42 | 57.25 | 58.09 | 56.80 | 60.00 |
Week 3 | 1 | 84.62 | 80.52 | 86.36 | 78.85 | 82.04 | 72.46 |
Week 3 | 2 | 32.35 | 91.30 | 34.06 | 90.44 | 30.40 | 88.00 |
Week 4 | 1 | 96.15 | 74.68 | 95.45 | 73.08 | 92.22 | 68.26 |
Week 4 | 2 | 84.56 | 82.61 | 82.61 | 81.62 | 86.40 | 80.00 |
Average | | | | | | | |
We find that the differences between the three cases are quite low. Specifically, the impact of grouping these two days into one is only 1.6%, which is quite acceptable. The results using cross averages also support unifying these two days. As a result, by accounting for the number of clusters of each average day and by merging Tuesday and Wednesday into a single day, we obtain a total of 16 different traffic patterns.
In this section we present the normalized traffic patterns corresponding to the 16 clusters created: 3 for Monday, 2 for Tuesday/Wednesday, 3 for Thursday, 3 for Friday, 2 for Saturday, and 3 for Sunday.
As shown in Figure
Correlation between clusters (period between 7 a.m. and 9 p.m.).
Monday and Tuesday/Wednesday
Monday (rows) vs. Tuesday/Wednesday (columns) | Cluster number 1 | Cluster number 2 |
---|---|---|
Cluster number 1 | | 0.578729 |
Cluster number 2 | 0.6229643 | |
Cluster number 3 | 0.5900097 | 0.7910942 |
Thursday and Friday
Thursday (rows) vs. Friday (columns) | Cluster number 1 | Cluster number 2 | Cluster number 3 |
---|---|---|---|
Cluster number 1 | 0.6741969 | 0.2552095 | 0.7292981 |
Cluster number 2 | | 0.6691599 | 0.6666628 |
Cluster number 3 | 0.7247197 | 0.7841533 | 0.8645128 |
Saturday and Sunday
Saturday (rows) vs. Sunday (columns) | Cluster number 1 | Cluster number 2 | Cluster number 3 |
---|---|---|---|
Cluster number 1 | 0.8859214 | 0.8585393 | |
Cluster number 2 | 0.8948805 | 0.8840648 | 0.7977545 |
Cluster description for the different average days considered: (a) Monday; (b) Tuesday/Wednesday; (c) Thursday; (d) Friday; (e) Saturday; (f) Sunday.
When comparing the daily pattern for the clusters of Monday against Tuesday/Wednesday (see Table
When comparing Thursday against Friday, we find that only Cluster number 2 for Thursday and Cluster number 1 for Friday present a high correlation (~94%).
Finally, when comparing Saturday against Sunday, we find that Cluster number 1 for Saturday and Cluster number 3 for Sunday present a good degree of matching (~94%), so these two clusters can also be represented through the same daily pattern.
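The comparison between cluster patterns can be sketched in Python as the Pearson correlation of two 24-hour profiles restricted to the 7 a.m. to 9 p.m. period. The merging threshold of 0.9 is a hypothetical value chosen for illustration, since only correlations of about 94% are merged above; the function names and profiles are also hypothetical.

```python
import math


def pearson(u, v):
    """Pearson correlation coefficient of two equally long sequences."""
    mean_u = sum(u) / len(u)
    mean_v = sum(v) / len(v)
    du = [a - mean_u for a in u]
    dv = [b - mean_v for b in v]
    norm = math.sqrt(sum(a * a for a in du)) * math.sqrt(sum(b * b for b in dv))
    return sum(a * b for a, b in zip(du, dv)) / norm


def daytime_correlation(pattern_a, pattern_b, start_hour=7, end_hour=21):
    """Correlate two 24-hour patterns over the hourly slots from 7:00 to 21:00."""
    return pearson(pattern_a[start_hour:end_hour], pattern_b[start_hour:end_hour])


def should_merge(pattern_a, pattern_b, threshold=0.9):
    """Hypothetical rule: merge two cluster patterns when they correlate strongly."""
    return daytime_correlation(pattern_a, pattern_b) >= threshold


# Hypothetical normalized patterns, one value per hour of the day.
base = [1, 1, 1, 1, 1, 2, 3, 5, 8, 7, 6, 6, 7, 8, 7, 6, 6, 7, 8, 7, 5, 4, 3, 2]
scaled = [9.0] * 7 + [1.5 * x for x in base[7:]]           # differs only at night
reversed_day = base[:7] + base[7:21][::-1] + base[21:]     # daytime shape mirrored
```

Here `scaled` correlates perfectly with `base` during the daytime window despite its different night values, so the two patterns would be merged, whereas `reversed_day` would be kept as a separate pattern.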
In this section we assess the benefits of our model in terms of the minimum number of patterns required to adequately describe traffic intensity throughout the day for the city of Valencia. Then, we detail how these different models obtained can be integrated in our traffic management platform to predict route costs. Finally we summarize our proposal by presenting the proposed heuristic in pseudocode format to allow generalizing the proposed procedure to any target city.
Below we discuss the different aggregation techniques that make up our heuristic, based on the previous analysis.
Based on aforementioned aggregation proposals for the city of Valencia, in Table
Benefits of the proposed heuristic in terms of aggregation gain.
Target heuristic | Number of elements | Aggregation gains | Independent modeling domains |
---|---|---|---|
Monthly patterns per year | 12 | 12 : 3 | 3 |
Daily patterns per month | 421 | 30 : 7 | 12 |
Traffic intensity analysis | | 421 : 292 | |
Street clustering | | (292 | |
Similar days clustering | | 18 : 16 | |
Daily pattern analysis | | 16 : 12 | |
Total | | | |
Overall, the proposed heuristic allows reducing the required number of interpolation functions for the city of Valencia by a factor of 4210 while maintaining the essence of time-dependent modeling requirements. Such a significant reduction simplifies the integration of these models in our ABATIS platform and accelerates the associated calculations. This way, route decisions are made in a centralized route server based on traffic state predictions throughout the day and for the different streets/avenues of a city, thus providing the most time-efficient routes.
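The overall factor can be checked with a short calculation. The Python sketch below follows one reading of the table that is consistent with the reported value: 421 streets modeled for each of the 30 days of a month and for 12 months, against 12 daily patterns for each of 3 groups of months. The variable names are hypothetical.

```python
streets = 421             # monitored streets/avenues
days_per_month = 30       # daily patterns per street and month
months_per_year = 12      # monthly patterns per year

month_groups = 3          # independent monthly domains after aggregation
patterns_per_group = 12   # daily patterns left after the clustering steps

original_models = streets * days_per_month * months_per_year   # 151560
reduced_models = month_groups * patterns_per_group             # 36
reduction_factor = original_models / reduced_models
```

Under these assumptions, 151560 candidate interpolation functions are reduced to 36, which corresponds to the factor of 4210.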
The relationship between traffic flow levels and average travel speed is a well-known topic in traffic flow theory [
Relationship between vehicle flow level and vehicle speed.
As expected, average travel speed starts to decay when traffic density per lane increases beyond a certain threshold and becomes close to zero when approaching the maximum road capacity.
Since our models required normalizing the traffic levels of each street in order to aggregate similar patterns, the travel time for a given street, starting at the instant of time at which a vehicle is expected to enter it, is calculated in the four steps listed below. Note that, for simplicity, we omit from the variables the subindexes corresponding to the given street and instant of time.
(i) Find the normalized traffic intensity (pattern)
(ii) Obtain the expected traffic flow level
(iii) Based on the average free-flow speed
(iv) Calculate the travel time
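The four steps above can be sketched in Python as follows. The sketch makes two assumptions that should be read as placeholders: the expected flow is recovered by multiplying the normalized pattern by the mean hourly flow of the street, and the speed reduction follows the well-known BPR volume-delay function (with its usual parameters 0.15 and 4) instead of the flow-speed relationship used in our platform. All names and numeric values are hypothetical.

```python
def travel_time_seconds(pattern, hour, mean_hourly_flow, capacity,
                        length_m, free_flow_speed_kmh,
                        alpha=0.15, beta=4.0):
    """Estimate the time needed to traverse a street entered at a given hour."""
    # Step (i): normalized traffic intensity given by the daily pattern.
    normalized = pattern[hour]
    # Step (ii): expected flow level (vehicles/hour), undoing the normalization.
    flow = normalized * mean_hourly_flow
    # Step (iii): speed derived from the free-flow speed (assumed BPR relation).
    speed_kmh = free_flow_speed_kmh / (1.0 + alpha * (flow / capacity) ** beta)
    # Step (iv): travel time, converting km/h to m/s.
    return length_m / (speed_kmh / 3.6)


# Hypothetical street: 500 m long, 50 km/h free-flow speed, 1200 vehicles/hour capacity.
pattern = [1.0] * 24
pattern[3] = 0.2    # quiet night hour
pattern[8] = 1.5    # morning peak
t_peak = travel_time_seconds(pattern, 8, 800, 1200, 500, 50)
t_night = travel_time_seconds(pattern, 3, 800, 1200, 500, 50)
```

At the morning peak the expected flow equals the assumed capacity, so the traversal takes about 41.4 s instead of the 36 s required at free-flow speed, while the night value stays very close to 36 s.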
Notice that, since the ABATIS platform is able to offer, among others, Traffic Management as a Service, it is able to serve optimal routes to clients. Currently, route costs are calculated using free-flow speeds. Thus, the proposed models can be integrated in the route calculation engine so that optimality conditions now account for the updated path costs using our predictive model. In addition, if the current status of the traffic flow is available in the future, it can be combined with the predicted value to further improve path cost accuracy.
Let
Algorithm
input: 3D array of traffic density per street, per hour, per day
output: pattern-dependent cluster classification
BEGIN
    for each street in All_streets do
        if (peak_traffic_intensity in
            remove street from All_streets

    for each Week_day in WEEK_DAY do
        average_Week_day = get_average_pattern(Week_day)
        clusters
        mean_clusters = get_average(clusters[Week_day])
        median_clusters = get_median(clusters[Week_day])
        if (clusters[average_Week_day] == round(mean_clusters)) then
            num_clusters[Week_day] = clusters[average_Week_day]
        else
            num_clusters[Week_day] = round(get_average(mean_clusters, median_clusters, clusters[average_Week_day]))

    for all Week_day pairs (
        where num_clusters[
        if (Matching(cluster_elements(
            then pattern[

    for all Week_day pairs (
        for all clusters
            if (correlation(average_street(
                then pattern[

    RETURN cluster pattern classification
END algorithm
Traffic management has evolved substantially in the last decades. Nowadays, traffic engineers require effective solutions to help them improve the traffic flow in cities, while minimizing travel times and tackling traffic-related problems such as CO2 emissions, noise, and accidents.
In this paper we defined a procedure to obtain reliable traffic congestion estimations for all the streets/avenues in a city for the different times of the day and for every day in a year. Considering the modeling effort required, we proposed a heuristic that reduces the number of interpolation functions required to characterize daily traffic patterns.
By specifically addressing the city of Valencia, we made a detailed analysis of traffic behavior on the different streets/avenues of the city to determine (i) the behavior along the year, (ii) which days of the week show a similar pattern, (iii) which streets/avenues experience more traffic congestion, and (iv) how streets can be grouped into clusters based on their daily traffic pattern. The results of our analysis show that it is possible to model the traffic behavior in the city by aggregating elements with a similar behavior in the same interpolation function. This way, we will be able to account for the travel time variations along the main paths of a city, providing users with both optimized and accurate travel plans, while reducing the modeling complexity.
As future work we will develop a smartphone application that interacts with the ABATIS platform in order to obtain the most efficient routes, and we will implement a route planning algorithm that allows selecting these best paths while accounting for time-dependencies, FIFO restrictions, turn penalties, and so forth.
The authors declare that there is no conflict of interests regarding the publication of this paper.
This work was partially supported by Valencia’s Traffic Management Department and by the “Ministerio de Economía y Competitividad, Programa Estatal de Investigación, Desarrollo e Innovación Orientada a los Retos de la Sociedad, Proyectos I+D+I 2014,” Spain, under Grant TEC2014-52690-R.