Traffic State Estimation Using Connected Vehicles and Stationary Detectors

Real-time traffic state estimation is important for efficient traffic management. This is especially the case for traffic management systems that require fast detection of changes in the traffic conditions in order to apply an effective control measure. In this paper, we propose a method for estimating the traffic state, in terms of speed and density, by using connected vehicles combined with stationary detectors. The aim is to allow fast and accurate estimation of changes in the traffic conditions. The proposed method only requires information about the speed and the position of connected vehicles and can make use of sparsely located stationary detectors to limit the dependence on the infrastructure equipment. An evaluation of the proposed method is carried out by microscopic traffic simulation. The traffic state estimated using the proposed method is compared to the true simulated traffic state. Further, the density estimates are compared to density estimates from one detector-based method, one combined method, and one connected-vehicle-based method. The results of the study show that the proposed method is a promising alternative for estimating the traffic state in traffic management applications.


Introduction
Density, speed, and flow are important measures for describing the characteristics of the traffic on a road segment. The density of traffic is defined as the number of vehicles located on a road segment per unit length of the segment. The speed can be defined either as the time mean speed, which is the mean speed of all vehicles passing a specific location within a given time interval, or as the space mean speed, which is the mean speed of all vehicles travelling over a road segment at a certain point in time. Finally, the flow is defined as the number of vehicles passing a specific location within a given time interval. The three measures are commonly referred to as the traffic state. The traffic state is of special interest for traffic management systems based on automatic control. Examples of such systems are variable speed limit systems and ramp metering. The purpose of the traffic management system is to improve the traffic conditions during, for example, congested periods and incidents. The systems should be able to detect abrupt changes in the traffic state, such as lower speeds and flows and higher densities, and use this information as input to apply a suitable control strategy. Thus, representative real-time estimation of the traffic state is of importance.
The traditional way to measure and estimate the traffic state is by the use of stationary equipment, for example, loop detectors and radars. Due to recent development in vehicle technology, different types of connected vehicles are being introduced and the expectation is that in 2020 75% of newly produced vehicles will be equipped with technology that enables the possibility to connect to the surroundings [1]. The connected vehicles facilitate communication between vehicles and between vehicles and the infrastructure. This allows for frequent updates of individual vehicle measures such as their speed and position. Hence, it is possible to use connected vehicles in combination with stationary detectors, or as a standalone data source, to estimate the traffic state. This can also result in improved spatial estimates instead of the traditionally used point estimates.
In this study, we propose a method for estimating the traffic state based on vehicle-to-infrastructure communication. The required information is speed and positioning measurements from connected vehicles in combination with counts from stationary detectors. By assuming that the connected vehicles have the same distribution of speed as regular vehicles, the speed is estimated as an average of the speeds of the connected vehicles. The only connected vehicle data needed to estimate the density is information about the current road segment of the connected vehicles. This makes the method robust with respect to errors in the positioning data. Each connected vehicle continuously communicates its location, which is used to estimate the total number of connected vehicles on a specific road segment. Further, the number of connected vehicles passing stationary detectors is, together with the total number of passing vehicles, used to estimate the penetration rate of connected vehicles. Thereby, the total number of vehicles located on a segment can be estimated.
Our hypotheses are that the proposed method will result in (1) density and speed estimates that can capture the current traffic conditions on the road, (2) a possibility to use more sparsely placed stationary detectors without considerably reducing the performance of the density estimation, given that the share of connected vehicles is assumed to be approximately the same over the road stretch, and (3) precise and fast detection of changes in the traffic state, especially for higher penetration rates of connected vehicles. In real applications, the traffic state estimation is one component of a larger model system, often including both data assimilation and fusion techniques. The aim of this paper is to find a straightforward approach to estimate the density and speed by the use of connected vehicles and investigate how well the traffic state estimation can capture the actual traffic situation as a first step towards using the proposed method as such a component in traffic management applications.
To study these hypotheses, the proposed method is evaluated by the use of microscopic traffic simulation. The density estimates are compared to one detector-based method, one combined method, and one connected-vehicle-based method. The comparisons are done in order to investigate if the proposed method gives estimates that are comparable to estimates of existing methods. A simulation scenario with an incident is analyzed to study how well the proposed method can capture abrupt changes in the traffic state. Two different distances between the detectors are applied to examine how the method performs with sparsely placed detectors. To isolate the effects related to the method, we use a simple design of the road network and assume that no measurement errors exist in detector and connected vehicle data.
The remainder of the paper is organized as follows. In Section 2, an overview of traffic state estimation methods is given with focus on density estimation. The proposed method for estimating the traffic state based on connected vehicles in combination with stationary detectors is presented in Section 3. The simulation setup and the evaluation method are described in Section 4, including an overview of the methods used for comparison. In Section 5, the performance of the proposed method is presented and compared to other methods. Finally, conclusions from the study and directions for further research are discussed in Section 6.

Traffic State Estimation Using Connected Vehicles
The traditional way to estimate the traffic state is by the use of stationary detectors such as loop detectors and radar detectors, as described by Kurkjian et al. [2], Coifman [3], and Singh and Li [4]. This limits the estimation to specific points in space, and the conditions in between detectors remain unknown. Hence, the estimation will be a good representation of the traffic state on the road section only under steady-state conditions, that is, when there is no change in the traffic conditions in space and time. This is usually not the case, particularly not for bottlenecks or during incidents, and therefore the density estimated with these methods will most probably deviate from the true density on the section. Hence, to give enough information about the traffic conditions on a longer road section, detector-based methods require densely placed detectors (see, e.g., the method proposed by Singh and Li [4]). Data assimilation and fusion techniques including a traffic model are common methods to get a picture of the traffic state also in between the detectors. A number of studies using different underlying traffic models and different filtering approaches exist in the literature (see, e.g., Kurkjian et al. [2], Muñoz et al. [5], Wang and Papageorgiou [6], Mihaylova et al. [7], Singh and Li [4], and Duret et al. [8]).
Methods have also been proposed where no underlying model and no filtering are needed. See, for example, Coifman [3], where reidentification of vehicles and the vehicle conservation law are used to estimate the density between two detectors. Darwish and Bakar [9] conclude that methods using different types of stationary detectors can estimate the traffic state accurately but they are often expensive to install and maintain and are limited to small areas. Also, the information is usually transmitted with delay, since it has to be processed through a traffic information center. Lately, as more data sources have become available, connected vehicles have been used as input to the filtering approaches in order to update the modeled traffic state. See, for example, Herrera and Bayen [10], Work et al. [11], Yuan et al. [12], Seo et al. [13], Astarita et al. [14], and Bekiaris-Liberis et al. [15]. Other traffic state estimation methods making use of connected vehicle data without an underlying traffic model are presented by Herring et al. [16], Herrera et al. [17], Van Lint and Hoogendoorn [18], Qiu et al. [19], Ma et al. [20], Bhaskar et al. [21], Zhang et al. [22], Seo et al. [23], and Montero et al. [24].
When speed measurements from connected vehicles are available, the speed can be estimated by calculating an average of the speeds of the connected vehicles. This requires that the connected vehicles have the same distribution of speeds as regular vehicles, similar to what has been done in the works of Astarita et al. [14] and Bekiaris-Liberis et al. [15]. Otherwise, the speed estimate would be biased towards the average speed of the connected vehicles.
The density estimate using connected vehicles requires some more calculations. One way of estimating the density is by using connected vehicles together with traditional stationary detectors, here referred to as combined methods. For the combined methods, a weighted estimate based on both traditional detector measurements and connected vehicle measurements of, for example, speed, travel time, and/or location is used. Examples are the methods presented by Astarita et al. [14], Qiu et al. [19], Ma et al. [20], Bhaskar et al. [21], Zhang et al. [22], and Bekiaris-Liberis et al. [15]. The method by Qiu et al. [19], which was later extended by Ma et al. [20], detects the number of vehicles located within a segment. The density estimate is calculated by counting the number of vehicles that have passed the detector upstream of the segment at the times when a connected vehicle enters and exits a segment. According to Qiu et al. [19], the accuracy of the density estimates is better than when only stationary detector data is used to estimate density. Zhang et al. [22] use probe vehicle data and detector stations in order to estimate the space mean speed, and not density, on a road stretch. Astarita et al. [14] and Bekiaris-Liberis et al. [15] develop macroscopic cell transmission type models for the dynamics of the percentage of connected vehicles along the considered road. It is assumed that the connected vehicles move with the same average speed as the nonconnected vehicles, and hence no modeling of the speed dynamics is needed. In the work of Astarita et al. [14], the density is estimated by the percentage of connected vehicles based on counts of connected vehicles moving from one segment to another in the network and inflows measured at the ramps and at the boundaries of the network. Similarly, Bekiaris-Liberis et al. [15] use the penetration rate of connected vehicles, together with measurements of speed from the connected vehicles, the boundary flow, and the ramp flows measured through stationary detectors, to estimate the density.
The combined methods also often require densely placed detectors to get good estimates. The methods are often based on retrospective measurements, such as travel time at an earlier point in time (see, e.g., Qiu et al. [19] and Ma et al. [20]). As a result, the density estimate might not reflect the current situation.
For the connected-vehicle-based methods, the connected vehicles are used to capture the surrounding traffic conditions at every point in space, and the methods are therefore not limited to fixed locations. Recent studies by Seo et al. [13,23] investigate how the gap to a leading vehicle can be used to estimate the density on a road segment. The same method has been applied to an urban area in Montero et al. [24]. Further, the method is extended to also include measurements of the gap to the following vehicle. Another method makes use of vehicle spacings and speed as input data for estimating density [12]. This method requires a numerical model of the relationship between the traffic states to describe the changes in speed, flow, and density. Finally, Seo and Kusakabe [25] propose a method based on the number of vehicles located in between two connected vehicles to estimate the traffic conditions on the road. For methods using only connected vehicle data, the measurements used to estimate density are local, only including the connected vehicle and its surroundings, which might not necessarily reflect the density on a larger section of the road. Also, the methods often require identification of current lane, speed of the vehicle, distance to vehicle in front, and so forth.
To conclude, detector-based density estimation techniques make use of measurements from stationary detectors, usually consisting of flow and speed, to estimate density. However, the density estimates using stationary detectors are based on point estimates and are therefore limited due to the fact that the conditions in between detectors are not known. Therefore, the methods require densely spaced detectors to give density estimates that correspond well with the traffic conditions on the road. The detector-based method can be improved by including connected vehicle data. For combined methods, the need for continuous updates from connected vehicles can be limited; that is, low frequency data communicated from the connected vehicles is sufficient. However, the combined methods usually still require densely spaced detectors and the information is sometimes based on retrospective connected vehicle measurements. The density estimates using only connected vehicle data are based on local density estimates including precise estimates of the density surrounding the connected vehicle. The connected vehicle density estimates are transmitted for further processing and converted to a density estimate of a larger area by including estimates from many connected vehicles. Hence, a representative density estimate can often only be reached with a high connected vehicle penetration rate or for a high flow level. Further, the connected vehicles are assumed to be able to continuously transmit information about their location, speed, gap to preceding vehicle, and so forth.

A Method for Estimating the Traffic State by Using Connected Vehicle and Detector Data
We propose a combined method for estimating the traffic state on the road. The connected vehicle data is based on vehicle-to-infrastructure communication. The method is straightforward and it is possible to estimate the speed and density accurately based on limited information from the connected vehicles. Measurements from sparsely placed stationary detectors can be used without reducing the performance of the estimates. The purpose of the method is to get fast and representative traffic state estimates that can also be used to identify changes in the traffic conditions. Before introducing the method, a few essential assumptions are given:
(i) The connected vehicles are able to report their position, including information about their current road segment and speed, with a frequency of 1 Hz.
(ii) The stationary detectors are able to count and report the total number of vehicles, $N$, and the total number of connected vehicles, $M$, passing the detector within a given aggregation time period, $T$.
(iii) The connected vehicles are assumed to have the same distribution of speeds as the nonconnected vehicles.
Hence, neither the equipment used for communication of information for the connected vehicle nor the type of stationary detector (radar, Bluetooth, loop, etc.) is prescribed, and both may vary as long as they fulfil the requirements presented above.
The traffic state estimation consists of two parts, a speed estimate and a density estimate for each segment on the considered road stretch. The speed estimates are based on simple calculations. It is assumed that the connected vehicles have the same distribution of speeds as the nonconnected vehicles. By communication of the individual speed, $v_j(t)$, of each connected vehicle $j$ located on segment $i$ at time $t$, the average speed of connected vehicles, $v_i^c(t)$, can be calculated. Then, the average speeds at time $t$ are averaged over the aggregation time period, $T$, to get the final speed estimate,
$$ v_i^c(t) = \frac{1}{n_i(t)} \sum_{j=1}^{n_i(t)} v_j(t), \qquad \hat{v}_i = \frac{1}{T} \sum_{t=1}^{T} v_i^c(t), $$
where $n_i(t)$ is the total number of connected vehicles on road segment $i$ at time $t$, and $t$ runs over the time steps within the aggregation time period.
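The two-step averaging described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function and argument names are hypothetical, and skipping time steps without any connected vehicle on the segment is an assumption made here.

```python
from statistics import mean


def estimate_segment_speed(speeds_per_step):
    """Speed estimate for one segment and one aggregation time period.

    speeds_per_step: one list per time step, holding the reported speeds
    of the connected vehicles located on the segment at that step.
    Returns None when no connected vehicle was observed at all.
    """
    # Step 1: average over the connected vehicles at each time step.
    # Assumption: time steps without connected vehicles are skipped.
    step_means = [mean(speeds) for speeds in speeds_per_step if speeds]
    if not step_means:
        return None
    # Step 2: average the per-step means over the aggregation period.
    return mean(step_means)
```

For example, `estimate_segment_speed([[20.0, 30.0], [], [10.0]])` averages the per-step means 25.0 and 10.0.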
The density estimates make use of the position of each connected vehicle to get the total number of connected vehicles, $n_i(t)$, at each road segment, $i$, and for each time step, $t$. The average number of connected vehicles, $\bar{n}_i$, in segment $i$ and for the aggregation time period, $T$, is used to estimate the total density. The stationary detector data is used to estimate the penetration rate for segment $i$ as the number of connected vehicles, $M_k$, divided by the total number of vehicles, $N_k$, both counted at detector station $k$. The penetration rate at the detector station located just upstream of segment $i$ is used as input for the density estimate at segment $i$. The density estimate becomes
$$ \hat{\rho}_i = \frac{\bar{n}_i}{L_i} \cdot \frac{N_k}{M_k}, \qquad \bar{n}_i = \frac{1}{T} \sum_{t=1}^{T} n_i(t), $$
where $L_i$ is the length of segment $i$. New speed and density estimates become available after each aggregation time period $T$, based on the measurements within that period; hence, the estimates vary with time. The temporal indices in the density estimates have been suppressed to increase readability. The method is hereafter referred to as the Count Connected Vehicle (CCV) method. By assuming that the penetration rate is constant over a longer road section, detectors can be sparsely placed in order to place as few requirements on the detector infrastructure as possible. In this case, each estimate of the penetration rate is applied to many segments before a new estimate of the penetration rate becomes available.
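A minimal sketch of the density calculation described above is given below. The function and argument names are hypothetical, and the normalization by segment length (to obtain vehicles per kilometre) as well as the handling of empty detector counts are assumptions made for this illustration.

```python
def estimate_density_ccv(connected_per_step, connected_count, total_count,
                         segment_length_km):
    """Density estimate (veh/km) for one segment and aggregation period.

    connected_per_step: number of connected vehicles on the segment at
        each time step of the aggregation period.
    connected_count, total_count: connected and total vehicles counted
        at the detector just upstream of the segment in the same period.
    Returns None when the penetration rate cannot be estimated.
    """
    if not connected_per_step or connected_count == 0 or total_count == 0:
        return None  # assumption: no estimate without detector counts
    # Average number of connected vehicles on the segment.
    mean_connected = sum(connected_per_step) / len(connected_per_step)
    # Scale by the inverse penetration rate to get all vehicles.
    vehicles_on_segment = mean_connected * total_count / connected_count
    return vehicles_on_segment / segment_length_km
```

With, for instance, an average of 3 connected vehicles on a 0.5 km segment and 10 connected out of 50 counted vehicles, the sketch returns 30 veh/km.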
The information from the connected vehicles and the detectors is collected at each time step and communicated to a central unit, where it is processed, resulting in time-dependent speed and density estimates. Finally, the estimates are aggregated over the aggregation time period $T$. Figure 1 gives an illustration of the process. The local units are the individual connected vehicles and the detectors. The central unit can, for example, be a roadside unit or a traffic management center used for further processing of data.

Evaluation Method
The analysis is divided into three parts. First, CCV is evaluated with respect to the aggregation time period, $T$, where the aggregation time period is defined as the time interval over which the traffic state is estimated. The performance of the traffic state estimation is examined for four different time periods. This will indicate for which aggregation time periods the method gives traffic state estimates that are representative for the traffic conditions on the road. Both density and speed estimates are considered. Second, the density estimates of the CCV are compared to similar methods: one method using only stationary detector data, one combined method, and one method using only connected vehicle data. The comparison is used to evaluate if the density estimates of the proposed method are comparable to those of detector-based, combined, and connected-vehicle-based methods found in the literature. Finally, it is investigated how well CCV manages to detect changes in the traffic states at different distances between detectors. Two cases are included with detectors placed 500 and 2500 meters away from each other. The results are compared to another combined method. This section describes the methods used for comparison, the simulation setup in a microscopic traffic simulation environment, including the choice of parameters, and the performance indicators used for evaluation.

Comparison of the Density Estimates.
The density estimates are somewhat more complex to calculate compared to the speed estimates. Hence, methods with the same level of complexity are chosen for comparison. However, the methods require different types of data to estimate the density and include one stationary-detector-based method, SD, one combined method, CC [19,20], and one connected-vehicle-based method, GAP [23]. The required measurements for the different density estimation methods are summarized in Table 1. The measurements are divided into stationary-detector-based measurements and connected-vehicle-based measurements. The methods used for comparison with CCV are described below.
[Figure 1: Data flow from the local units to the central unit. (1) Connected vehicle raw data. (2) Detector raw data: (i) $N_k(t)$, number of vehicles; (ii) $M_k(t)$, number of connected vehicles. Central unit: processing of data.]
A Detector-Based Method: the Stationary Detector (SD) Method.
The detector-based method uses speed and flow measurements to estimate the density by the fundamental relationship
$$ \hat{\rho}_i = \frac{q_k}{v_k}, $$
where $q_k$ and $v_k$ are the mean flow and harmonic mean speed, respectively, detected at a detector station $k$ located just upstream of segment $i$ and for the aggregation time period $T$. The method is hereafter referred to as the stationary detector (SD) method. An alternative to using flow and speed measurements is to use the detector occupancy, that is, the amount of time a detector is occupied by vehicles. The occupancy can then be translated to density based on an estimate of the vehicle length. However, also by using occupancy, the resulting density will be represented only at a specific point. Further, the length of the vehicles has to be estimated or assumed.
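The flow and speed variant of the detector-based estimate can be sketched as below. The names are hypothetical; the sketch assumes that the detector reports one strictly positive spot speed per passing vehicle and that the flow is obtained by scaling the vehicle count to an hourly rate.

```python
from statistics import harmonic_mean


def estimate_density_sd(spot_speeds_kmh, period_s):
    """Detector-based density estimate (veh/km) from one detector.

    spot_speeds_kmh: speed of each vehicle passing the detector during
        the aggregation period (assumed strictly positive).
    period_s: length of the aggregation period in seconds.
    Returns None when no vehicle passed, since the speed is undefined.
    """
    if not spot_speeds_kmh:
        return None
    # Mean flow in vehicles per hour over the aggregation period.
    flow_veh_per_h = len(spot_speeds_kmh) * 3600.0 / period_s
    # The harmonic mean of spot speeds approximates the space mean speed.
    speed_kmh = harmonic_mean(spot_speeds_kmh)
    # Fundamental relationship: density = flow / speed.
    return flow_veh_per_h / speed_kmh
```

Three vehicles passing at 60, 60, and 120 km/h within 60 seconds give a flow of 180 veh/h and a harmonic mean speed of 72 km/h, that is, a density of 2.5 veh/km.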
A Combined Method: the Cumulative Count (CC) Method.
The combined method is presented by Qiu et al. [19] and later extended by Ma et al. [20] and is here referred to as the Cumulative Count (CC) method. The method detects the number of vehicles located within segment $i$. Let $N(t_j^i)$ be the number of vehicles that have been detected up until time $t_j^i$, where $t_j^i$ is the time connected vehicle $j$ enters segment $i$. The density estimate for connected vehicle $j$ is given by counting the number of vehicles, $N(t_j^{i+1}) - N(t_j^i)$, which have passed the detector upstream of the segment between the time when connected vehicle $j$ enters, $t_j^i$, and exits, $t_j^{i+1}$, the segment. The resulting density estimate becomes
$$ \hat{\rho}_i = \frac{1}{m_i} \sum_{j=1}^{m_i} \frac{N(t_j^{i+1}) - N(t_j^i)}{L_i}, $$
where $L_i$ is the length of segment $i$. Here, the total number of connected vehicles, $m_i$, which exit segment $i$ within the aggregation time period $T$ is used to get an average density estimate. This means that the travel time for each connected vehicle within the segment can be longer than the aggregation time period. Further, no vehicles are assumed to overtake the connected vehicle. This means that, on a segment with more than one lane, overtaking will result in a deviation from the actual density.
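A minimal sketch of the cumulative count idea is shown below: for each connected vehicle, the vehicles that pass the upstream detector between its entry to and exit from the segment are counted, and the counts are averaged. The names are hypothetical, and the normalization by segment length is an assumption of this illustration rather than a detail taken from [19,20].

```python
from bisect import bisect_right


def cumulative_count(passage_times, t):
    """Vehicles detected up to and including time t (sorted input)."""
    return bisect_right(passage_times, t)


def estimate_density_cc(passage_times, traversals, segment_length_km):
    """Cumulative-count density estimate (veh/km) for one segment.

    passage_times: sorted passage times at the upstream detector.
    traversals: (entry_time, exit_time) of each connected vehicle that
        exited the segment within the aggregation period.
    Returns None when no connected vehicle exited in the period.
    """
    if not traversals:
        return None
    counts = [
        cumulative_count(passage_times, t_exit)
        - cumulative_count(passage_times, t_enter)
        for t_enter, t_exit in traversals
    ]
    # Average over connected vehicles, then normalize by length.
    return sum(counts) / len(counts) / segment_length_km
```

If overtaking occurs, the vehicles counted this way are no longer exactly those located within the segment, which is the deviation noted above.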
A Connected-Vehicle-Based Method: the GAP Method.
The connected-vehicle-based method is presented by Seo et al. [23] and is referred to as the GAP method. In this method, connected vehicles are assumed to measure and communicate the gap to their leader, their position on the road, and the time of the measurement. This means that the trajectory of a connected vehicle, and its leader, within the segment has to be recorded in order to estimate the density. It should be noted that the measurement range of the gap between a connected vehicle and its leader has to be considered. If the gap is assumed to be measured through local on-board equipment, longer gaps might not be included due to a limited measurement range. This will cause larger gaps to be excluded from the density estimations, and as a result, the uncertainty in the estimations will increase. Further, the gap behind the connected vehicle is not included, which is of importance during platooning, since the gap between the connected vehicle and the vehicle behind can be long and may not even be in the same segment. Since this gap is excluded, the density estimate becomes higher than the actual density on the road. Also, the frequency at which the connected vehicle data is transmitted is important. Between each time of communication for the connected vehicles, a linear relation in measured data is assumed. This can result in discretization errors that are larger at lower transmission frequencies.
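As a sketch of the form such a gap-based estimator takes, the following expression follows Edie's generalized definitions applied to the time-space region between each connected vehicle and its leader. The notation is introduced here for illustration and is not taken from this paper; the exact formulation should be checked against [23].

```latex
% A: time-space region (segment length x aggregation time period)
% P(A): set of connected vehicles observed in A
% t_j(A): time spent by connected vehicle j inside A
% |a_j(A)|: area of the time-space region between vehicle j and its
%           leader inside A (gap integrated over time)
\hat{\rho}(A) = \frac{\sum_{j \in P(A)} t_j(A)}{\sum_{j \in P(A)} |a_j(A)|}
```

In this form, a gap that exceeds the measurement range removes the corresponding area from the denominator, which is consistent with the overestimation of density discussed above.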

Simulation Setup.
The proposed method, as well as two of the comparison methods, requires identification and communication of information from connected vehicles. Microscopic traffic simulators describe individual vehicles in the traffic stream, allowing for gathering of information from single vehicles within the simulation. Thus, microscopic traffic simulation is suitable for analysis of the proposed method. In this study, we use the open-source microscopic traffic simulation tool SUMO (version 0.27.1) [26,27]. SUMO is multimodal, space continuous, and time discrete. The car-following model used to model vehicle interactions is a further development of the work by Krauß [28] and is based on the calculation of a safe speed, comparable to the approach of Gipps [29]. The lane-changing model is rule-based. The core model in SUMO is further described by Krajzewicz [30]. The connected vehicle and stationary detector data are accessed during the simulation through SUMO's Traffic Control Interface (TraCI). Python scripts [31] are used to implement the traffic state estimation methods.
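A hedged sketch of how connected vehicle data might be read through TraCI at each simulation step is given below. It is not the authors' script: the helper name, the set of connected vehicle IDs, and the configuration file name are hypothetical, and the call names follow the current TraCI Python client, which may differ from the client shipped with SUMO 0.27.1.

```python
from collections import defaultdict


def collect_connected_data(traci_module, connected_ids):
    """Return {edge_id: [speed, ...]} for the connected vehicles.

    traci_module: the imported traci module (passed in so that the
        helper can also be exercised with a stand-in object).
    connected_ids: set of vehicle IDs treated as connected (assumption).
    """
    speeds_by_edge = defaultdict(list)
    for veh_id in traci_module.vehicle.getIDList():
        if veh_id in connected_ids:
            edge_id = traci_module.vehicle.getRoadID(veh_id)
            speed = traci_module.vehicle.getSpeed(veh_id)  # m/s
            speeds_by_edge[edge_id].append(speed)
    return dict(speeds_by_edge)


# Possible usage with a running SUMO instance (not executed here):
#   import traci
#   traci.start(["sumo", "-c", "scenario.sumocfg"])  # hypothetical file
#   while traci.simulation.getMinExpectedNumber() > 0:
#       traci.simulationStep()
#       data = collect_connected_data(traci, connected_ids)
#   traci.close()
```

Passing the module as an argument keeps the helper independent of a running simulation, so the per-step data handling can be checked separately from SUMO.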
The simulated scenario consists of a one-directional two-lane motorway, divided into ten 500-meter segments. Further, a segment for loading of vehicles and an end segment are included to avoid boundary effects, resulting in a 6 km long simulated road. The maximum allowed speed on the road is assumed to be 100 km/h. Simulations are performed with a flow pattern taken from flow measurements on a two-lane motorway in Stockholm during afternoon peak hours on a normal weekday (see Figure 2). The simulation is performed for a period of 3.75 hours, excluding a warm-up period of 5 minutes to avoid loading effects. Further, eight different connected vehicle penetration rates are investigated. For the investigation of different distances between detectors, abrupt changes in the traffic state are required. Changes in the traffic state are modeled as an incident by letting one vehicle halt on a road segment for ten minutes after one hour. This results in a temporary drop in capacity, which considerably changes the traffic conditions.

Vehicle Parameters.
The data used for calibration is collected through stationary radar detectors from four different locations on a two-lane urban motorway in Stockholm. The measurement from each detector consists of speed and flow measurements averaged over 15 minutes from three days in April 2016, with a typical pattern and no larger incidents reported. The flow profile is presented in Figure 2. The speed measurements, averaged over the three days, at the different detector locations and for different time instants are given in Figure 3(a). However, the original road stretch includes some on- and off-ramps, which have been excluded in this simulation study. The reason for this is to isolate the effects of the method by using a simple simulation scenario before trying more complex scenarios. Additionally, vehicle data at free flow conditions for four different vehicle classes are available for calibration of the desired speed distribution. The composition of vehicles and the speed distribution for each vehicle class are given in Table 2.
The calibration of a SUMO model of the two-lane motorway in Stockholm has resulted in calibrated vehicle parameters that are applied for this study. Figure 3 gives an overview of the speed at the four detectors and at 15 min intervals for the measurements (a), the uncalibrated simulated measurements in SUMO (b), and the calibrated measurements in SUMO (c). The speed ranges from 0 km/h (red) to 110 km/h (blue). The default parameters and the calibrated parameters in SUMO are given in Table 3. The lane-changing parameters, lcCooperative and lcSpeedGain, have been changed substantially compared to the default values in SUMO. The reason for this is that by using the default values the throughput becomes much higher than observed on the motorway in Stockholm (see Figure 3(b)). Here, the lcCooperative parameter controls the degree of cooperation with other vehicles when performing a lane-change. A lower value results in decreased cooperation and thereby a decreased capacity on the road. The lcSpeedGain parameter controls the eagerness to perform a lane-change in order to gain speed. Hence, by increasing this parameter, the willingness to perform a lane-change is increased. This results in increased interaction between the vehicles due to the increased number of lane-changes, which in the end reduces the capacity on the road. Also, the willingness to keep right, controlled by the parameter lcKeepRight, has been increased for trucks and buses, corresponding to classes 3 and 4. The reason for this is that when the willingness to perform a lane-change is increased, the trucks and buses will also perform lane-changes more frequently, which is not the case in reality. The adjustments of the lane-changing parameters result in a throughput for the simulated scenario that is comparable to the available detector measurements.
In the car-following model, the acceleration and deceleration abilities, as well as reaction time and the driver imperfection, have been adjusted to correspond to the actual capacity on the road. From Figure 3(c), it can be concluded that, by applying the final calibrated parameters, the simulated scenario is able to reproduce the measured scenario in terms of mean speed levels.
Vehicles are generated with exponentially distributed headways. The connected vehicles are also assumed to be generated with exponentially distributed headways and uniformly distributed in the total flow and between vehicle types. The connected vehicle data is collected and transmitted with a frequency of 1 Hz. The connected vehicles are able to detect a vehicle in front at a maximum distance of 1500 meters.
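The vehicle generation described above can be illustrated with the following sketch. It is not how SUMO inserts vehicles internally; the function name, its arguments, and the fixed random seed are hypothetical. Marking each vehicle as connected independently with a fixed probability yields connected vehicles whose headways are also exponentially distributed.

```python
import random


def generate_arrivals(flow_veh_per_h, penetration_rate, duration_s, seed=0):
    """Return a list of (arrival_time_s, is_connected) tuples.

    Headways are drawn from an exponential distribution whose mean
    matches the requested flow (must be positive); each vehicle is
    marked as connected independently with the given probability.
    """
    rng = random.Random(seed)
    rate_per_s = flow_veh_per_h / 3600.0
    arrivals = []
    t = rng.expovariate(rate_per_s)
    while t < duration_s:
        is_connected = rng.random() < penetration_rate
        arrivals.append((t, is_connected))
        t += rng.expovariate(rate_per_s)
    return arrivals
```

For example, `generate_arrivals(1800, 0.2, 3600)` produces roughly 1800 arrivals over one hour, of which about one in five is marked as connected.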

Performance Indicators.
During the simulation, the number of vehicles within each segment and at each time step is counted and averaged over the aggregation time period to get the "true" simulated density and speed, hereafter referred to as the reference density. The reference density is compared to the density estimated using the proposed method, CCV, and the methods used for comparison (SD, CC, and GAP). The true simulated space mean speed is calculated by averaging the speeds of all vehicles located within a segment for the considered aggregation time period. The comparison of both density and speed estimates is done with respect to the Root Mean Square Error (RMSE) for the total number of observations $n$:
$$ \mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{l=1}^{n} \left( \mathrm{est}_l - \mathrm{obs}_l \right)^2}, $$
where $\mathrm{est}_l$ and $\mathrm{obs}_l$ are the estimate and the reference value for observation $l$, respectively. The estimate and the reference value are either speed or density. The difference between the estimated and the observed values is calculated for each segment and each aggregation time period, and the squared differences are summed to get the RMSE.
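The RMSE calculation can be sketched as below. The function name is hypothetical, and representing a missing estimate as `None` is an assumption of this illustration; excluding missing estimates from the RMSE follows the treatment described for low penetration rates in Section 5.

```python
import math


def rmse(estimates, references):
    """Root mean square error over paired observations.

    estimates: estimated speeds or densities, one per segment and
        aggregation period; None marks a missing estimate.
    references: the corresponding reference values.
    Returns None when every estimate is missing.
    """
    pairs = [(e, r) for e, r in zip(estimates, references) if e is not None]
    if not pairs:
        return None
    squared_errors = sum((e - r) ** 2 for e, r in pairs)
    return math.sqrt(squared_errors / len(pairs))
```

Because missing estimates shrink the number of observations, an RMSE computed from few remaining pairs should be read with care.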
The RMSE is an aggregate performance measure of the estimations based on the total number of observations and it will therefore not capture how changes in the traffic conditions are reflected in the density and speed estimates. Therefore, an evaluation of how well the density estimates of CCV capture the changes in the traffic conditions is done by examining the estimated density and the reference density in a time-space diagram for an incident scenario. Further, a comparison of the time-space diagram of the density estimates for CCV and the other combined method, CC, is done. The means and standard errors of the means are calculated based on 10 replications of the simulation for the different scenarios.
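The mean and the standard error of the mean over replications can be computed as in the following sketch. The function name is hypothetical, and the use of the sample standard deviation (dividing by the number of replications minus one) is an assumption, since the text does not state which variant is used.

```python
import math
from statistics import mean, stdev


def mean_and_sem(values):
    """Mean and standard error of the mean over simulation replications.

    values: one performance value (e.g., an RMSE) per replication.
    Uses the sample standard deviation, so at least two values are needed.
    """
    if len(values) < 2:
        raise ValueError("at least two replications are required")
    return mean(values), stdev(values) / math.sqrt(len(values))
```

With 10 replications, the standard error is the sample standard deviation divided by the square root of 10.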

Results
In this section, results from the simulation experiments are given. First, the aggregation time period is examined. This is followed by an investigation of the performance of CCV compared to SD, CC, and GAP. Finally, the ability of CCV to capture changes in the traffic conditions is presented.

Aggregation Time Period for CCV.
The aggregation time period has to be chosen carefully. With a short aggregation time period, changes in the density can be discovered quickly. On the other hand, the estimates become more uncertain because only a limited amount of data is available to base them on, especially at lower penetration rates. With a long aggregation time period, the traffic state estimation becomes more stable, but there is also a risk of smoothing out changes and thereby missing useful and relevant information. The performance of CCV for the aggregation time periods of 15, 30, 60, and 120 seconds is examined by comparing the estimated density and speed to the reference density and speed, as explained in Section 4.4.
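The trade-off between fast detection and smoothing can be illustrated with a minimal sketch that averages 1 Hz samples over non-overlapping windows. The function name and the step change in the example data are hypothetical.

```python
def aggregate(samples, period_s):
    """Average consecutive 1 Hz samples over non-overlapping windows of
    `period_s` seconds. A trailing partial window is dropped."""
    n_windows = len(samples) // period_s
    return [
        sum(samples[w * period_s:(w + 1) * period_s]) / period_s
        for w in range(n_windows)
    ]


# Hypothetical 1 Hz vehicle counts in a segment: 10 vehicles for 90 s,
# followed by an abrupt jump to 30 vehicles for 150 s.
counts = [10] * 90 + [30] * 150
for period in (15, 30, 60, 120):
    print(period, aggregate(counts, period))
```

In this example, the 15 second windows show the jump directly, whereas the first 120 second window mixes both levels and reports an intermediate value.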
At low penetration rates and for short aggregation time periods, it is often not possible to estimate the traffic state, since there are no connected vehicle measurements to base the estimation on. In this case, the missing estimates are excluded from the RMSE and the resulting RMSE becomes uncertain. To give an indication of the amount of missing estimates at different penetration rates, the mean percentage of missing estimates per segment is given in Figure 4. The mean RMSE for speed and density for different penetration rates is shown in Figure 5.
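The handling of missing estimates can be sketched as below, where a missing estimate is represented by None (a convention chosen for this sketch, not taken from the study). The function returns both the RMSE over the available estimates and the percentage of missing estimates.

```python
import math


def rmse_excluding_missing(estimates, references):
    """Return (rmse, percent_missing). Entries of `estimates` that are
    None are treated as missing and excluded from the RMSE."""
    pairs = [(e, r) for e, r in zip(estimates, references) if e is not None]
    percent_missing = 100.0 * (len(estimates) - len(pairs)) / len(estimates)
    if not pairs:
        return None, percent_missing
    mse = sum((e - r) ** 2 for e, r in pairs) / len(pairs)
    return math.sqrt(mse), percent_missing


# Hypothetical density estimates (veh/km) with two missing values.
print(rmse_excluding_missing([None, 18.0, None, 26.0],
                             [20.0, 20.0, 24.0, 24.0]))
```

Because the RMSE is then based on fewer observations, a low value at a low penetration rate does not by itself indicate a good estimator.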
From Figure 5(a), it is observed that the aggregation time period has a limited effect on the RMSE of speed. Further, the RMSE of speed is highest at low penetration rates and decreases at higher penetration rates. However, this holds only when it can be assumed that the connected vehicles have the same speed distribution as the nonconnected vehicles. If the speed distribution of the connected vehicles deviates from that of the nonconnected vehicles, the proposed method for estimating the speed will be biased towards the speed of the connected vehicles. The speed estimate will then be too high or too low depending on whether the connected vehicles drive faster or slower than the nonconnected vehicles.
Further, it is observed that the aggregation time period has a great effect on the density estimates. As can be seen in Figure 4, a low penetration rate of connected vehicles results in a large percentage of missing measurements. As a result, no, or only uncertain, traffic state estimates are available. This explains the increase in RMSE with increasing penetration rate observed in Figure 5(b) at low penetration rates for the aggregation time periods of 15-60 seconds. Since missing estimates are excluded from the RMSE, an increase in the penetration rate makes more observations available, and these additional but still uncertain estimates contribute to a larger RMSE. However, at penetration rates above 5-20%, depending on the aggregation time period, enough measurements are included in the estimation to give a reliable result, and the RMSE starts to decrease with increasing penetration rate of connected vehicles. When the penetration rate is 100%, the RMSE of both speed and density is, as expected, zero, since measurement errors are not included.
In conclusion, the RMSE of speed and density is smallest for the aggregation time period of 120 seconds. However, with such a long aggregation time period, the estimates become smoothed over a longer period. As a result, important phenomena in the traffic conditions might be detected late or not at all. In order to reduce the uncertainty in the estimates and at the same time be able to capture changes in the traffic conditions, an aggregation time period of 60 seconds is chosen for further investigations.

Performance of CCV.
The density estimates of CCV are compared to the density estimates of GAP, CC, and SD in order to investigate how the proposed method performs compared to existing methods. The mean RMSE of density for the four methods at different penetration rates is presented in Figure 6. The aggregation time period is set to 60 seconds based on the results presented in Section 5.1.
As concluded in Section 5.1, the CCV method does not show a clear relationship between penetration rate and performance for penetration rates below 10%, as a result of the missing or limited measurements to base the estimates on. Hence, for connected vehicle penetration rates of 1% and 5%, the RMSE of density for CCV is not trustworthy. However, since CCV performs poorly at low penetration rates, as shown in Figure 6, it can be concluded that the penetration rate must be higher than 10% to give reliable estimates.
The density estimates based on GAP, CC, and CCV are, as expected, improved with an increased penetration rate. CCV gives density estimates comparable to GAP at a penetration rate of 20%. Only at a penetration rate of 40-50% does the accuracy of the CCV estimates become comparable to that of CC and SD; above a penetration rate of 40%, however, the density estimates improve rapidly with increasing penetration rate. When approaching a penetration rate of 100%, the estimated density is close to the reference density, unlike for the other methods, where a larger difference still exists. The reason is that, for CCV, a penetration rate of 100% means that all vehicles located within the segment are identified, which is expected to give the same result as the reference density over the same aggregation time period.
The other combined method, CC, has a larger RMSE of density at higher penetration rates. One reason for this might be the assumption that no vehicles overtake the connected vehicle. Under this assumption, when the connected vehicle arrives at the downstream detector, all vehicles that arrived at the upstream detector after the connected vehicle are assumed to still be within the segment. However, since the segment consists of two lanes and the vehicles in the simulation have different desired speeds, overtaking will occur, especially for connected vehicles with a low desired speed.
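The effect of the no-overtaking assumption can be illustrated with a small sketch. It is a simplified reading of the assumption as described above, not an implementation of the CC method itself; the function names and all passage times are hypothetical.

```python
def cc_vehicle_count(upstream_times, cv_upstream_time, cv_downstream_time):
    """Vehicles that passed the upstream detector after the connected
    vehicle and no later than its arrival downstream. Under the
    no-overtaking assumption, all of them are taken to be in the
    segment."""
    return sum(
        1 for t in upstream_times
        if cv_upstream_time < t <= cv_downstream_time
    )


def true_vehicle_count(passages, at_time):
    """Vehicles actually inside the segment at `at_time`, given
    (upstream_time, downstream_time) pairs for each vehicle."""
    return sum(1 for up, down in passages if up <= at_time < down)


# Hypothetical slow connected vehicle: enters at 0 s, exits at 40 s.
# The first of the other vehicles overtakes it and exits at 30 s.
others = [(5.0, 30.0), (10.0, 45.0), (20.0, 50.0), (35.0, 60.0)]
assumed = cc_vehicle_count([up for up, _ in others], 0.0, 40.0)
actual = true_vehicle_count(others, 40.0)
print(assumed, actual)
```

In this example, the overtaking vehicle has already left the segment but is still counted, so the assumed count exceeds the actual count.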
When only connected vehicles are used, as in GAP, the local nature of the method does not seem to capture the traffic conditions over a larger area during inhomogeneous traffic conditions, where the speed distribution of the vehicles becomes wider. When only the gap to the vehicle in front is considered, the formation of platoons, with small gaps between the vehicles within a platoon and larger gaps between platoons, decreases the performance and introduces a bias towards the gaps within the platoons. This is especially observed at medium flow levels, where the left lane is mostly used for overtaking. In this case, a leader is sometimes missing as a result of the limited measurement range, or the time-space distance behind the platoon is unknown in the overtaking lane, where the traffic flow is more inhomogeneous. This is why the RMSE for the GAP method does not decrease as much as for the other methods at higher penetration rates.
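The platoon bias can be illustrated with a rough sketch. It assumes that density is taken as the inverse of the mean observed spacing, which is a simplification chosen for illustration and not necessarily the estimator used in GAP. A spacing of None represents a vehicle with no detected leader; all numbers are hypothetical.

```python
def density_from_spacings(spacings_m):
    """Density in veh/km as the inverse of the mean observed spacing
    (gap plus vehicle length, in meters). Entries that are None (no
    leader detected) are ignored."""
    observed = [s for s in spacings_m if s is not None]
    if not observed:
        return None
    return 1000.0 * len(observed) / sum(observed)


# Hypothetical lane with two platoons of three vehicles on 1 km:
# 20 m spacing within a platoon and 460 m in front of each leader.
true_spacings = [460.0, 20.0, 20.0, 460.0, 20.0, 20.0]
# The platoon leaders do not detect any vehicle in front.
measured_spacings = [None, 20.0, 20.0, None, 20.0, 20.0]

print(density_from_spacings(true_spacings))
print(density_from_spacings(measured_spacings))
```

When the large spacings in front of the platoon leaders are missing, only the short spacings within the platoons remain, and the density is overestimated.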
Finally, it is concluded that the density estimates using SD are, as expected, independent of the penetration rate. The difference between the true density and the density estimates for SD is due to the fact that the density estimates for SD are based on measurements at specific locations and not on measurements for the whole segment.
In conclusion, SD is stable and gives reliable estimates of the density independent of the connected vehicle penetration rate and the flow level. Of the combined methods, CC is preferable at lower penetration rates, while CCV is more accurate at higher penetration rates. The results from GAP show that using only local measurements from connected vehicles gives a lower accuracy in the density estimates under inhomogeneous traffic conditions, as a result of the inability to measure the gap behind vehicles and the limited measurement range. Since the accuracy of the density estimates is higher for the combined methods and the detector-based method, it is preferable to include stationary detectors when estimating the density.

Evaluation of Effects of Different Distances between Detectors.
One of the advantages of CCV is that it can capture changes in the traffic conditions in between detectors, meaning that even though the detectors are sparsely placed, smaller segments in between the detectors can be considered for estimating the density. However, this is based on the assumption that the connected vehicle penetration rate is approximately constant in between detectors. For this reason, CCV is compared to the other combined method, CC, to investigate how the ability to detect changes in the traffic conditions, in this case modeled as an incident, is affected by sparsely located detectors for the two methods. Since CC has to include the whole stretch between two detectors as one segment, CCV is expected to perform better.
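The idea of applying one detector-based penetration rate to several shorter segments can be sketched as follows. The sketch assumes that the density estimate has the form of the number of connected vehicles in a segment, scaled by the estimated penetration rate and divided by the segment length; the exact CCV equations are defined in the method description and are not reproduced here. The function names and numbers are hypothetical.

```python
def penetration_rate(n_connected_passed, n_total_passed):
    """Share of connected vehicles among all vehicles counted at a
    detector, or None if no vehicle passed."""
    if n_total_passed == 0:
        return None
    return n_connected_passed / n_total_passed


def scaled_density(n_connected_in_segment, rate, segment_length_km):
    """Density in veh/km from the connected vehicle count in a segment,
    scaled by the penetration rate (assumed form of the estimate)."""
    if not rate:
        return None
    return n_connected_in_segment / rate / segment_length_km


# Hypothetical detector counts: 80 connected out of 200 vehicles.
rate = penetration_rate(80, 200)
# Hypothetical connected vehicle counts in five 500 m segments
# located between two detectors placed 2500 m apart.
connected_counts = [4, 4, 6, 12, 4]
print([scaled_density(n, rate, 0.5) for n in connected_counts])
```

Under the constant penetration rate assumption, the denser fourth segment is visible in the estimates even though no detector is located there.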
Two time-space diagrams of density are presented in Figure 7. The distances between detectors are 500 and 2500 meters, respectively. An aggregation time period of 60 seconds is used based on the investigation in Section 5.1. A penetration rate of 40% is used, since CCV and CC are comparable in accuracy at this level. The presented results are means over 10 simulation runs. As can be seen in Figure 7(a), both CCV and CC manage to capture the tail of the incident with a distance of 500 meters between the detectors. However, both methods seem to overestimate the density at the incident and underestimate the density in the tail of the incident. An explanation for the overestimation by CC is that vehicles might overtake the connected vehicle at an incident and pass to the next segment before the connected vehicle has exited the considered lane. This results in a too large density estimate. Further, the connected vehicle data is sometimes based on retrospective information and on information smoothed out over a longer time period than the aggregation time period, which reduces the performance of CC, especially during congested conditions such as an incident. One reason for the deviation in the estimation for CCV might be an uncertain estimate of the penetration rate caused by the low speed levels at the detectors. This is further investigated below.
With more sparsely placed detectors, as in Figure 7(b), the same behaviour is seen for CCV, since the segment length is still 500 meters and the penetration rate in this example is approximately the same over the whole stretch. However, for CC, the distance between detectors, and thereby the distance over which the density is estimated, is considerably increased, which results in decreased accuracy of the density estimates. A further consequence of the more sparsely located detectors is an even longer time delay in the estimates. As a result, when using the CC method, the incident is detected late and appears as a smoothed density estimate over the five segments without detectors.
It is concluded that CCV seems to capture changes in the traffic conditions, such as an incident. One benefit of CCV is that the estimates are based on the current situation on the road. This becomes important during incidents, where the longer travel times seem to cause delays in the density estimation for CC. A second benefit is that, under the condition that the connected vehicle penetration rate can be assumed to be constant over a longer road stretch, as is the case in this study, CCV manages to capture changes in the traffic conditions also for longer distances between detectors. This is seen when the distance is increased from 500 meters to 2500 meters: the results from CC are affected by an increased delay as well as smoothing, so that the incident is detected late and as a smoothed density over five segments. However, it should be noted that CC greatly limits the dependence on communication equipment, since reporting is only necessary at specific locations. Hence, when communication of connected vehicle data can only be done at specific points in time, or at specific locations, combined density estimation methods that do not depend on continuous transmission of connected vehicle data are preferable over CCV.

The Effect of Penetration Rates at Incidents for the CCV.
From Section 5.3, it is concluded that CCV overestimates the density at incidents, especially in the case of densely placed detectors. One reason for this could be an incorrect estimate of the penetration rate close to the incident. Therefore, a reference penetration rate, obtained by counting the number of vehicles and the number of connected vehicles on each segment in the simulation, is compared to the estimates from CCV for the simulated incident scenario. Figure 8 gives an overview of the estimated and the reference penetration rate over time at the segments close to the incident.
From the figure, it becomes clear that the reference penetration rate and the estimated penetration rate deviate more during the incident and for the affected segments. This confirms that the estimate of the penetration rate is the reason for the uncertain estimates. It seems that, in order for the model to perform well at low-to-medium penetration levels, the traffic condition within a segment has to be homogeneous. Hence, if the segment can be subdivided into one congested and one uncongested part, the penetration rate might differ between those parts, resulting in an inaccurate estimate of the penetration rate and, as a result, an inaccurate estimate of the density. Further, the estimates of the penetration rate at detector intervals of 500 meters are more unstable close to the incident. This is probably due to the fact that the congestion is moving upstream, resulting in slow-moving vehicles and a more uncertain number of connected vehicles passing the detectors, which creates local variations in the spatial distribution of connected vehicles. This corresponds well with the larger error in the density estimates at the incident for more densely placed detectors, which can be seen in Figure 7(a). For that reason, a more sophisticated method for estimating the penetration rate would probably increase the performance of CCV even at low penetration rates. Dividing the road into smaller segments can also reduce the possibility of having different traffic conditions within one segment. In this case, the penetration rate still needs to be estimated individually for each segment to increase the performance, since the inaccurately estimated penetration rate is the main problem.
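The sensitivity of the density estimate to the penetration rate can be made explicit with a short sketch. It assumes that the vehicle count in a segment is estimated as the number of connected vehicles divided by the estimated penetration rate; under this assumption, the ratio between the estimated and the true density equals the ratio between the actual and the estimated penetration rate. The function name and the rates are hypothetical.

```python
def relative_density_error(actual_rate, estimated_rate):
    """Relative error of a density estimate obtained by scaling the
    connected vehicle count with `estimated_rate` when the share of
    connected vehicles in the segment is really `actual_rate`.

    With n vehicles in the segment, the connected count is
    actual_rate * n and the estimate is actual_rate * n / estimated_rate,
    so the relative error is actual_rate / estimated_rate - 1.
    """
    return actual_rate / estimated_rate - 1.0


# Hypothetical case: the detectors indicate 40% connected vehicles, but
# the actual share in the congested segment is 30% or 50%.
print(relative_density_error(0.3, 0.4))
print(relative_density_error(0.5, 0.4))
```

In this example, a deviation of 10 percentage points in the penetration rate at the 40% level gives a density error of about 25% in either direction.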

Conclusions
We propose the Count Connected Vehicle (CCV) method for estimating the traffic state using connected vehicles in combination with stationary detectors. The method provides a straightforward approach to estimating the speed and density based on vehicle-to-infrastructure communication. The purpose of the method is to obtain fast and accurate traffic state estimates that can be used to detect changes in the traffic conditions and, at the same time, to limit the dependence on detailed measurements communicated from the connected vehicles while using as few stationary detectors as possible. The only measurements required from the stationary detectors are the number of passing vehicles and the number of passing connected vehicles. Hence, the complexity of the stationary detectors is limited. Moreover, if the penetration rate is assumed to be approximately the same over a longer road stretch, the stationary detectors can be sparsely placed. The limited dependence on measurements from the connected vehicles is due to the fact that the required precision of the position data can be low. The only requirement is that the current segment be correctly reported. Hence, the measurement errors when estimating the number of connected vehicles located on a segment are limited to the boundaries of each segment.
The method is evaluated by means of microscopic traffic simulation. The speed and density estimates of the proposed method are compared to the true simulated values. Further, the density estimates are compared to the density estimates from one detector-based method, one combined method, and one connected-vehicle-based method. The results of the study show that the proposed method is a promising alternative for accurately estimating density and speed on the road, especially at medium-to-high penetration rates of connected vehicles. Since the method is based on real-time positioning data from connected vehicles, it can capture abrupt changes in the traffic conditions, such as incidents. The traffic state can be estimated instantaneously, given that there are at least a few connected vehicles on the segment. Hence, the aggregation time period can be short. This makes the method useful for traffic management purposes. Note that, for short aggregation time periods, the penetration rate can be based on previous knowledge, for example a moving average, to increase the accuracy of the estimated connected vehicle penetration rate.
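A moving average of the penetration rate, as mentioned above, could be sketched as follows. The sketch pools the detector counts over the most recent aggregation time periods instead of averaging the individual ratios, which is one possible design choice and an assumption of this example; the class name, the window length, and the counts are hypothetical.

```python
from collections import deque


class MovingPenetrationRate:
    """Penetration rate based on the detector counts of the most recent
    `window` aggregation time periods."""

    def __init__(self, window):
        self._counts = deque(maxlen=window)

    def update(self, n_connected, n_total):
        """Add the counts of the latest period and return the pooled
        penetration rate, or None if no vehicle has been counted."""
        self._counts.append((n_connected, n_total))
        connected = sum(c for c, _ in self._counts)
        total = sum(t for _, t in self._counts)
        return connected / total if total else None


# Hypothetical detector counts (connected, total) per period.
estimator = MovingPenetrationRate(window=3)
for n_connected, n_total in [(1, 5), (0, 3), (3, 4), (2, 8)]:
    print(estimator.update(n_connected, n_total))
```

Pooling the counts gives periods with many passing vehicles more weight than periods with few, which limits the influence of periods in which only a handful of vehicles are observed.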
Several topics are identified for future research. First, a low connected vehicle penetration rate and a short aggregation time period are concluded to result in few or sometimes missing density estimates. Hence, by including data assimilation techniques, such as the ones described by Evensen [32], Wang and Papageorgiou [6], Antoniou et al. [33], and Seo et al. [13], the performance of the proposed method can be improved. Second, the penetration rate estimate becomes inaccurate during inhomogeneous traffic conditions within a segment. By using a more advanced method to estimate the penetration rate, the accuracy of the method can be improved. For example, the dynamics of the penetration rate can be modeled using a macroscopic model as in Astarita et al. [14] and Bekiaris-Liberis et al. [15]. Third, the effects on the traffic state estimation of more complex road network designs and of the inclusion of measurement errors have to be further investigated. Finally, the use of the proposed method in traffic management systems is also a topic for future research.
Figure 1: The communication flow between the local units (connected vehicles and detectors) and the central unit for the aggregation time period T. Diagram labels: the local units (raw data collection) provide (i) the average speed of connected vehicles at segment k and (ii) the number of connected vehicles at segment k; the resulting average density and speed for segment k are (i) the estimated speed and (ii) the estimated density at segment k for aggregation time period T.

Figure 2: Inflow profile for the simulated scenario. The inflow profile corresponds to the peak hours on a two-lane urban motorway in Stockholm.

Figure 3: The mean speed for the given detector measurements (a), the uncalibrated scenario (b), and the calibrated scenario (c). The axes show the detectors and the time instants (15 min intervals). The mean speed ranges from 0 km/h (red) to 110 km/h (blue).

Figure 4: Mean percentage of missing estimates per segment for CCV at different penetration rates. The error bars show 95% confidence intervals, assuming that the number of missing estimates is normally distributed over the simulation runs.

Figure 5: Mean RMSE for the estimated speed (a) and density (b) compared to the reference density and speed for CCV at different penetration rates and for different aggregation time periods. The error bars in (b) are based on 95% confidence intervals, assuming normally distributed RMSE over the simulation runs. The standard error of the mean in (a) is at most 0.44 km/h, and hence the confidence intervals are excluded from the figure for greater legibility.

Figure 7: Reference density and density estimates using CC and CCV in time (min) and space (km) for a simulated incident scenario. The aggregation time period is 60 seconds and the detector interval is 500 meters and 2500 meters in (a) and (b), respectively. The color map shows density (veh/km).

Figure 8: The reference (black) and the estimated penetration rate for segments 6-8 with a detector distance of 500 meters (red) and 2500 meters (blue).

Table 1: Required measurements for estimating density based on CCV method, SD method, CC method, and GAP method.

Table 2: Free flow speed distribution for the different vehicle classes.

Table 3: Vehicle parameters based on the calibrated scenario.