Road Traffic Monitoring System Based on Mobile Devices and Bluetooth Low Energy Beacons

The paper proposes a method that uses mobile devices (smartphones) and Bluetooth beacons to detect passing vehicles and recognize their classes. The traffic monitoring tasks are performed by analyzing the strength of the radio signal received by mobile devices from beacons that are placed on opposite sides of a road. This approach is suitable for crowd sourcing applications aimed at reducing travel time, congestion, and emissions. Advantages of the introduced method were demonstrated during experimental evaluation in real-traffic conditions. Results of the experimental evaluation confirm that the proposed solution is effective in detecting three classes of vehicles (personal cars, semitrucks, and trucks). Extensive experiments were conducted to test different classification approaches and data aggregation methods. In comparison with state-of-the-art RSSI-based vehicle detection methods, higher accuracy was achieved by introducing a dedicated ensemble of random forest classifiers with majority voting.


Introduction
Road traffic is a complex phenomenon, in which various entities (pedestrians, cars, trucks, buses, trams, bicycles, etc.) interact with one another when using common infrastructure. Traffic management and control, due to infrastructure constraints and the rising number of vehicles, is a complex task that requires dedicated algorithms together with precise traffic data (both historical and current) [1]. Information about the number of vehicles and their types is helpful in reducing travel times and emissions [2]. Precise traffic data allow us not only to increase the effectiveness of traffic control, but also to adapt management policy to changing conditions and predict infrastructure bottlenecks [3].
The precise traffic data can be provided by traffic monitoring systems that are usually integrated with road infrastructure. Such systems allow detecting and classifying the vehicles in selected areas by using data from sensors (inductive loops, video-detectors, magnetometers, etc.) [4]. A major drawback of the solutions integrated with infrastructure is a low flexibility and significant maintenance cost. To overcome these drawbacks, applications of new technologies (e.g., wireless sensor networks) in traffic monitoring are considered [5]. Such solutions can facilitate installation and reconfiguration of the system. However, the cost is still significant.
Thus, this paper proposes an alternative method, which is inspired by crowd sourcing approaches and uses the iBeacon technique for vehicle detection and classification. Crowd sourcing [6] is a distributed model, in which a crowd solves or helps to solve a complex problem. Crowd sourcing utilizes a mobile workforce and the unique features found in smartphones. Smartphones offer a great platform for extending existing applications due to their multisensing capabilities: geolocation, audio, and visual sensors. They can be used to provide precise data about current traffic at a given location. In contrast to the approximation models proposed in [7], where the mobile device is situated inside a vehicle, this paper proposes a new system with mobile devices (smartphones) and beacons situated by the road. In order to detect vehicles, the proposed system measures the signal strength of frames received from Bluetooth beacons.

Related Works and Contribution
Smartphones have become a round-the-clock interface between the user and the environment, which integrates the Internet (via WiFi, 2G/3G/4G/5G) with local-area networks (e.g., Bluetooth, new generation NFC, or Portable WiFi, which allows the smartphone to act as a router and share the cellular connection with nearby devices) [6]. It is worth noting that each of these communication standards is characterized by different energy consumption and data transfer parameters [8]. Smartphones possess powerful computational capabilities and are equipped with various built-in sensors [9] that have enabled the development of mobile sensing technologies [10][11][12][13][14]. Among them, crowd sensing [12] plays an important role due to the possibility of collecting useful data. The crowd sensing approach utilizes large numbers of participants to monitor the surrounding environment by means of various sensors: accelerometer, gyroscope, compass, microphone, camera, GPS, and wireless network interfaces.
The mobile sensing technologies were used for the development of noise detection [14], social behavior monitoring [15], health monitoring of disabled patients [16], and indoor localization [17,18]. Another example is an accurate and energy-efficient smartphone-based traffic lane detection system for vehicles, which can detect different lane-level landmarks with accuracy above 90% [19]. Several solutions were also proposed for road traffic monitoring [20,21]. These solutions provide the GPS localization data for vehicle tracking. They require the mobile device to be present in vehicle; thus not all vehicles can be tracked in this way.
In this paper a method is proposed, which allows the smartphones placed in road surrounding (e.g., on sidewalks in pedestrians' pockets) to be used for traffic monitoring. According to the introduced method, vehicle detection and classification is performed by analyzing strength of radio signal received from Bluetooth beacons.
To date, several efforts have been made to explore the possibility of vehicle detection and localization via channel state information (CSI) [22], received signal strength indicator (RSSI) [23,24], link quality indicator (LQI), and packet loss rate [25].
A method, which uses wireless transmission to detect road traffic congestion, was proposed in [25]. This method requires a pair of wireless transmitter and receiver. The transmitter continuously sends packets. The receiver, which is placed on opposite side of a road, evaluates RSSI, LQI, and packet loss metrics. It was shown that these metrics enable recognition between free-flow and congested traffic states with high accuracy. The method was implemented and tested with use of ZigBee motes.
A similar ZigBee network was adapted in [24] for vehicle detection. The experimental results presented in that work confirm that a vehicle passing between the network nodes causes a drop of the RSSI value. It was also observed that the gradient of the RSSI drop depends on the vehicle speed.
In [26] a method was introduced for vehicle detection and speed estimation, which is based on RSSI analysis in a network composed of two WiFi access points and two WiFi-equipped laptops. Mean value and variance of RSSI measurements were used to discriminate between three states: empty road, stopped vehicle, and moving vehicle. The experimental results reported in [26] show that the variance of RSSI decreases with increasing vehicle speed. This dependency was used for speed estimation.
Another WiFi-based traffic monitoring system was presented in [22]. This system utilizes single access point and one laptop to provide functionalities of vehicle detection, classification, lane identification, and speed estimation. According to that approach, CSI patterns in WiFi network are captured and analyzed to perform the traffic monitoring tasks. The CSI characterizes signal strengths and phases of separate WiFi subcarriers.
In [23] a radio-based approach for vehicle detection and classification was introduced, which combines ray tracing simulations, machine learning, and RSSI measurements. The authors have suggested that different types of vehicles have specific RSSI fingerprints. This fact was used to perform a machine-based vehicle classification. The RSSI values were analyzed in a wireless network of three transmitting and three receiving units, which were positioned on opposite sides of a road. The six wireless units were mounted on delineator posts and equipped with directional antennas. It was demonstrated that such system is able to detect vehicles and categorize them into two classes (passenger car and truck). It was also demonstrated that traffic lanes in a two-lane road have different distributions of CSI data. This fact was utilized to identify in which lane a vehicle is detected.
The wireless networks have been also used for detection of parked vehicles. In order to detect the parked vehicles, the transmitting nodes are placed on parking space and the receiving nodes are installed at a high location. When a vehicle is parked over the transmitting node, a decrease of the RSSI value is registered. Thus, the vehicles can be easily detected based on simple RSSI analysis. Different systems of this type were implemented with use of CC1101 wireless communication modules [27] and XBee motes [28].
The above-discussed methods from the literature are not suitable for crowd sourcing applications, as they require energy-expensive data transfers (WiFi) or specialized hardware (ZigBee modules, directional antennas). The new approach proposed in this paper utilizes Bluetooth low energy (BLE) communication, which is commonly available in smartphones. According to the introduced approach, BLE beacons use the iBeacon protocol [29] to broadcast data frames. The beacon frames are registered by smartphones that collect the RSSI measurements, aggregate them, and send them to a server for further analysis. It should be noted here that the BLE beacons are cheap battery-powered devices that can work for a long time (years) without battery replacement or charging. Moreover, the use of BLE communication significantly extends the lifetime of the smartphone battery in comparison to WiFi transmission [30]. Nevertheless, beacon discovery has a significant impact on smartphone battery usage; thus the discovery time interval should be planned carefully. The application of BLE communication for RSSI-based vehicle detection and classification has not been considered previously by other authors. This study involves detailed verification of the above-mentioned solution in real-traffic conditions.
Another important drawback of the existing methods lies in limited accuracy of vehicle detection and classification. To overcome this drawback, a new ensemble of classifiers was designed in this study, which accurately detects vehicles and recognizes three vehicle classes based on RSSI data collected from multiple smartphones.
The existing methods utilize single classifiers to detect a vehicle and recognize its class. In the related works, RSSI-based vehicle classification was implemented with use of various classification methods: artificial neural networks [22], k-Nearest Neighbor (k-NN), support vector machine (SVM) [23], decision trees [31], and logistic regression [32]. An SVM method was adopted in [23] to train vehicle classification models and categorize vehicles into two classes (passenger car and truck). The state-of-the-art algorithms are trained using raw data [23] or a set of predefined features [31,32]. To the best of the authors' knowledge, classifier ensembles have not been previously adapted to deal with the RSSI-based road traffic monitoring tasks.
In the machine learning literature, various ensemble methods are presented, which combine several classifier systems that use different models or datasets [33]. Several bootstrapping methods were considered (bagging or boosting), which allow us to optimize classifier ensembles [34] or merge classifier decisions [35]. Research in [36,37] shows that a combined classifier can outperform the best individual classifier under some conditions (e.g., majority voting by a group of independent classifiers). Those works have motivated the approach described in this paper, which involves design and verification of classifier ensembles for traffic monitoring with use of the RSSI data. In comparison with the state-of-the-art methods that are based on single classifiers, the proposed approach enabled more accurate vehicle detection and classification.

Proposed Method
The proposed vehicle detection and classification system utilizes RSSI data collected by mobile devices (e.g., smartphones) in a predetermined region on the side of the road. Mobile devices measure signal strength when receiving radio frames from BLE beacons across the street. The RSSI values together with information about position of the device are transmitted to a server, which performs data aggregation and classification.
Structure of the proposed traffic monitoring system is presented in Figure 1. It should be noted that the introduced system structure, which includes BLE beacons and mobile devices, has not been considered in the literature. The BLE beacons are installed at different heights because such arrangement is suitable for vehicle classification, i.e., recognition of personal cars, semitrucks, and trucks [32]. Beacons use the iBeacon protocol [29] to broadcast frames. The mobile devices on the opposite side of the road use BLE communication to collect incoming beacon frames and evaluate their RSSI. Position of the device can be determined based on both the RSSI information and the GPS signal. The collected data are transmitted to a server via cellular network or WiFi communication.
According to the iBeacon protocol, three fields in the broadcasted frames are available that identify the sending beacon: UUID (universally unique identifier), Major, and Minor value. The UUID contains 32 hexadecimal digits, split into 5 groups, separated by hyphens. The iBeacon standard also requires the Major and Minor values to be assigned. Those two values help to identify beacons with greater accuracy than using the UUID alone. The Minor and Major values are unsigned integers between 0 and 65535. The purpose of the UUID is to distinguish beacons in a given network from beacons in other networks. For instance, the same UUID can be used for all beacons in a traffic monitoring system, which covers many detection areas. Major values are intended to identify a group of beacons; e.g., all beacons in a certain detection area can be assigned a unique Major value. Finally, Minor values are intended to distinguish an individual beacon. The Minor value can be used for distinguishing individual beacons installed at different heights within a detection area. In this paper, the 3-tuple of UUID, Major, and Minor fields is referred to as the beacon ID.
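The beacon ID described above can be represented as a small immutable record. The following is only a minimal sketch: the class name `BeaconID`, the field names, and the example UUID are illustrative assumptions and are not taken from the implementation used in this study.

```python
from dataclasses import dataclass
from uuid import UUID


@dataclass(frozen=True)
class BeaconID:
    """3-tuple (UUID, Major, Minor) identifying one beacon (illustrative)."""
    uuid: UUID   # shared by all beacons of one monitoring system
    major: int   # identifies a group of beacons, e.g. a detection area
    minor: int   # identifies an individual beacon within the group

    def __post_init__(self):
        # Major and Minor are unsigned integers between 0 and 65535.
        for name in ("major", "minor"):
            value = getattr(self, name)
            if not 0 <= value <= 65535:
                raise ValueError(f"{name} must be in 0..65535, got {value}")


# Hypothetical UUID, chosen only for illustration.
SYSTEM_UUID = UUID("12345678-1234-5678-1234-567812345678")

# Two beacons of the same detection area installed at different heights.
low_beacon = BeaconID(SYSTEM_UUID, major=1, minor=1)
high_beacon = BeaconID(SYSTEM_UUID, major=1, minor=3)
```

Because the record is frozen, it is hashable and can serve directly as a dictionary key when RSSI readings are grouped by sender.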
In this study new algorithms (Algorithms 1-5) were designed and implemented to enable accurate vehicle detection and classification with use of BLE beacons and mobile devices. Details of the operations performed by the mobile device are presented in Algorithm 1. The received beacon frames are ignored if the RSSI is below a predetermined threshold. Otherwise, a new data record is created and written to a buffer. The data record contains information about frame reception time, device position, ID of the frame sender (beacon), and RSSI value. The content of the buffer is periodically sent to the server. The frequency of these data transfers is controlled by a dedicated parameter of the algorithm. It should be noted that the collection of beacon frames and the data transfer to the server can be performed in parallel if an appropriate hardware solution is available.
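The operations of the mobile device described above can be sketched as follows. This is not a reproduction of Algorithm 1: the names `Record`, `DeviceBuffer`, `on_frame`, and `flush_if_due`, as well as the threshold of -90 dBm and the sending interval of 1 second, are assumptions introduced only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    time: float        # frame reception time [s]
    position: tuple    # device position, e.g. (x, y)
    beacon_id: tuple   # (UUID, Major, Minor) of the sender
    rssi: int          # received signal strength [dBm]


@dataclass
class DeviceBuffer:
    rssi_threshold: int = -90    # assumed value, not given in the paper
    send_interval: float = 1.0   # assumed value [s], not given in the paper
    records: list = field(default_factory=list)
    last_sent: float = 0.0

    def on_frame(self, time, position, beacon_id, rssi):
        """Buffer a new record unless the RSSI is below the threshold."""
        if rssi < self.rssi_threshold:
            return False
        self.records.append(Record(time, position, beacon_id, rssi))
        return True

    def flush_if_due(self, now, send):
        """Pass the buffer content to `send` once per sending interval."""
        if now - self.last_sent < self.send_interval or not self.records:
            return 0
        batch, self.records = self.records, []
        send(batch)
        self.last_sent = now
        return len(batch)


buffer = DeviceBuffer()
sent_batches = []   # stands in for the server connection
buffer.on_frame(0.1, (0.0, 0.0), ("uuid", 1, 1), -95)   # ignored: weak signal
buffer.on_frame(0.2, (0.0, 0.0), ("uuid", 1, 1), -70)   # stored in the buffer
buffer.flush_if_due(1.2, sent_batches.append)
```

In this sketch the transfer is triggered from the same thread; the parallel variant mentioned above would move `flush_if_due` to a separate worker.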
The objective of server operations (Algorithm 2) is to recognize event type based on the data records delivered from mobile devices. The event type determines if the monitored road section was empty or a car was present in this section during transmission of beacon frames. Additionally, the type of the event indicates class of detected vehicle (personal car, semitruck, or truck). According to the proposed method, the type of the event is recognized using a classifier ensemble (Algorithm 4).
Before execution of the classification procedure, the input data are aggregated. The proposed aggregation procedure is based on the so-called sliding window concept [38] (Algorithm 3). This means that when a new data record is received, which contains an RSSI value for time t, the aggregation operation refers to the collection of data records whose frame reception time falls within the time window of a predetermined size that ends at t. This collection of data records is used to calculate aggregates (statistics) of RSSI values, i.e., minimum, maximum, average, and standard deviation. Separate aggregates are determined for each pair of the transmitter (beacon) position and the receiver (mobile device) position. The positions of beacons do not change; thus they are identified by the beacon ID. In contrast, the current position of a mobile device is assigned to the nearest reference position.
Details of the proposed data aggregation procedure are presented by the pseudocode in Algorithm 3. For the sake of simplicity, it was assumed in this pseudocode that only two statistics are to be calculated (maximum and minimum). In practical applications the number of statistics has to be larger, as discussed in Section 4. The symbols min refPos bID and max refPos bID in Algorithm 3 denote the minimum and maximum RSSI value determined for frames sent from beacon bID and received by a mobile device close to reference position refPos within the time window ending at t. The statement that a mobile device is close to a reference position means that its distance to the reference position is below d max. It should be noted that d max is set to be lower than half of the minimum distance between reference positions; thus each mobile device is assigned to a single reference position. The number of reference positions and the number of beacons in Algorithm 3 are denoted by m and n, respectively.
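The sliding-window aggregation described above can be illustrated with a short sketch. It follows the simplified case with two statistics (minimum and maximum); the function names, the tuple layout of the records, the parameter name `window`, and the sample values are assumptions made for this example, not the pseudocode of Algorithm 3.

```python
import math
from collections import defaultdict


def nearest_reference(position, ref_positions, d_max):
    """Return the 1-based index of the reference position closer than d_max.

    Returns None if the device is not close to any reference position.
    """
    best, best_dist = None, d_max
    for idx, ref in enumerate(ref_positions, start=1):
        dist = math.dist(position, ref)
        if dist < best_dist:
            best, best_dist = idx, dist
    return best


def aggregate(records, t, window, ref_positions, d_max):
    """Compute (min, max) RSSI per (refPos, bID) for the window ending at t."""
    groups = defaultdict(list)
    for time, position, beacon_id, rssi in records:
        if not (t - window <= time <= t):
            continue   # record lies outside the sliding window
        ref_pos = nearest_reference(position, ref_positions, d_max)
        if ref_pos is not None:
            groups[(ref_pos, beacon_id)].append(rssi)
    return {key: (min(values), max(values)) for key, values in groups.items()}


# Synthetic data: two reference positions 4 m apart, one beacon "b1".
ref_positions = [(0.0, 0.0), (4.0, 0.0)]
records = [
    (1.0, (0.2, 0.0), "b1", -60),
    (2.5, (0.1, 0.0), "b1", -75),
    (2.8, (3.9, 0.0), "b1", -80),
    (2.9, (2.0, 0.0), "b1", -50),    # not close to any reference position
    (-1.0, (0.0, 0.0), "b1", -40),   # older than the window
]
aggregates = aggregate(records, t=3.0, window=3.0,
                       ref_positions=ref_positions, d_max=1.0)
```

Choosing `d_max` below half of the spacing between reference positions, as required above, guarantees that at most one reference position can match a given device.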
As already mentioned in this section, the type of the event (which relates to vehicle presence and class) is recognized based on the aggregated RSSI data, by using a classifier ensemble (Algorithm 4). The proposed ensemble consists of classifiers that are fed with various subsets of the aggregated data. A different set of the reference positions, for which the RSSI data are collected, is assigned to each classifier in the ensemble. Hereinafter, this set will be referred to as the classifier range. The reference positions are identified by natural numbers 1, ..., m. Thus, the classifier range can be defined by a pair [a, b], where 1 ≤ a ≤ m, and a ≤ b ≤ m. The range [a, b] means that the input dataset of the corresponding classifier includes the aggregates (e.g., min refPos bID and max refPos bID) that were determined for the reference positions refPos = a, ..., b. In case of range [1, m] the classifier utilizes the complete dataset. On the other hand, the classifier's input dataset includes the RSSI readings for only one reference position when a = b.
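The notion of the classifier range can be made concrete with a small sketch that selects the input data of one ensemble member. The function names and the synthetic aggregates are assumptions introduced for illustration only.

```python
def is_valid_range(a, b, m):
    """Check the constraint 1 <= a <= b <= m on a classifier range [a, b]."""
    return 1 <= a <= b <= m


def select_inputs(aggregates, classifier_range):
    """Keep only aggregates determined for reference positions a, ..., b."""
    a, b = classifier_range
    return {
        (ref_pos, beacon_id): stats
        for (ref_pos, beacon_id), stats in aggregates.items()
        if a <= ref_pos <= b
    }


# Synthetic aggregates: (refPos, bID) -> (min RSSI, max RSSI).
aggregates = {
    (1, "b1"): (-75, -60),
    (2, "b1"): (-80, -70),
    (3, "b1"): (-82, -71),
    (4, "b1"): (-90, -65),
}
m = 4
local_inputs = select_inputs(aggregates, (2, 3))    # two neighboring positions
single_inputs = select_inputs(aggregates, (4, 4))   # a = b: one position
global_inputs = select_inputs(aggregates, (1, m))   # complete dataset
```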
For each classifier in the ensemble a weight is determined, which corresponds to number of the classifier's votes. The total number of votes for a given event type is calculated by adding the weights of the classifiers that have recognized this particular event type. As a result, the event type, which receives the highest total number of votes, is selected. In case of a tie the class which has higher a priori probability is selected. Weights of the classifiers are adjusted during training procedure with use of the evolutionary strategy [39].
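The voting rule described above can be sketched in a few lines. The function name `weighted_vote` and the a priori probabilities below are hypothetical; the sketch also takes the weights as given and does not cover their adjustment by the evolutionary strategy.

```python
from collections import defaultdict


def weighted_vote(predictions, weights, priors):
    """Select the event type with the highest total number of votes.

    predictions: event type recognized by each classifier in the ensemble
    weights:     number of votes of each classifier
    priors:      a priori probability of each event type (breaks ties)
    """
    totals = defaultdict(float)
    for event_type, weight in zip(predictions, weights):
        totals[event_type] += weight
    top = max(totals.values())
    tied = [event_type for event_type, votes in totals.items() if votes == top]
    return max(tied, key=lambda event_type: priors.get(event_type, 0.0))


# Hypothetical a priori probabilities, used only to exercise the tie rule.
priors = {"empty": 0.6, "personal car": 0.3, "semitruck": 0.06, "truck": 0.04}

majority = weighted_vote(["personal car", "truck", "personal car"], [1, 1, 1], priors)
tie = weighted_vote(["personal car", "truck"], [2, 2], priors)
weighted = weighted_vote(["personal car", "truck", "truck"], [2, 1, 2], priors)
```

With equal weights the rule reduces to plain majority voting.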
In this study, application of various machine learning algorithms was considered for implementation of the proposed ensemble (support vector machines, random forest, probabilistic neural network, and k-nearest neighbors' algorithm) [31,40]. A separate training dataset, which includes classes (i.e., event types) determined by human observer, was used to train the classifiers.
After the events are recognized, an update of the vehicle classification and detection results is conducted in accordance with Algorithm 5. This update is necessary because the new results can be related to time moments for which some events have already been recognized. The new results are more credible as they take into account additional, recently collected data. Thus, the previous results are deleted. Finally, the table Events includes the information about event type for all time points covered by the available RSSI dataset. It should be also noted that in this study four event types are considered (empty road, presence of personal car, semitruck, and truck).
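The update step can be pictured as overwriting stored entries of the Events table with newer results. The following sketch assumes that the table is a mapping from time point to event type; this representation and the function name `update_events` are assumptions and do not reproduce Algorithm 5.

```python
def update_events(events, new_results):
    """Replace stored event types by newer results for the same time points."""
    updated = dict(events)
    updated.update(new_results)   # newer results overwrite the previous ones
    return updated


# Synthetic example: the result for time point 11 is revised.
events = {10: "empty", 11: "personal car"}
events = update_events(events, {11: "truck", 12: "empty"})
```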

Experimental Results
Usefulness of the proposed vehicle detection and classification method was verified during experiments in real-world traffic conditions. A schema of the test site, as well as distances between reference positions and beacons, is presented in Figure 2. Three BLE beacons were installed on the road side at heights of 50, 100, and 200 centimeters above the road surface. This configuration was selected on the basis of preliminary tests [32], as it provided the most promising results. On the opposite side of the road, four reference points were determined at equal distances of 4 meters. In this area the RSSI measurements were conducted using four Redmi 3S smartphones held at a height of about 1 meter near the reference positions. The data were collected in a period of two hours. During that period, more than 400 vehicles passed through the analyzed road section. A mobile application was developed to enable effective collection of the experimental data (Figure 3). Additional mobile devices were used by observers to record the events related to presence of vehicles in front of the reference locations, with recognition of three vehicle classes (personal car, semitruck, and truck). All the mobile devices were synchronized via the NTP protocol. Examples of collected records for two different reference positions are presented in Figure 4. The vertical red lines in Figure 4 show the time instances when passing vehicles were registered by the observers. The labels below the vertical lines denote the class of the vehicles. These results show that the vehicles cause visible changes of RSSI for both locations. Moreover, the signal noise increases with distance between beacons and mobile device (Figure 4(a)).
For the experimental purposes, the collected data were divided into training and test datasets. The experiments were conducted to evaluate the accuracy of automatic vehicle classification based on the collected data, with use of different machine learning algorithms, i.e., support vector machines (SVM), random forest (RF), probabilistic neural network (PNN), and k-nearest neighbors' algorithm (KNN).
The SVM algorithm [41] performs classification tasks by using hyperplanes defined in a multidimensional space. The hyperplanes that separate training data points with different class labels are constructed at the training phase. SVM employs an iterative training procedure to find the optimal hyperplanes having the largest distance to the nearest training data point of any class. The larger distance results in lower generalization error of the classifier.
In case of RF classifier [42], the training procedure creates a set of decision trees from randomly selected subset of training data. Each tree performs the classification independently and "votes" for the selected class. Finally, the votes from different decision trees are aggregated to decide the class of a test object. At this step, the RF algorithm chooses the class having the majority of votes from particular decision trees.
PNN [43] includes three layers of neurons (input layer, hidden layer, and output layer). The neurons in hidden layer determine similarity between test input vector and the training vectors. To evaluate this similarity each hidden neuron uses a Gaussian function, which is centered on a training vector. The hidden neurons are collected into groups: one group for each of the classes. There is also one neuron in the output layer for each class. The output neuron calculates class probability on the basis of values received from all hidden neurons in a given group. As a result, the posterior probability is evaluated for all considered classes. The final decision of the classifier is the class with maximum probability.
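The PNN decision rule described above can be sketched as follows. This is a minimal illustration, not the configuration used in the experiments: the function name `pnn_classify`, the smoothing parameter `sigma`, and the training vectors are assumptions, and summing the kernel values per class implicitly weights each class by its share of training vectors.

```python
import math


def pnn_classify(x, training, sigma=1.0):
    """Classify x with Gaussian kernels centered on the training vectors.

    training: list of (vector, class_label) pairs
    Returns (selected class, dict of posterior probabilities).
    """
    sums = {}
    for vector, label in training:
        # Hidden neuron: Gaussian similarity to one training vector.
        sq_dist = sum((a - b) ** 2 for a, b in zip(x, vector))
        kernel = math.exp(-sq_dist / (2.0 * sigma ** 2))
        # Output neuron: accumulate the values of its group of hidden neurons.
        sums[label] = sums.get(label, 0.0) + kernel
    total = sum(sums.values())
    posterior = {
        label: (s / total if total > 0 else 1.0 / len(sums))
        for label, s in sums.items()
    }
    return max(posterior, key=posterior.get), posterior


# Synthetic one-dimensional training set (e.g., a minimum RSSI value in dBm).
training = [
    ((-60.0,), "empty"), ((-62.0,), "empty"),
    ((-80.0,), "personal car"), ((-82.0,), "personal car"),
]
label, posterior = pnn_classify((-79.0,), training, sigma=2.0)
```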
The KNN algorithm [44] computes distances between the test data point and all training data points in feature space. Afterwards, the k training data points with the lowest distances are selected as the nearest neighbors. The test data point is assigned to the class which is most common among the k nearest neighbors.
During experiments the classification accuracy was compared for several RSSI-based traffic monitoring approaches, including the proposed solution and the state-of-the-art methods from the literature. This comparison takes into account the method with one receiver [25], solutions with multiple, spatially distributed receivers and a single classifier, which detects the vehicles based on a complete RSSI dataset [22,23], and the newly introduced algorithm with the ensemble of classifiers.
Initial experiments were conducted to calibrate parameters of the algorithms. In these experiments, vehicle classification was performed with use of 8 aggregates (minimum, maximum, difference between max. and min., mean, standard deviation, median, Pearson correlation coefficient, and number of received frames). The aggregates were calculated based on the RSSI data collected in four reference positions, in accordance with Algorithm 3.
Accuracy of the KNN algorithm was tested for parameter k (number of the nearest neighbors) in range between 1 and 20. Results of the tests are presented in Figure 5. Based on these results the value k = 7, which gave the highest classification accuracy, was selected for further experiments. Figure 6 shows the classification accuracy that was achieved by using the RF algorithm with different number of decision trees. It can be observed in these results that the accuracy does not change significantly for the number of decision trees above 5. However, the accuracy achieved for the tree number between 6 and 9 was slightly lower than for the RF with 10 trees. A little decrease of the accuracy was also observed for the tree number above 10. Therefore, during experiments described later in this section the number of decision trees was set to 10. It should be also noted that the complexity of the algorithm increases when using a larger set of the decision trees.
The impact of the window size on vehicle classification accuracy was also examined during the preliminary experiments. The window size was changed from 1 to 6 seconds with steps of 1 second. As shown in Figure 7 for RF and KNN algorithms, the best results were obtained when using the window size of 3 seconds. In case of larger windows the classification accuracy decreases because the data registered for multiple vehicles are aggregated in one window. Similar results were also observed for SVM and PNN algorithms. Thus, the 3-second window was used in further experiments.
At the next step, the most effective set of attributes was selected with use of the backward elimination method. Results of the elimination for the RF algorithm are presented in Figure 8. At the beginning, the classification accuracy was tested using the full dataset with 8 aggregates. The result of this test is shown by the leftmost bar in Figure 8. The next tests were performed for the 8 datasets that were created by removing particular aggregates (attributes). As shown in Figure 8, an improvement of the vehicle classification accuracy was achieved after deletion of the "difference" attribute (i.e., the difference between maximum and minimum). Thus, the reduced dataset includes 7 aggregates: minimum, maximum, mean, standard deviation, median, Pearson correlation coefficient, and number of received frames. Further elimination did not improve the results. It was verified that the deletion of the "difference" attribute is beneficial for all considered classification algorithms.
Table 1 shows the vehicle detection and classification accuracy obtained for the basic approach, which takes into account the signal strength measured by a single device [25] (in one reference position). These results were obtained after the above-discussed initial search of the best algorithm parameters.
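The backward elimination step used above for attribute selection can be sketched as a greedy loop. The function `backward_elimination` and the synthetic scoring function `toy_score` are assumptions for illustration; `toy_score` does not reproduce the measured accuracies and only stands in for training and testing a classifier on the given attributes.

```python
def backward_elimination(attributes, evaluate):
    """Greedily remove attributes as long as the accuracy improves.

    evaluate(list_of_attributes) must return the classification accuracy.
    """
    selected = list(attributes)
    best_acc = evaluate(selected)
    while len(selected) > 1:
        # Test every dataset obtained by removing one attribute.
        candidates = [
            (evaluate([a for a in selected if a != removed]), removed)
            for removed in selected
        ]
        acc, removed = max(candidates)
        if acc <= best_acc:
            break   # further elimination does not improve the result
        best_acc = acc
        selected.remove(removed)
    return selected, best_acc


ALL_ATTRIBUTES = ["minimum", "maximum", "difference", "mean",
                  "standard deviation", "median", "correlation", "frames"]


def toy_score(attrs):
    """Synthetic accuracy: 'difference' hurts, every other attribute helps."""
    score = 0.90
    if "difference" in attrs:
        score -= 0.02
    missing = [a for a in ALL_ATTRIBUTES if a != "difference" and a not in attrs]
    return score - 0.05 * len(missing)


selected, accuracy = backward_elimination(ALL_ATTRIBUTES, toy_score)
```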
As already mentioned in the previous section, in case of the vehicle classification task four classes of events are considered: empty road, presence of personal car, semitruck, and truck. For the vehicle detection problem two classes are taken into account: empty road and presence of a vehicle. The accuracy (ACC) was calculated as overall accuracy, using the following formula:
$$\mathrm{ACC} = \frac{1}{D} \sum_{i=1}^{n} C_i$$
where n is the number of classes, C_i is the number of items (events) in the test dataset that are correctly assigned to the ith class (event type), and D is the number of items in the test dataset. It should be also noted that the results in Table 1 are presented for the two classification algorithms that provide the best accuracy. These results firmly show that the most accurate vehicle classification and detection was possible when the mobile device is placed opposite the beacons location (in reference position 4). The results confirm the observation that noise in RSSI readings increases with the distance from beacons to mobile device. It should be also noted that the number of RSSI samples that are collected when a vehicle is present between beacons and mobile device decreases with the speed of the vehicle. As a result, lower accuracy is observed for higher speed of vehicles. In the considered test site, the vehicles were slowing down when passing the reference position 1 since this position was close to a crossroad. Thus, the accuracy obtained for reference position 1 is higher than for reference positions 2 and 3.
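The overall accuracy defined above, i.e., the sum of correctly assigned items over all classes divided by the size of the test dataset, can be computed as in the following sketch. The function name and the sample labels are illustrative assumptions.

```python
def overall_accuracy(true_labels, predicted_labels):
    """ACC = (C_1 + ... + C_n) / D for n classes and D test items."""
    if len(true_labels) != len(predicted_labels) or not true_labels:
        raise ValueError("label lists must be non-empty and of equal length")
    classes = set(true_labels) | set(predicted_labels)
    # C_i: number of items correctly assigned to the ith class.
    correct = {
        c: sum(1 for t, p in zip(true_labels, predicted_labels) if t == p == c)
        for c in classes
    }
    return sum(correct.values()) / len(true_labels)


# Synthetic example with D = 4 test items, three of them recognized correctly.
true_events = ["empty", "personal car", "personal car", "truck"]
predicted_events = ["empty", "personal car", "truck", "truck"]
acc = overall_accuracy(true_events, predicted_events)
```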
In further tests, the other approach was considered, which is based on application of multiple receivers and one classifier [22,23]. According to this approach the vehicles were recognized by single classifier, using the dataset collected in four reference positions. Results of these experiments are shown in Table 2. The classification accuracy (ACC) and Cohen's kappa [45] (CK) is compared in Table 2 for all considered classification algorithms and various sizes of the sliding window. When comparing the results in Table 2 with those in Table 1 it can be observed that the RSSI data collected by multiple devices in several locations along the road enable more accurate vehicle classification. Similar experiments were also conducted for the vehicle detection task and the accuracy of 0.935 was achieved.
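Cohen's kappa [45] corrects the observed agreement for the agreement expected by chance. A minimal sketch of its computation is given below; the function name and the sample labels are assumptions, and returning 1.0 when the chance agreement equals 1 is a convention chosen here to avoid division by zero.

```python
from collections import Counter


def cohens_kappa(true_labels, predicted_labels):
    """Cohen's kappa: (p_o - p_e) / (1 - p_e)."""
    d = len(true_labels)
    if d == 0 or d != len(predicted_labels):
        raise ValueError("label lists must be non-empty and of equal length")
    # Observed agreement (equal to the overall accuracy).
    p_o = sum(t == p for t, p in zip(true_labels, predicted_labels)) / d
    # Chance agreement from the marginal class frequencies.
    true_counts = Counter(true_labels)
    pred_counts = Counter(predicted_labels)
    p_e = sum(true_counts[c] * pred_counts[c] for c in true_counts) / d ** 2
    if p_e == 1.0:
        return 1.0   # degenerate case: a single class everywhere
    return (p_o - p_e) / (1.0 - p_e)


# Synthetic example: p_o = 0.75, p_e = 0.5.
kappa = cohens_kappa(
    ["personal car", "personal car", "truck", "truck"],
    ["personal car", "truck", "truck", "truck"],
)
```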
The results in Table 2 firmly show that the size of the sliding window has a significant impact on the accuracy of vehicle detection and classification. Passing vehicles cause a drop in RSSI level. This drop is longer for trucks and shorter for personal cars. In order to correctly recognize the vehicle, the sliding window has to cover the time when RSSI values are reduced. If the sliding window is too narrow, the lower RSSI values may be registered in the entire window for different vehicle classes and thus the classes cannot be correctly recognized. If a single classifier is used, a wider window is also helpful because the drop of RSSI is shifted in time for different reference locations. However, in case of an excessive window size, two successive vehicles can be captured in one window, which results in decreased accuracy of the detection and classification. The best results were obtained by using the random forest classifier with window size of 3 seconds.
The next step of the research was aimed at increasing the accuracy of vehicle detection by using the proposed classifier ensemble in combination with majority voting, as described in Section 3. It should be noted that the proposed method was used with time step of 1 second and d max = 1 meter. During the tests of the ensemble, different ranges of individual member classifiers were taken into account (see Table 3). The input data of individual classifiers were obtained not only from particular reference positions (e.g., Classifier 1 in Ensemble no. 1), but also from a connection of the neighboring positions (e.g., Classifier 1 in Ensemble no. 3). When analyzing the results presented in Table 3 it can be observed that the highest accuracy was achieved for the ensembles of the random forest classifiers. The best ensemble (no. 5) combines the classifiers that are fed with data from two neighboring reference positions (Classifiers 1-3) with the classifier created for reference position 4 (Classifier 4) and the classifier, which utilizes the entire dataset (Classifier 5). Classifier 4 with range [4,4] was included in the ensemble as it provides the best accuracy when using data from single reference position. The high accuracy was also obtained for Ensembles no. 2 and 6. Results of these ensembles are only slightly worse than those for Ensemble no. 5. This fact shows that the proposed approach achieves high vehicle classification and detection accuracy by combining local classifiers (that utilize data from two neighboring reference positions or single reference position) with the global classifier (which makes decisions based on data collected in all reference positions).
It was noted that the random forest algorithm was about 8.5% more effective than KNN. The proposed method achieves an accuracy above 97% for the vehicle detection task and above 94% for the vehicle classification task. This means that the introduced classifier ensemble provides better results than the state-of-the-art methods that utilize individual classifiers (see Tables 1 and 2).
Results obtained for the best classifier ensembles and for the individual (single) classifiers are compared in Figures 9 and 10. The box plots show the minimum, first quartile, median, third quartile, and maximum of the accuracy values for 30 tests. For each test, different training and testing datasets were selected from the measurement data. These results show significant differences in accuracy between the single classifiers and their ensemble counterparts. Similarly, the accuracy differences are significant when comparing the RF classifiers with the KNN classifiers. It should also be noted that the accuracies achieved by the best RF ensembles do not differ significantly. Thus, the selection among these ensembles should be considered a tuning of the proposed method.
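The five values drawn in such a box plot can be computed as in the sketch below. The accuracy values are hypothetical placeholders, not the measured results, and `five_number_summary` is a name introduced here for illustration. The quartiles follow the default convention of Python's `statistics.quantiles`, which may differ slightly from the convention used to draw the figures.

```python
from statistics import quantiles

def five_number_summary(values):
    """Return (minimum, first quartile, median, third quartile, maximum)."""
    q1, median, q3 = quantiles(values, n=4)
    return min(values), q1, median, q3, max(values)

# Hypothetical accuracy values from repeated tests (placeholders only).
accuracies = [0.90, 0.92, 0.94, 0.96, 0.98]
summary = five_number_summary(accuracies)
```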
The higher accuracy of the RF ensemble can be explained by the fact that the RF algorithm has several features that enable effective training of the classifier. According to this algorithm, all decision trees in the forest are created by using randomly selected subsets of the training dataset. The random selection applies to both the events (rows) and the aggregates (columns). Each decision tree further divides the training data into smaller subsets until the subsets are small or all events in these subsets belong to one class. In contrast to RF, the other compared algorithms (including KNN) perform the training procedures using the complete training dataset.
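The random selection of rows and columns for a single tree can be sketched as follows. The function name, the data, and the subset sizes are hypothetical. The sketch follows the per-tree description given above; many random forest implementations instead draw the column subset anew at each split of a tree.

```python
import random

def draw_tree_sample(rows, n_columns, n_selected, rng):
    """Bootstrap the rows and pick a random subset of column indices for one tree."""
    # Rows (events) are sampled with replacement.
    sampled_rows = [rng.choice(rows) for _ in rows]
    # Columns (aggregates) are sampled without replacement.
    columns = sorted(rng.sample(range(n_columns), n_selected))
    return [[row[c] for c in columns] for row in sampled_rows], columns

rng = random.Random(0)  # fixed seed so the sketch is repeatable
# Hypothetical events: each row holds four aggregate values.
rows = [[-60, -65, -75, 3], [-58, -70, -80, 5], [-61, -62, -64, 1]]
subset, columns = draw_tree_sample(rows, n_columns=4, n_selected=2, rng=rng)
```

Each tree would then be trained on its own `subset`, so the trees see different events and different aggregates.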

Conclusions
The proposed vehicle detection and classification approach uses mobile devices (smartphones) and Bluetooth beacons for road traffic monitoring. It allows detecting three classes of vehicles by analyzing the strength of the radio signal received from BLE beacons that are installed at different heights by the road. This approach is suitable for crowd sourcing applications aimed at reducing travel time, congestion, and emissions. Advantages of the introduced method were demonstrated during experimental evaluation in real-traffic conditions. Extensive experiments were conducted to test different classification approaches and data aggregation methods. In comparison with state-of-the-art RSSI-based vehicle detection methods, higher accuracy was achieved by introducing a dedicated ensemble of random forest classifiers with majority voting.
The presented solution can be extended to several beacons installed along the road to obtain information concerning vehicle velocity and direction. Another interesting topic is data preprocessing on mobile devices in order to reduce the communication effort. Finally, additional studies will be necessary to introduce methods that activate the Bluetooth modules and beacons only when necessary and thus reduce the energy consumption.

Data Availability
The data used to support the findings of this study are included within the supplementary information file.

Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this paper.