Trajectory Data Compression Algorithm Based on Motion State Changing

The trajectory information generated by moving objects plays an important role in studying object movement. In this paper, a trajectory data compression algorithm based on changes in motion state is proposed to reduce trajectory data storage space and increase compression speed while accurately showing the motion state and trajectory characteristics. This study is significant for the exploration of massive traffic data and the planning of traffic networks. By combining an angle threshold with a velocity threshold of the moving object, the key data points are found and the redundant information is removed, yielding the compressed trajectory. The experimental results show that the new algorithm improves compression efficiency and that the compressed trajectory is highly similar to the original trajectory in movement tendency.


Introduction
With the development of the economy and technology, mobile devices and the Global Positioning System (GPS) have become popular in various industries [1,2]. In particular, the collection and storage of trajectory information, characterized by time, location, speed, and direction, are growing rapidly. How to compress and process GPS data has therefore become a research hot spot. In 1973, David Douglas and Thomas Peucker presented the classical Douglas-Peucker algorithm, which iteratively deletes points according to the information loss they cause. This algorithm was subsequently improved by many scholars. Hershberger implemented the Douglas-Peucker algorithm for line simplification [3], and Jin et al. studied near-linear time approximation algorithms for curve simplification, reducing the time complexity [4]. In addition, Keogh proposed the Opening Window (OPW) algorithm, which rests on the same idea of simplifying the trajectory by iterative information loss. This algorithm does not examine the whole track in each iteration; it is based on the concept of an "open window", in which the window contains only a portion of the track points at each iteration.
Then, it keeps updating the track information in the window until the simplification of the whole track is completed. This algorithm can therefore be used for online track compression [5]. The sliding window algorithm [6] is similar to the open window algorithm. Its main idea is to start from the beginning of the track, initialize a sliding window of size 1, and gradually expand the window by adding subsequent trajectory points. The segment connecting the first and last track points within the window is taken as the approximating segment. The algorithm computes the perpendicular Euclidean distance between the original track points and this approximating segment; if the distance is less than the predetermined distance threshold, the window continues to grow. This process is repeated until the error within the window exceeds the threshold. However, the abovementioned algorithms make little use of the time information in GPS data. To address this, Meratnia proposed the Top-Down Time Ratio algorithm [7]. This algorithm uses the synchronous Euclidean distance (SED) in place of the perpendicular distance, which takes into account the time information in GPS tracks. In 2012, Coclite et al. [8] proposed storing compressed track objects using the semantic information of the road network instead of track points and performed experimental studies. Yeh et al. used compressed road network link information in conjunction with the times at which the moving objects enter and leave the track [9]. This greatly reduced the data storage required. The Threshold algorithm is another type of algorithm, proposed by Al-Hussaeni et al. [10]. Based on the speed and direction of the moving object at each track point, this algorithm predicts a threshold-bounded region in which the next point may lie. Whether a point is retained or removed is decided by this location prediction.
The Threshold algorithm deletes redundant points based on speed and direction and thus takes the motion state of the trajectory into account, but if the predicted region for the next point is large, more points may be deleted. Nevertheless, it can achieve a high compression ratio while preserving similar track trends and characteristics. For these reasons, this paper constructs a new indicator based on speed and angle, namely, a weighted combination of the two thresholds, to find the key points, and uses statistical sampling to find the most appropriate threshold.
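As background for the time-aware distance mentioned above, the following is a minimal sketch of the synchronous Euclidean distance in its usual formulation: the candidate point is compared with the position on the approximating segment that corresponds to the same timestamp, rather than with its perpendicular foot. The function name and the `(t, x, y)` tuple layout are illustrative assumptions, not notation from the cited works.

```python
import math
from typing import Tuple

# Hypothetical point layout for illustration: (timestamp, x, y).
TimedPoint = Tuple[float, float, float]


def synchronized_euclidean_distance(start: TimedPoint,
                                    end: TimedPoint,
                                    point: TimedPoint) -> float:
    """Distance between `point` and its time-synchronized position
    on the straight segment from `start` to `end`."""
    ts, xs, ys = start
    te, xe, ye = end
    t, x, y = point
    if te == ts:
        # Degenerate segment in time: fall back to distance from the start.
        return math.hypot(x - xs, y - ys)
    # Fraction of the segment's time span elapsed at the candidate point.
    ratio = (t - ts) / (te - ts)
    x_sync = xs + ratio * (xe - xs)
    y_sync = ys + ratio * (ye - ys)
    return math.hypot(x - x_sync, y - y_sync)


# A point that lies on the segment but arrives "too early" in time
# has zero perpendicular distance yet a nonzero synchronized distance.
on_line_but_early = synchronized_euclidean_distance(
    (0.0, 0.0, 0.0), (10.0, 10.0, 0.0), (2.0, 5.0, 0.0))
```

The example illustrates why a time-aware distance can retain points that a purely geometric criterion would discard.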

Compression Algorithm Based on Angle Threshold
The data used in this study are the taxi GPS positioning data of a Chinese megacity. The objective of the data compression algorithm is to show the trajectory form of taxi movement with the least GPS data.
Suppose there are $n$ taxi tracks. $X_i$ denotes the $i$-th track, and each track has $m_i$ points. Each point of a track contains the time, position, speed, and direction recorded by the GPS. $X_i^j$ denotes the $j$-th point of the $i$-th track, where $j = 1, \dots, m_i$, so that $X_i = \{X_i^1, X_i^2, \dots, X_i^{m_i}\}$. Because many data points are collected for each track, data storage and computation speed are greatly affected. Therefore, the track can be compressed by finding key points and removing redundant data, on the premise that the essential features of the track are not lost [11][12][13][14]. To reduce the amount of data, it is necessary to find indicators that decide whether a data point is retained or removed. Intuitively, a large change in the direction of velocity at a track point indicates that the vehicle is turning or changing its track because of a change in the road or an incidental situation [15].
Thus, this paper first considers the direction of velocity (angle) as the indicator for screening data points.
$\theta_\alpha$ is defined as the angle threshold, which is used to eliminate the trajectory points whose change in direction angle, $\Delta\theta$, is less than the threshold; the angle threshold ranges from 5 to 20. According to this principle, starting from the first point of a track, successive points are tested in turn to obtain the compressed trajectory $X_i^H$. To reflect the effect of compression, the compression ratio is defined as follows:

$$Y_i^{\theta_\alpha} = \frac{m_i - m_i^H}{m_i} \times 100\%,$$

where $Y_i^{\theta_\alpha}$ represents the compression rate of the $i$-th track with angle threshold $\theta_\alpha$, and $m_i$ and $m_i^H$ represent the number of track points in the $i$-th track before and after compression, respectively [16].
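The angle-threshold screening and the compression ratio can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: headings are assumed to be in degrees, $\Delta\theta$ is assumed to be measured against the last retained point, and the function names are hypothetical.

```python
from typing import List, Sequence


def heading_change(a: float, b: float) -> float:
    """Smallest absolute difference between two headings, in degrees (0 to 180)."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


def compress_by_angle(headings: Sequence[float], theta_alpha: float) -> List[int]:
    """Return the indices of retained points.

    Assumption: the first point is always kept, and each later point is
    kept only if its heading differs from the last retained point by at
    least theta_alpha.
    """
    if not headings:
        return []
    kept = [0]
    for j in range(1, len(headings)):
        if heading_change(headings[j], headings[kept[-1]]) >= theta_alpha:
            kept.append(j)
    return kept


def compression_ratio(m: int, m_h: int) -> float:
    """Fraction of points removed: (m - m_h) / m."""
    if m <= 0:
        raise ValueError("the original track must contain at least one point")
    return (m - m_h) / m


headings = [0, 2, 4, 12, 13, 30, 31]
kept = compress_by_angle(headings, theta_alpha=10)   # [0, 3, 5]
ratio = compression_ratio(len(headings), len(kept))  # 4/7, about 57%
```

Comparing against the last retained point rather than the immediately preceding point prevents a slow, steady turn from being deleted entirely; whether the original algorithm makes the same choice cannot be determined from the text.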
The angle threshold $\theta_\alpha$ is set to 5, 10, 15, and 20 in turn. The algorithm is verified on nearly one month of running tracks of 2000 taxis in Beijing (an example of the sample database is shown in Table 1); the results indicate that the average compression rate increases as $\theta_\alpha$ increases.
The change in the average compression rate of the 2000 taxis as the angle threshold increases is shown in Figure 1. Figure 1 shows that the average compression rate exceeds 45% and that the time efficiency improves greatly. However, because only the angle information is considered, some points carrying important feature information are deleted, so that the trend of the original motion trajectory changes; an example is the track of the taxi with license plate number "669148", shown in Figure 2.
In Figure 2, the four panels show the taxi track before and after compression for angle threshold values of 5, 10, 15, and 20; the red trace is the original track and the blue trace is the compressed track. The movement trend between points A and B clearly changes significantly after compression, mainly because all the track points between A and B are deleted. Avoiding the deletion of such key data points usually means reducing the compression ratio. Therefore, a new threshold is needed to filter the key data points that must be retained. The track points also record the speed at different times; the magnitude of the rate of change of speed reflects whether traffic is congested or smooth and reveals internal features of the trajectory. If speed were the only indicator, the same situation as with angle-threshold control of key points would be bound to arise: a higher compression ratio could be obtained [17], but some points that reflect track trends would be lost, and the remaining points could not fully represent the characteristics of the track. For these reasons, this paper uses both angle and speed as indicators for selecting the key points of a track [18][19][20].

Compression Algorithm Based on Angle and Speed Threshold
$v_\alpha$ is defined as the speed threshold, which is used to eliminate the trajectory points whose change in speed, $\Delta v$, is less than the threshold; the speed threshold ranges from 5 to 20. According to this principle, starting from the first point of a track, successive points are tested in turn to obtain the compressed trajectory. Where no confusion arises, the compressed track is still denoted $X_i^H$. The compression ratio is again defined as $(m_i - m_i^H)/m_i \times 100\%$, where $m_i$ is as defined above and $m_i^H$ is the number of track points retained after compression under the angle and speed thresholds.
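A minimal sketch of screening with both thresholds is given below. The way the two tests are combined is an assumption: a point is removed only when both its heading change and its speed change are below their thresholds, which is one reading consistent with the lower compression ratio reported for this method. The names and the `(heading, speed)` tuple layout are hypothetical.

```python
from typing import List, Sequence, Tuple

# Hypothetical point layout for illustration: (heading in degrees, speed).
Point = Tuple[float, float]


def heading_change(a: float, b: float) -> float:
    """Smallest absolute difference between two headings, in degrees (0 to 180)."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


def compress_by_angle_and_speed(points: Sequence[Point],
                                theta_alpha: float,
                                v_alpha: float) -> List[int]:
    """Return the indices of retained points.

    Assumption: a point is kept if EITHER its heading change OR its speed
    change, measured against the last retained point, reaches the
    corresponding threshold; it is dropped only if both are below.
    """
    if not points:
        return []
    kept = [0]
    for j in range(1, len(points)):
        ref_heading, ref_speed = points[kept[-1]]
        heading, speed = points[j]
        d_theta = heading_change(heading, ref_heading)
        d_v = abs(speed - ref_speed)
        if d_theta >= theta_alpha or d_v >= v_alpha:
            kept.append(j)
    return kept


track = [(0, 30), (2, 31), (3, 45), (4, 46), (20, 47), (21, 47)]
kept = compress_by_angle_and_speed(track, theta_alpha=10, v_alpha=10)  # [0, 2, 4]
```

In this toy track, point 2 is retained because of its speed change alone; an angle-only test with the same threshold would drop it.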
As shown in Figure 3, the track trends at points A and B before and after compression are well preserved by this method. The last small panel with a grid in Figure 3 shows the tracks before and after compression when $\theta_\alpha = 25$ and $v_\alpha = 30$. The two encircled points in Figure 3 are points A and B of Figure 2. The motion trends at points A and B are clearly retained well even with the larger thresholds, but the compression ratio of this algorithm is lower. The average compression ratio of the 2000 taxi tracks when the angle and speed thresholds are applied together is shown in Figure 4. The highest compression ratio is about 50% and the lowest is only about 20%. We paid more attention to preserving trajectory features and were stringent in selecting data points, so that only a few points satisfy both the speed and angle thresholds. As a result, the similarity of the track shape is higher, but the compression rate is lower. When the angle and speed thresholds change, the compression ratio changes as well, as shown in Figure 4.

Compression Algorithm of Trajectory Data Based on Motion State Change
To obtain a higher compression ratio while retaining important information after compression [21], a new index is introduced: a weighted threshold for velocity and angle, denoted $\theta v_\alpha$. Using the mathematical model established above, each data point of the vehicle trajectory is screened in turn, and the key data points left by the screening form the compressed vehicle trajectory. Where no confusion arises, the compressed trajectory is denoted $X_i^H$ [22].
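One possible reading of the weighted index is sketched below. The exact form of the index is an assumption: here the heading change and the speed change are combined as $\alpha_1 \Delta\theta + \alpha_2 \Delta v$ and compared with $\theta v_\alpha$, with $\alpha_1$ assigned to the angle term. The function names and the `(heading, speed)` layout are hypothetical.

```python
from typing import List, Sequence, Tuple

# Hypothetical point layout for illustration: (heading in degrees, speed).
Point = Tuple[float, float]


def heading_change(a: float, b: float) -> float:
    """Smallest absolute difference between two headings, in degrees (0 to 180)."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


def compress_by_weighted_index(points: Sequence[Point],
                               theta_v_alpha: float,
                               alpha1: float,
                               alpha2: float) -> List[int]:
    """Return the indices of retained points.

    Assumption: a point is kept when
        alpha1 * (heading change) + alpha2 * (speed change) >= theta_v_alpha,
    with both changes measured against the last retained point.
    """
    if not points:
        return []
    kept = [0]
    for j in range(1, len(points)):
        ref_heading, ref_speed = points[kept[-1]]
        heading, speed = points[j]
        score = (alpha1 * heading_change(heading, ref_heading)
                 + alpha2 * abs(speed - ref_speed))
        if score >= theta_v_alpha:
            kept.append(j)
    return kept


track = [(0, 30), (10, 40), (40, 50), (45, 52), (90, 20)]
kept = compress_by_weighted_index(track, theta_v_alpha=30,
                                  alpha1=0.6, alpha2=0.4)  # [0, 2, 4]
```

A single weighted score lets a moderate change in both quantities count as a key point even when neither change alone would reach a separate threshold.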
Trajectory reduction must consider not only efficiency but also how closely the compressed trajectory matches the original. To reflect the similarity of the two trajectories before and after compression, the diversity of the two trajectories is defined. $D_i$ represents the absolute difference of the $i$-th track before and after compression. Because the concern is the change in kinetic trend before and after compression [23], the shaded area enclosed by the compressed trajectory and the original trajectory should not be too large, as shown in Figure 5.
In Figure 5, the red line represents the original trajectory data and the blue line represents the compressed trajectory data. If the key points are selected properly, the shaded area is as small as possible. Based on these considerations, define

$$D_i = \left| S_i - S_i^H \right|,$$

where $S_i$ is the area enclosed by the original trajectory and the horizontal axis and $S_i^H$ is the area enclosed by the compressed trajectory and the horizontal axis.
Because the original trajectories differ in length, the difference between the two tracks is normalized into the relative diversity, $D_i' = D_i / S_i$. Experiments were run on the 2000 trajectories with $\theta v_\alpha = 10, 15, 20, 25$, and $30$ and weights $(\alpha_1, \alpha_2) = (0.1, 0.9), (0.2, 0.8), (0.3, 0.7), (0.4, 0.6), \dots, (0.9, 0.1)$. For each combination, the average trajectory compression ratio and the average relative diversity were calculated. The combination with a high average compression rate and a low average relative diversity was selected as the final threshold and weights. Test results are shown in Table 2. To determine the threshold and weights conveniently, combinations are ranked by the value of P = 1 − the average compression ratio + (1 − the average relative diversity). As shown in Table 2, the P-values for the different weight settings are largest when $\theta v_\alpha = 30$. Therefore, 30 and (0.6, 0.4) were chosen as the threshold and weights. With these values, the average compression ratio exceeds 71%, the average relative diversity is only 10%, and the similarity between the original and compressed trajectories is approximately 90%.
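The relative diversity can be illustrated with a short sketch. It assumes that each trajectory is given as plotted `(x, y)` coordinates and that the area between the trajectory and the horizontal axis is approximated with the trapezoidal rule; both the representation and the function names are illustrative assumptions.

```python
from typing import Sequence, Tuple

# Hypothetical plotted coordinates for illustration: (x, y).
XY = Tuple[float, float]


def area_to_axis(track: Sequence[XY]) -> float:
    """Trapezoidal approximation of the area between a polyline and the x-axis.

    Each segment contributes the unsigned area of its trapezoid.
    """
    return sum(abs(0.5 * (y0 + y1) * (x1 - x0))
               for (x0, y0), (x1, y1) in zip(track, track[1:]))


def relative_diversity(original: Sequence[XY], compressed: Sequence[XY]) -> float:
    """D' = |S - S_H| / S, where S and S_H are the two areas."""
    s = area_to_axis(original)
    s_h = area_to_axis(compressed)
    if s == 0:
        raise ValueError("the original trajectory encloses zero area")
    return abs(s - s_h) / s


original = [(0, 0), (1, 2), (2, 2), (3, 0)]    # S   = 4
compressed = [(0, 0), (1, 2), (3, 0)]          # S_H = 3
d_rel = relative_diversity(original, compressed)  # 0.25
```

Dropping the point (2, 2) changes the enclosed area from 4 to 3, so the relative diversity of this toy example is 25%.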

Conclusions and Discussion
Exploring the patterns in vehicle trajectory data can uncover important road network information, which can support effective decisions and suggestions for reducing road congestion and planning traffic routes. To explore the value of object trajectory information, we study a vehicle trajectory data compression algorithm based on changes in motion state. The advantage of the algorithm is that the compression rate of vehicle trajectory data is high while the vehicle motion state and trajectory characteristics are displayed as accurately as possible. In developing the algorithm, we ran many experiments on taxi GPS track data to explore the impact of the speed threshold and the angle threshold on the compression rate of the track data. Based on the experimental results, we propose a threshold combination algorithm, which improves the data compression rate by changing the threshold parameters and makes the compressed data clearly show the characteristics of the vehicle trajectory. Reference values are as follows:
(1) The angle threshold is set to 5, 10, 15, and 20 in turn. On nearly one month of running tracks of 2000 taxis in Beijing, the average compression rate is more than 45%. However, some points carrying important feature information are deleted, so that the trend of the original motion trajectory changes. Using only the angle threshold to screen the key data points of a vehicle trajectory is therefore inadequate, and a new threshold is needed to screen out the key data points that must be retained. The track points also record the speed at different times; the magnitude of the rate of change of speed reflects whether traffic is congested or smooth and reveals internal features of the trajectory. If speed were the only indicator, the same situation as with angle-threshold control of key points would be bound to arise: a higher compression ratio could be obtained, but some points that reflect track trends would be lost, so that the remaining points could not fully represent the characteristics of the track.
(2) With the angle threshold set to 5, 10, 15, and 20 and the speed threshold set to 5, 10, 15, 20, and 25, the highest compression ratio is about 50% and the lowest is only about 20%. We paid more attention to preserving trajectory features and were stringent in selecting data points, so that only a few points satisfy both the speed and angle thresholds. As a result, the similarity of the track shape is higher, but the compression rate is lower.
(3) A new index is built from the speed threshold and the angle threshold to obtain a higher compression ratio while retaining important information after compression; the index setting and parameter selection are obtained through a large number of experiments. The angle and velocity thresholds are controlled by parameters so that trajectory reduction considers not only efficiency but also how closely the compressed trajectory matches the original. Finally, the vehicle trajectories before and after compression are presented and a similarity analysis is carried out.

Data Availability
The data presented in this study are available upon request from the corresponding author.

Conflicts of Interest
The authors declare that they have no conflicts of interest regarding the publication of this paper.