An Improved Fuzzy Trajectory Clustering Method for Exploring Urban Travel Patterns



Introduction
Travel patterns can be explored by analyzing the travel characteristics of moving objects (vehicles and humans), which reflect people's travel regularity, traffic congestion regularity, and social activity patterns. Travel patterns have been applied in many areas, for instance, providing decision information for urban planning and emergencies [1][2][3][4], analyzing and optimizing paths to provide personalized travel recommendations for residents [5][6][7][8], vehicle dispatching [9,10], and station optimization and selection [11]. These applications can provide insights for urban construction and development.
Nowadays, more and more researchers explore urban travel patterns using large-scale trajectory data, which contain a large amount of hidden information about travel features and regularity. Several approaches have been used for this application. (1) Clustering methods: trajectory clustering has been gaining increasing interest in recent years, and it generally requires two components, a similarity measurement and a clustering algorithm [12][13][14][15]. (2) Spatial statistics: Ni et al. [16] employed a spatial econometric model for travel flow analysis to explore factors that influence travel demand; Zhang et al. [17] proposed a Bayesian hierarchical approach for modeling destination choice behavior considering unavailable factors and spatio-temporal correlations; Kamruzzaman et al. [18] estimated the effects of urban form and spatial biases on residential mobility. (3) Deep learning: deep learning methods, especially nonparametric ones, have been shown to give more accurate predictions in urban traffic forecasting [19,20]. (4) Classical traffic theory models: some classical models have been used in travel pattern analysis, for instance, a combined Markov chain and multinomial logit model [21] to improve the accuracy of travel destination prediction and traffic assignment models [22] for O-D matrix estimation.
Among the abovementioned methods, clustering is an unsupervised learning process that partitions datasets on the basis of similarity. By defining a reasonable similarity criterion, clustering can effectively extract hidden information from massive trajectory data and thus reveal travel patterns. Some researchers adopt traditional clustering methods for travel pattern analysis, such as C-means [23], shared nearest neighbor clustering [24], DBSCAN [25,26], hierarchical clustering [27,28], and the OPTICS algorithm [29,30]. Other researchers have proposed modified clustering methods to obtain better clustering results: a modified bee colony optimization (MBCO) [31], which introduces an approach based on probability selection; a hybrid model fusing k-means and fuzzy c-means clustering with a modified cluster centroid (FKMFCM-MCC) [32]; and a modified find density peaks (MFDP) algorithm presented by Wang et al. [33], which transforms high-dimensional points into two dimensions and shows good potential for application. The literature mentioned above provides practical applications of travel pattern analysis, and several classical or modified clustering methods have been shown to achieve good clustering results when exploring residents' travel patterns. However, some limitations of trajectory clustering for travel pattern analysis remain to be addressed. Firstly, as for similarity measurement, many studies consider only the spatial attribute of trajectories, and fewer consider multiple attributes (such as temporal, directional, and other characteristics); ignoring these other attributes leads to inaccurate results and unreasonable travel patterns. Besides, many researchers who consider multiple attributes do not assign weights to the different attributes. Configuring and discussing the weights is necessary because different attributes have different influences on trajectory clustering. Secondly, trajectory data have faint and overlapping borders; thus, an efficient algorithm with soft constraints is required to handle such datasets, whereas many researchers follow the idea of hard-partition clustering. Finally, many works focus on improving the clustering algorithm to reduce computational complexity and obtain better results, but few use multiple indices to evaluate the results quantitatively.
Fuzzy clustering methods are able to identify clusters with variable density distributions and partially overlapping borders [34]. In fuzzy clustering methods, the framework of fuzzy set theory is used to make up for the shortcomings of crisp clustering algorithms. Generally, these methods realize soft constraints by introducing weight entropy, a membership function, or a fuzzy distance function. Currently, fuzzy clustering methods have been applied in many areas, such as predictive models: Seresht et al. [35] proposed a fuzzy clustering algorithm and assigned weights to the fuzzy inference systems to improve the accuracy of predictive models; virus research: Mahmoudi et al. [36] used a fuzzy clustering technique to compare the spread of COVID-19 in many countries; image segmentation: Wu and Chen [37] proposed a novel fuzzy clustering method to improve the robustness of existing picture clustering. Moreover, since trajectory data are unlabeled, some internal evaluation indices are adopted in the analysis of clustering results: silhouette coefficient [38,39]; Davies-Bouldin index [40]; Calinski-Harabasz index [41]; Q-Measure [42]; Dunn's index [39,43].
The main contributions of this paper are as follows. Firstly, an improved fuzzy clustering method based on DBSCAN is proposed to cluster taxi trajectory data, in which fuzzy theory is introduced to define membership functions. It differs from the traditional DBSCAN algorithm and can effectively deal with taxi trajectory data, which have faint and overlapping borders. The greatest advantage of the proposed method is that it introduces soft constraints, which allow trajectories to be divided into reasonable clusters. By contrast, traditional DBSCAN uses a hard constraint: when a trajectory could belong to either cluster 1 or cluster 2, the algorithm assigns it to the first cluster, which is unreasonable. Secondly, we define the trajectory distance as a combination of spatial, temporal, and directional attributes to measure the similarity between trajectories. Besides, the weights of the different attributes in the trajectory distance function are optimized to obtain better results. Finally, several internal evaluation indices are used in the model comparison, and the results show the effectiveness and advantage of the proposed method compared with other classical clustering methods.
The remainder of this paper is organized as follows. Section 2 introduces the framework and theory of the proposed method. In Section 3, the data description and the analysis of the case results for Shenzhen city are introduced. In Section 4, the compared approaches are evaluated, and the results are then discussed. In Section 5, we summarize the conclusions of this paper.

Framework.
The framework of the proposed method for taxi trajectory clustering is shown in Figure 1. The framework mainly contains three aspects: (1) after preprocessing the initial trajectory data, the trajectory distance is calculated as a combination of spatial, temporal, and directional distance using the weight coefficients; (2) TC-FDBSCAN is adopted to cluster the trajectories, which requires determining the weight coefficients and the other algorithm parameters; (3) three indices are used to evaluate the clustering results.

Multicharacteristics Similarity Measurement.
A trajectory is denoted by $TR$ in this paper: $TR = \{(p_1, t_1), (p_2, t_2), \ldots, (p_k, t_k)\}$, where $p_k = (p_k^x, p_k^y)$ denotes the location of track point $k$ (abscissa $p_k^x$ and ordinate $p_k^y$, obtained by converting the longitude and latitude to plane coordinates), and $t_k$ denotes the recording time of track point $k$. Next, we introduce measurements of multicharacteristic similarity between two trajectories $TR_A$ and $TR_B$, where $TR_A = \{(p_{A1}, t_{A1}), (p_{A2}, t_{A2}), \ldots, (p_{Ai}, t_{Ai})\}$ and $TR_B = \{(p_{B1}, t_{B1}), (p_{B2}, t_{B2}), \ldots, (p_{Bj}, t_{Bj})\}$.

Spatial Distance between Trajectories.
Determining a rule to calculate the spatial distance between the track points of two trajectories is the key to measuring the spatial similarity of trajectories. The Hausdorff distance is a common distance measurement between two point sets and can be applied to trajectory datasets. Figure 2 shows the Hausdorff distance between two trajectories.
The traditional Hausdorff distance calculates the Max-Min distance, which can only measure the dispersion of the distance between trajectories, and it is susceptible to the local shapes of trajectories. In order to improve its robustness to local effects, a modified Hausdorff distance is defined as follows: where $n_i$ denotes the number of track points in $TR_A$, $n_j$ denotes the number of track points in $TR_B$, $dist(p_{Ai}, p_{Bj})$ is the Euclidean distance between points $p_{Ai}$ and $p_{Bj}$, and $H(TR_A, TR_B)$ is an absolute Hausdorff distance, which denotes the spatial distance between two trajectories.
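The equation itself is not reproduced in this text, so the following is only a minimal sketch under an assumption: it uses the common mean-of-minimum form of the modified Hausdorff distance (average each point's distance to the nearest point of the other trajectory, then take the larger of the two directed values). The function names are illustrative and not taken from the paper.

```python
import math

def directed_mean_min(tr_a, tr_b):
    """Mean, over the points of tr_a, of the distance to the nearest point of tr_b."""
    return sum(min(math.dist(p, q) for q in tr_b) for p in tr_a) / len(tr_a)

def modified_hausdorff(tr_a, tr_b):
    """Assumed form: the larger of the two directed mean-of-minimum distances."""
    return max(directed_mean_min(tr_a, tr_b), directed_mean_min(tr_b, tr_a))

# Two parallel trajectories in plane coordinates, one unit apart.
tr_a = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
tr_b = [(0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]
spatial = modified_hausdorff(tr_a, tr_b)
```

Averaging the minima, rather than taking their maximum, is what reduces the influence of a single outlying track point.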

Temporal Distance between Trajectories.
A trajectory is represented as a time interval, which is bounded by the recording times of its starting and ending points. Considering the influence of trajectory duration, the temporal distance between two trajectories is defined as follows: Only if the starting time and ending time of two trajectories are the same ($t_{A1} = t_{B1}$ and $t_{Ai} = t_{Bj}$) is the temporal distance 0, which means that the two trajectories are completely similar in their temporal characteristic. When two trajectories are separated in their temporal characteristic, they are not completely similar and have a temporal distance $Tem(TR_A, TR_B)$ between 0 and 1.
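The defining equation is not available in this text. As a hedged illustration only, the sketch below uses one interval-based form that satisfies the stated properties (0 only for identical start and end times, otherwise a value in (0, 1]): one minus the ratio of the overlap to the total span of the two intervals. This form and the function name are assumptions, not the paper's definition.

```python
def temporal_distance(start_a, end_a, start_b, end_b):
    """Assumed form: 1 - (overlap of the two intervals) / (total span covered)."""
    overlap = max(0.0, min(end_a, end_b) - max(start_a, start_b))
    span = max(end_a, end_b) - min(start_a, start_b)
    if span == 0:
        return 0.0  # both trajectories collapse to the same instant
    return 1.0 - overlap / span

# Times in minutes since midnight (illustrative values).
same = temporal_distance(480, 500, 480, 500)
partial = temporal_distance(480, 500, 490, 510)
disjoint = temporal_distance(480, 500, 600, 620)
```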

Directional Distance between Trajectories.
The linear directional mean is commonly used to describe the trend or average direction of a set of lines. On the basis of the linear directional mean, for trajectory $TR_A$, treating each trajectory segment (which consists of two consecutive points in $TR_A$) as a line, its average direction is defined as follows: where $\theta'_{Ak}$ is the real direction of the trajectory segment from $(p_{Ak}, t_{Ak})$ to $(p_{Ak+1}, t_{Ak+1})$, measured as the angle rotated counterclockwise from due east, $\theta'_{Ak} \in [0°, 360°]$. The angle of linear directional mean between two trajectories is defined as follows:
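Since the two equations are not reproduced here, the following sketch relies on the standard linear directional mean (the angle of the summed unit direction vectors) and assumes that the angle between two trajectories is the smaller of the two arcs between their mean directions. Function names are illustrative.

```python
import math

def segment_angles(points):
    """Direction of each segment in degrees, counterclockwise from due east."""
    return [math.degrees(math.atan2(y2 - y1, x2 - x1)) % 360.0
            for (x1, y1), (x2, y2) in zip(points, points[1:])]

def directional_mean(points):
    """Standard linear directional mean: angle of the summed unit vectors."""
    angles = [math.radians(a) for a in segment_angles(points)]
    sin_sum = sum(math.sin(a) for a in angles)
    cos_sum = sum(math.cos(a) for a in angles)
    return math.degrees(math.atan2(sin_sum, cos_sum)) % 360.0

def directional_distance(points_a, points_b):
    """Assumed form: smaller arc between the two mean directions, in [0, 180]."""
    diff = abs(directional_mean(points_a) - directional_mean(points_b))
    return min(diff, 360.0 - diff)

east = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
north = [(0.0, 0.0), (0.0, 1.0), (0.0, 2.0)]
west = [(2.0, 0.0), (1.0, 0.0), (0.0, 0.0)]
```

Using `atan2` on the summed sines and cosines avoids the quadrant corrections that a plain arctangent of their ratio would need.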

Trajectory Distance.
Firstly, the spatial and directional distances need to be Min-Max normalized and converted to dimensionless values, while the temporal distance does not need to be normalized because its value is already between 0 and 1; then, the corresponding distances can be defined as follows: where $H$ and $c$ are the sets of spatial and directional distances between all pairs of trajectories, respectively. Combining the spatial, temporal, and directional distances, the trajectory distance is defined considering the influence of weights: where $\alpha$, $\beta$, and $\omega$ are weight coefficients.
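A minimal sketch of this step is given below. It assumes that the trajectory distance is the weighted sum of the three components and that $\alpha$, $\beta$, and $\omega$ weight the spatial, temporal, and directional terms in that order (the order in which they are listed); neither assumption can be checked against the equations, which are missing from this text.

```python
def min_max(values):
    """Min-Max normalize a list of distances to [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def trajectory_distances(spatial, temporal, directional, alpha, beta, omega):
    """Assumed form: alpha * spatial' + beta * temporal + omega * directional'."""
    spatial_n = min_max(spatial)
    directional_n = min_max(directional)
    return [alpha * s + beta * t + omega * d
            for s, t, d in zip(spatial_n, temporal, directional_n)]

# One entry per trajectory pair (illustrative values).
spatial = [100.0, 300.0, 500.0]      # metres
temporal = [0.2, 0.4, 1.0]           # already in [0, 1]
directional = [0.0, 90.0, 180.0]     # degrees
tr_dist = trajectory_distances(spatial, temporal, directional, 0.5, 0.3, 0.2)
```

With weights that sum to 1, the combined distance stays in [0, 1], which is consistent with the later statement that the maximum trajectory distance equals 1.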

Trajectory Clustering Method TC-FDBSCAN.
In this section, based on the classical DBSCAN method, an improved TC-FDBSCAN (trajectory clustering method based on fuzzy density-based spatial clustering of applications with noise) method is proposed for trajectory data clustering. In this method, membership functions are introduced to achieve fuzziness, and the parameters $MinTRs$ (minimum number of trajectories in a neighborhood) and $\varepsilon$ (radius of a neighborhood) of classical DBSCAN are replaced by $MinTRs_{min}$ (minimum value of the minimum number of trajectories in a neighborhood), $MinTRs_{max}$ (maximum value of the minimum number of trajectories in a neighborhood), $\varepsilon_{min}$ (minimum radius), and $\varepsilon_{max}$ (maximum radius).
Several extended definitions of the modified method are described in detail as follows.
Definition 1. Neighborhood: the region within a definite radius of a trajectory.
Definition 2. Core trajectory: a trajectory whose neighborhood contains more trajectories than a definite value.

Definition 3. The local density of a trajectory: where $neighbor(TR, \varepsilon_{max}) = \{TR_l : |TRDist(TR, TR_l)| < \varepsilon_{max}\}$ and $den(TR)$ denotes the number of trajectories in the neighborhood, counted with membership degree $\mu_{\varepsilon}(TR, TR_l)$.
Definition 4. $\varepsilon$ membership function: This membership function considers the fuzziness of the neighborhood, so that a trajectory has a specified membership degree in the fuzzy neighborhood of another trajectory.
Definition 5. $MinTRs$ membership function: This membership function considers the fuzziness of the core trajectory, so that the number of trajectories in the fuzzy neighborhood has a specified membership degree.
Definition 6. Core membership degree: if $\mu_{MinTRs}(den(TR)) > 0$, then trajectory $TR$ is a fuzzy core trajectory, and it belongs to a cluster with core membership degree $fuzzycore(TR) = \mu_{MinTRs}(den(TR))$.

Definition 7. Border membership degree: if $\mu_{MinTRs}(den(TR)) = 0$, then trajectory $TR$ should be a border or noise trajectory. Then, $TR$ can be a fuzzy border trajectory which belongs to a cluster with border membership degree $fuzzyborder(TR) = \min_{TR_l \in neighborcore(TR)} \min\left(\mu_{MinTRs}(den(TR_l)), \mu_{\varepsilon}(TR, TR_l)\right)$, where $neighborcore(TR) = \{TR_l : \mu_{MinTRs}(den(TR_l)) > 0 \text{ and } \mu_{\varepsilon}(TR, TR_l) > 0\}$. The trajectory is a noise trajectory if it is neither a fuzzy core trajectory nor a fuzzy border trajectory.
Definition 8. Directly density-reachable: if trajectory TR j is in the fuzzy neighborhood of trajectory TR i and TR i is a fuzzy core trajectory, then TR j is directly density-reachable to TR i .
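The membership degrees in Definitions 3 to 7 can be sketched as follows. The two membership equations are not reproduced in this text, so the sketch assumes linear ramps between the minimum and maximum thresholds, as is common in fuzzy DBSCAN variants, and assumes that a trajectory is not counted in its own neighborhood. All function names are hypothetical.

```python
def mu_eps(dist, eps_min, eps_max):
    """Assumed ramp: 1 inside eps_min, 0 beyond eps_max, linear in between."""
    if dist <= eps_min:
        return 1.0
    if dist >= eps_max:
        return 0.0
    return (eps_max - dist) / (eps_max - eps_min)

def mu_min_trs(density, min_trs_min, min_trs_max):
    """Assumed ramp: 0 up to min_trs_min, 1 from min_trs_max, linear in between."""
    if density >= min_trs_max:
        return 1.0
    if density <= min_trs_min:
        return 0.0
    return (density - min_trs_min) / (min_trs_max - min_trs_min)

def fuzzy_degrees(dist_matrix, eps_min, eps_max, min_trs_min, min_trs_max):
    """Return (density, core degree, border degree) for every trajectory."""
    n = len(dist_matrix)
    mu = [[0.0 if i == j else mu_eps(dist_matrix[i][j], eps_min, eps_max)
           for j in range(n)] for i in range(n)]
    den = [sum(row) for row in mu]  # fuzzy count of neighbors (self excluded)
    core = [mu_min_trs(d, min_trs_min, min_trs_max) for d in den]
    border = []
    for i in range(n):
        if core[i] > 0:
            border.append(0.0)  # fuzzy core trajectories carry a core degree only
            continue
        degrees = [min(core[j], mu[i][j]) for j in range(n)
                   if core[j] > 0 and mu[i][j] > 0]
        border.append(min(degrees) if degrees else 0.0)  # 0.0 means noise
    return den, core, border

# Three mutually close trajectories and one on the edge of the first one's neighborhood.
dist_matrix = [
    [0.0, 0.1, 0.1, 0.5],
    [0.1, 0.0, 0.1, 0.9],
    [0.1, 0.1, 0.0, 0.9],
    [0.5, 0.9, 0.9, 0.0],
]
den, core, border = fuzzy_degrees(dist_matrix, 0.4, 0.6, 1, 2)
```

In this toy example the fourth trajectory is not a fuzzy core trajectory, but it lies in the fuzzy neighborhood of the first one, so it receives a border membership degree of about 0.5 instead of being rejected outright.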

Journal of Advanced Transportation
Definition 9. Density-reachable: if there is a chain $TR_i, \ldots, TR_{n-1}, TR_n, \ldots, TR_j$ in which each $TR_n$ is directly density-reachable to its predecessor $TR_{n-1}$, then $TR_j$ is density-reachable to $TR_i$.
Definition 10. Density-connected: if $TR_k$ is density-reachable to both $TR_i$ and $TR_j$, then $TR_i$ and $TR_j$ are density-connected.
The aim of TC-FDBSCAN is to divide the regions with high density into clusters, which are the largest sets of density-connected trajectories. The major difference from classical DBSCAN is that a trajectory is determined to be a core trajectory or a border trajectory according to its membership degree, which expresses the possibility that the trajectory belongs to a cluster. The process, which is similar to that of classical DBSCAN, is as follows:
(1) Randomly select a trajectory to visit. If it is a fuzzy core trajectory, then add it to a new cluster with the membership degree computed by equation (11) and add the other trajectories in its fuzzy neighborhood to an alternate set.
(2) Visit the other trajectories in the alternate set; if they satisfy the fuzzy core trajectory condition, then add them to the original cluster with the membership degree computed by equation (11) and add the trajectories in their fuzzy neighborhoods to the alternate set; if they do not satisfy the fuzzy core trajectory condition, then add them to the original cluster as fuzzy border trajectories with the membership degree computed by equation (12).
A simple approach is to let users choose two percentages, $per_{min}$ and $per_{max}$, and then use $per_{min} \cdot \max\{TRDist\}$ and $per_{max} \cdot \max\{TRDist\}$ as the values of $\varepsilon_{min}$ and $\varepsilon_{max}$, where $\max\{TRDist\}$ denotes the maximum distance between all pairs of trajectories. For $MinTRs_{min}$ and $MinTRs_{max}$, a curve can be drawn in which the x-coordinate is $\varepsilon$ and the y-coordinate is the number of trajectories in the set whose distance from each other is equal to $\varepsilon$. This curve is not monotonically decreasing; the y-coordinate values at its first two bends can then be selected as $MinTRs_{min}$ and $MinTRs_{max}$.
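The percentage rule for the two radii is simple enough to state as code. The sketch below takes a symmetric matrix of pairwise trajectory distances; the function name `eps_bounds` is illustrative.

```python
def eps_bounds(dist_matrix, per_min, per_max):
    """Scale the largest pairwise trajectory distance by the two percentages."""
    max_dist = max(max(row) for row in dist_matrix)
    return per_min * max_dist, per_max * max_dist

# Pairwise trajectory distances whose maximum is 1 (illustrative values).
dist_matrix = [
    [0.0, 0.3, 1.0],
    [0.3, 0.0, 0.7],
    [1.0, 0.7, 0.0],
]
eps_min, eps_max = eps_bounds(dist_matrix, 0.40, 0.60)
```

When the maximum pairwise distance is 1, the two radii are simply the percentages themselves.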

Cluster Evaluation Indices.
In many studies of cluster evaluation, the experimental data are labeled and the cluster to which each sample belongs is known in advance, so accuracy-based measures such as Purity, Rand Index, and Accuracy can be used. Trajectory data, however, are unlabeled, and there is no external information that can be used to verify the clustering results. Therefore, several internal evaluation indices, which consider the geometric structure of the data, are required to evaluate the clustering results.
Silhouette coefficient (SC) [36]: where $n_C$ is the number of clusters, $C_i$ denotes the $i$th cluster, $n_{C_i}$ denotes the number of trajectories in $C_i$, and $TR_{C_i}$ denotes the center of $C_i$. A larger value of SC indicates better clustering results.
Davies-Bouldin index (DB) [44]: where $C_h$ denotes the $h$th cluster, $n_{C_h}$ denotes the number of trajectories in $C_h$, and $TR_{C_h}$ denotes the center of $C_h$. A smaller value of DB indicates better clustering results.
Calinski-Harabasz index (CH) [45]: where $n_{TR}$ denotes the number of all trajectories in the dataset and $TR_C$ denotes the center of the dataset. A larger value of CH indicates better clustering results.
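As an illustration of an internal index computed from pairwise trajectory distances, the sketch below implements the standard pairwise silhouette coefficient. This is an assumption: the paper's SC equation is not reproduced in this text and appears to be written in terms of cluster centers, so its exact form may differ. The function expects at least two clusters.

```python
def silhouette(dist_matrix, labels):
    """Standard mean silhouette over all samples; requires at least two clusters."""
    clusters = {}
    for idx, label in enumerate(labels):
        clusters.setdefault(label, []).append(idx)
    scores = []
    for i, label in enumerate(labels):
        own = [j for j in clusters[label] if j != i]
        if not own:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        # a: mean distance to the other members of the same cluster
        a = sum(dist_matrix[i][j] for j in own) / len(own)
        # b: smallest mean distance to the members of any other cluster
        b = min(sum(dist_matrix[i][j] for j in members) / len(members)
                for other, members in clusters.items() if other != label)
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)

# Two tight, well-separated pairs of trajectories.
dist_matrix = [
    [0.0, 0.1, 0.9, 0.9],
    [0.1, 0.0, 0.9, 0.9],
    [0.9, 0.9, 0.0, 0.1],
    [0.9, 0.9, 0.1, 0.0],
]
sc_good = silhouette(dist_matrix, [0, 0, 1, 1])
sc_bad = silhouette(dist_matrix, [0, 1, 0, 1])
```

The correct grouping scores close to 1, while the mismatched grouping scores below 0, which matches the rule that a larger SC indicates a better clustering.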

Data Description.
The research area of the experiments is Shenzhen City, China. Shenzhen city consists of nine administrative districts (Futian, Luohu, Nanshan, Yantian, Baoan, Longgang, Longhua, Pingshan, and Guangming) and one functional district (Dapeng), and it is located between 113°46′ E and 114°37′ E and between 22°27′ N and 22°52′ N. In this study, taxi trajectory data are used in the experiments; they are composed of a series of sample points collected by vehicular GPS equipment.
Each sample point includes the license plate, location (latitude and longitude), recording time, instantaneous speed, and state (0 represents vacant and 1 represents occupied). We use the data collected by 1000 taxis during one week in May 2019, and we divide them into two groups on which experiments are conducted separately: weekdays (from May 13th to May 17th) and weekends (from May 18th to May 19th). In order to reduce the influence of abnormal points on the clustering results, the original data are first preprocessed, which mainly includes eliminating the outliers (latitude and longitude are 0 or out of the valid range, or the state is neither 0 nor 1) and interpolating the missing points (recording time interval discontinuity). The travel time and travel distance of trajectories on workdays and weekends can be calculated, respectively, as shown in Figure 3. The percentage represents the ratio of the number of trajectories in the interval to the number of all trajectories. It can be concluded that, on both workdays and weekends, most passengers take taxis for a short (0 to 5 km) or medium (5 to 15 km) travel distance, and the trip takes no more than 20 minutes. Few passengers travel for more than 1 hour by taxi.
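The outlier-elimination rule can be sketched as a simple filter. The sketch assumes that the valid coordinate range is the bounding box of the study area given above and that each sample point is a dictionary with the hypothetical keys `lon`, `lat`, and `state`; the actual record layout of the dataset is not described.

```python
# Bounding box of the study area, converted from degrees and minutes.
LON_MIN, LON_MAX = 113 + 46 / 60, 114 + 37 / 60
LAT_MIN, LAT_MAX = 22 + 27 / 60, 22 + 52 / 60

def is_valid(point):
    """Keep points inside the study area whose state is 0 (vacant) or 1 (occupied)."""
    return (LON_MIN <= point["lon"] <= LON_MAX
            and LAT_MIN <= point["lat"] <= LAT_MAX
            and point["state"] in (0, 1))

def drop_outliers(points):
    return [p for p in points if is_valid(p)]

samples = [
    {"lon": 114.05, "lat": 22.54, "state": 1},  # valid
    {"lon": 0.0, "lat": 0.0, "state": 1},       # zero coordinates
    {"lon": 114.05, "lat": 22.54, "state": 2},  # invalid state
    {"lon": 116.40, "lat": 39.90, "state": 0},  # outside the study area
]
cleaned = drop_outliers(samples)
```

Points with zero coordinates are rejected by the range check, so no separate test for zeros is needed.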

Parameter Configuration.
In the proposed method, the parameters $\alpha$, $\beta$, $\omega$, $\varepsilon_{min}$, $\varepsilon_{max}$, $MinTRs_{min}$, and $MinTRs_{max}$ need to be determined to obtain reasonable clustering results. In parameter configuration and comparison, we adjust the parameters and select a reasonable parameter combination according to the Silhouette Coefficient (SC) of the clustering results. Firstly, several combinations of weight coefficients are set up to compare the clustering results, and the trajectory distance $TRDist$ is calculated for each. Secondly, two percentages, $per_{min} = 40\%$ and $per_{max} = 60\%$, are chosen according to the approach mentioned in Section 2.3; then, $\varepsilon_{min}$ and $\varepsilon_{max}$ can be determined from the value of $\max\{TRDist\}$. In this part, $\max\{TRDist\}$ is equal to 1 in these combinations, so we select $\varepsilon_{min} = 0.4$ and $\varepsilon_{max} = 0.6$. Thirdly, the statistical curve is drawn to select $MinTRs_{min}$ and $MinTRs_{max}$ as mentioned in Section 2.3. For workday data, $MinTRs_{min} = 1876$ and $MinTRs_{max} = 2353$, as shown in Figure 4(b), while, for weekend data, $MinTRs_{min} = 782$ and $MinTRs_{max} = 986$, as shown in Figure 5(b). Finally, the optimal parameter combination is the one for which the value of SC is largest among all combinations.
Next, the process of parameter configuration is demonstrated for workday data and weekend data, respectively. In total, we compare twenty combinations of weight coefficients, and the detailed results, including the SC, the number of clusters, and the number of noise trajectories of each clustering result, are shown in Tables 1 and 2. The points with the largest value of SC can be observed in Figures 4(a) and 5(a). These points indicate that the clustering results under the corresponding weight combination are the best for the case study, and the best results are highlighted in bold in Tables 1 and 2. Finally, for workday data, we select $\alpha = 0.5$, $\beta = 0.3$, $\omega = 0.2$, $\varepsilon_{min} = 0.4$, $\varepsilon_{max} = 0.6$, $MinTRs_{min} = 1876$, and $MinTRs_{max} = 2353$, while, for weekend data, we select $\varepsilon_{max} = 0.6$, $MinTRs_{min} = 782$, and $MinTRs_{max} = 986$.

Clustering Results.
Based on the selected parameters, we then adopt the proposed method to cluster the taxi trajectory data in Shenzhen city. Figure 6 shows the travel patterns of the taxi trajectory data collected on workdays and weekends. Tables 3 and 4 show the detailed temporal and directional information of each cluster. The value of the main direction is the angle measured counterclockwise from due east.
We can draw some findings by comparing the final results between workday and weekend data. Several common phenomena are observed as follows. (1) ... Figure 6(a) do not have a concentrated pattern, while Cluster 4 concentrates on Guangming district with a southeastward travel trend, and Cluster 5 concentrates on Futian district with an eastward travel trend). (4) Clusters with opposite directions are found in both the workday and weekend results (for example, in Figure 6(a) and Table 3 ...).
There are also obvious differences between workdays and weekends. Firstly, the clustering results for weekend data are better than those for workday data according to the value of SC (workday: 0.6382 and weekend: 0.6418). Secondly, on weekends, the clusters cover a wider range, and residents tend to travel to the suburbs (for example, in Figure 6(b), Cluster 2 and Cluster 3 reflect a larger travel range, and the trajectories on roads including Pingshan avenue and Pingkui road are added to the final clusters). Thirdly, in contrast to workdays, some clusters reflect night travel on weekends (for example, in Figure 6(b) and Table 4, the trajectories in Cluster 4 concentrate on the night period from 22:00 to 0:30).

Discussion
In order to verify the effectiveness of the proposed method, we compare it with other commonly used clustering methods in this section. The case data are 10,000 trajectories randomly selected from the original data (from May 13th to May 19th). Three internal evaluation indices, namely the Silhouette Coefficient, the Davies-Bouldin index, and the Calinski-Harabasz index, as mentioned in Section 2.4, are adopted to evaluate the clustering results.

Hard C-Means (HCM).
HCM, or the K-means clustering algorithm, is an iterative cluster analysis algorithm. In this method, the data are to be divided into C groups, and C objects are randomly selected as the initial clustering centers. Then, the distance between each object and every clustering center is calculated, and each object is assigned to its nearest clustering center. A clustering center and the objects assigned to it represent a cluster. Once every sample has been assigned, the clustering center of each cluster is recalculated from the objects currently in the cluster. This process is repeated until a termination condition is met. The termination condition can be that no (or a minimum number of) objects are reassigned to different clusters, that no (or a minimum number of) cluster centers change, or that the error sum of squares reaches a local minimum.

Fuzzy C-Means (FCM).
Fuzzy C-means clustering is a clustering algorithm which uses a membership degree to express the degree to which each object belongs to a certain cluster. It is an improvement of the earlier HCM clustering method. FCM divides the original data into C fuzzy groups and finds the clustering center of each group so as to minimize the value function of the nonsimilarity index. The main difference between FCM and HCM is that FCM applies a fuzzy division, so that the assignment of each given trajectory to a cluster is expressed by a membership degree between 0 and 1. With the introduction of fuzzy division, the membership matrix is normalized so that the membership degrees of each object over all clusters always sum to 1.
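The normalization property can be seen in the standard FCM membership update, sketched below for a single object given its distances to the cluster centers. The fuzzifier `m = 2` is a common default and an assumption here; the value used in the comparison is not stated.

```python
def fcm_memberships(dists_to_centers, m=2.0):
    """Standard FCM update: u_c is proportional to d_c ** (-2 / (m - 1))."""
    zero = [c for c, d in enumerate(dists_to_centers) if d == 0]
    if zero:
        # The object coincides with one or more centers: share full membership among them.
        return [1.0 / len(zero) if c in zero else 0.0
                for c in range(len(dists_to_centers))]
    power = 2.0 / (m - 1.0)
    inverse = [d ** -power for d in dists_to_centers]
    total = sum(inverse)
    return [v / total for v in inverse]

# An object three times farther from the second center than from the first.
memberships = fcm_memberships([1.0, 3.0])
```

Dividing by the total is what enforces that the memberships of one object sum to 1 across clusters.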

Agglomerative Nesting Algorithm (AGNES).
The agglomerative nesting algorithm adopts a bottom-up strategy. Each object is initially treated as a cluster, and these clusters are then merged step by step according to some criterion. The distance between two clusters can be determined by the similarity of the closest pair of data in the two different clusters. The merging process is repeated until the required number of clusters is reached.
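The bottom-up merging described above can be sketched with a naive implementation over a precomputed distance matrix. It uses the closest-pair (single linkage) criterion mentioned in the text; the function name and the brute-force search are illustrative choices, not an efficient implementation.

```python
def agnes_single_linkage(dist_matrix, n_clusters):
    """Merge the two closest clusters until n_clusters remain."""
    clusters = [[i] for i in range(len(dist_matrix))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Cluster distance = distance of the closest pair of members.
                d = min(dist_matrix[i][j] for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a].extend(clusters[b])
        del clusters[b]
    return [sorted(c) for c in clusters]

dist_matrix = [
    [0.0, 0.1, 0.9, 0.9],
    [0.1, 0.0, 0.9, 0.9],
    [0.9, 0.9, 0.0, 0.1],
    [0.9, 0.9, 0.1, 0.0],
]
groups = agnes_single_linkage(dist_matrix, 2)
```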

DBSCAN.
DBSCAN is a density-based clustering algorithm, which generally assumes that clusters can be determined by the density of the sample distribution. Samples of the same cluster are closely related to each other, and for any sample of a cluster there must be other samples of the same cluster not far away. A cluster is obtained by grouping closely related samples together. The final result is obtained by dividing all groups of closely related samples into different clusters. The definitions of the DBSCAN algorithm are similar to the descriptions in Section 2.3 without the membership constraint.

Shared Nearest Neighbor Clustering (SNNC).
The shared nearest neighbor clustering algorithm was proposed by Jarvis and Patrick, where a link is created between a pair of points p and q if and only if p and q have each other in their k-nearest neighbor lists [46]. This algorithm is an extension of DBSCAN. The basic idea of SNNC is to determine the core points around which clusters of various sizes and shapes are built, without needing to determine their number in advance [47]. Counting the number of points shared by two points p and q in their k-nearest neighbor lists, based on the distance metric, allows us to determine the similarity between them. The greater the number of shared points, the higher the similarity between p and q.
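The shared nearest neighbor similarity can be sketched as follows: a link exists only when p and q appear in each other's k-nearest neighbor lists, and the similarity is the number of neighbors the two lists share. The helper names and the one-dimensional toy data are illustrative.

```python
def knn_lists(dist_matrix, k):
    """k nearest neighbors of every point, excluding the point itself."""
    n = len(dist_matrix)
    return [set(sorted((j for j in range(n) if j != i),
                       key=lambda j: dist_matrix[i][j])[:k])
            for i in range(n)]

def snn_similarity(dist_matrix, k, p, q):
    """Number of shared neighbors, or 0 when p and q are not mutual neighbors."""
    neighbors = knn_lists(dist_matrix, k)
    if p not in neighbors[q] or q not in neighbors[p]:
        return 0
    return len(neighbors[p] & neighbors[q])

# Five points on a line; distances are absolute differences of positions.
positions = [0.0, 1.0, 2.0, 10.0, 11.0]
dist_matrix = [[abs(a - b) for b in positions] for a in positions]
linked = snn_similarity(dist_matrix, 2, 0, 1)
unlinked = snn_similarity(dist_matrix, 2, 0, 3)
```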

Results' Discussion.
In order to apply these approaches to the case study in the comparison analysis, the distance in HCM, FCM, AGNES, DBSCAN, and SNNC is calculated by the trajectory distance defined in our method, with the weight coefficients fixed as determined. The detailed results are shown in Table 5. Table 5 and Figure 7 show the effectiveness evaluation of the different clustering approaches. In Figure 7, the DB index and the CH index describe a similar effect: (1) TC-FDBSCAN is clearly superior to the other methods; (2) AGNES is clearly inferior to the others; (3) HCM and FCM show similar results, with FCM better than HCM; (4) DBSCAN and SNNC show close clustering performance, with SNNC better than DBSCAN. Overall, the proposed method can find a better clustering division and provide high clustering accuracy for large-scale trajectory data in travel pattern analysis.

Conclusion
Clustering taxi trajectories based on similarity measurement is a widely applied way to explore urban travel patterns. This study proposes an improved TC-FDBSCAN to uncover urban travel patterns. The taxi trajectory data collected in Shenzhen city are used to evaluate the clustering results in the case study. The dataset is divided into two parts, workdays and weekends, which are used in the clustering analysis and model comparison. The main findings are as follows. (1) On both workdays and weekends, the trajectories in clusters are mainly distributed on the arterial roads. However, the clustering results show that, on weekends, the range of residents' travel is wider than on workdays. (2) Introducing fuzzy theory into the traditional DBSCAN algorithm can improve clustering performance according to three evaluation indicators. (3) Different attributes of trajectories have different influences on clustering results according to the values of the weight coefficients.
There are still some limitations which need to be addressed in future study. On the one hand, other fuzzy clustering methods need to be studied to reduce the computational complexity of the algorithm. On the other hand, other fuzzy theory such as weight entropy should be combined with the trajectory clustering method. Moreover, the proposed method also needs to be applied in different cities to prove its universality.

Data Availability
The trajectory data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this paper.