Anomalous Trajectory Detection Using Masked Autoregressive Flow Considering Route Choice Probability

Taxis play a critical role in public traffic systems, offering travelers convenient service thanks to their availability across time and space. However, anomalous trajectories such as trip fraud often occur due to greedy drivers. In this study, we propose an anomalous trajectory detection method that incorporates route choice analysis into masked autoregressive flow, named MAFRC-ATD. MAFRC-ATD integrates data-driven and model-based methods. First, we divide the urban traffic network into small grids and represent subtrajectories as sequences of grids. Second, based on the subtrajectories, we employ the MAFRC-ATD model to calculate the anomaly score of each trajectory. Third, according to the anomaly score, we identify the anomalous trajectories and distinguish between intentionally and unintentionally anomalous ones. Finally, we evaluate our method on a real-world dataset from Porto, Portugal. The experiments demonstrate that MAFRC-ATD can effectively discover anomalous trajectories and can identify unintentional detours due to traffic congestion.


Introduction
Taxis provide convenient and available services for a multitude of urban passengers. However, passengers may be victims who are overcharged deliberately by greedy drivers. Two examples are as follows.
Example 1. Many people, mostly tourists, are victims of taxi-driving frauds committed by greedy taxi drivers who overcharge passengers by deliberately taking unnecessary detours [1]. In New York City, nearly 1.1 million dollars have been overcharged due to 21,819 taxicab drivers swindling money out of customers arbitrarily [2].
Example 2. Car hailing combines the Internet and the sharing economy and has fundamentally changed the way people travel. However, car-hailing platforms lower employment requirements and increase effective supply, bringing safety and economic risks. The risk of overcharging, leaking personal information, or even crimes [3] could arise from its strong characteristics of a "stranger economy." Thus, to avoid potential crises, uncovering potentially anomalous behaviors from the historical records of cab drivers is crucial.
To tackle this issue, collecting GPS traces and discovering anomalous trajectories from the sheer volume of data is an effective approach. The rapid advances in big data and machine learning grant us an unprecedented opportunity to detect anomalous trajectories via the trace data of moving objects.
Anomalous trajectory detection has attracted extensive research attention. With the proliferation of deep learning and deep generative models (DGMs), anomalous trajectory detection methods based on deep learning have made remarkable progress in recent years. However, the existing methods usually face the following challenges:
(1) Spatiotemporal information for DGM: as a branch of deep learning, deep generative models are employed to identify anomalous trajectories by reconstruction error. However, most deep generative models [4][5][6][7] have difficulty directly capturing the sequential information of trajectory data.
(2) Unintentional anomalous trajectory: drivers may draw on their experience and take unintentionally anomalous detours to avoid traffic congestion. The experience of drivers is particularly valuable for route recommendation when traffic congestion occurs. However, traditional anomalous trajectory detection methods seldom consider mining this unintentionally anomalous trajectory pattern.
We take an unintentional anomaly as an example. As shown in Figure 1, there are two trajectories denoted as $t_1$ and $t_2$. Obviously, $t_1$ is normal, whereas $t_2$ seems to take a detour. However, the travel time of $t_1$ is similar to the travel time of $t_2$, suggesting that the driver of $t_2$ may have taken a detour to reduce the travel time when traffic congestion or a traffic incident occurred. In real life, this unintentional anomaly is critical to route recommendations under traffic congestion.
To mitigate the above problems, this study proposes a framework that integrates the route choice model into the masked autoregressive flow (MAF), which is one framework of DGMs. Our key insight is that anomalous trajectory detection can be transformed into a distribution estimation task. MAF can capture the sequential information of trajectory data by leveraging the autoregressive mechanism and should assign a low probability density (i.e., likelihood) to anomalous trajectories. In addition, to capture the actual route choice behavior, the route choice model is applied to distinguish unintentional anomalies due to traffic congestion or traffic incidents. When a traffic incident or traffic congestion occurs, people will normally take other routes, which may be longer in distance, to avoid it. Hence, the route choice model can mine the patterns of unintentionally anomalous trajectories through the choice probability of the corresponding routes.
In a nutshell, this study proposes an anomalous trajectory detection method based on masked autoregressive flow considering route choice analysis, named MAFRC-ATD. First, we extract valid trajectories from the historical dataset and split the traffic network into equal-sized grids; hence, subtrajectories can be represented as sequences of grids. Second, we employ the MAFRC-ATD model to calculate the anomaly score of each trajectory through the likelihood of subtrajectories and the choice probability of the corresponding route. Finally, according to the anomaly score, we determine whether each trajectory is anomalous and, if so, which type of anomalous trajectory it belongs to.
Overall, the contributions of this study can be summarized as follows.
(1) We propose an unsupervised framework based on normalizing flow, which takes advantage of the spatiotemporal information and autoregressive characteristics of trajectories.
(2) To explore the unintentional anomalies, we incorporate the route choice model into the normalizing flow. As such, we develop the method MAFRC-ATD, which integrates data-driven and model-based methods. In MAFRC-ATD, the route choice model is exploited to distinguish the unintentionally anomalous trajectories due to traffic congestion or traffic incidents.
(3) We evaluate the proposed method on a real-life dataset from Porto, Portugal. The results show that the MAFRC-ATD model can effectively discover anomalous trajectories and successfully discover unintentional trajectories due to traffic congestion.
This study is organized as follows. Section 2 briefly reviews the related work. Problem formulation and proposed methodology are introduced in Section 3. In Section 4, we evaluate our model with a real-life dataset and analyze the results. Section 5 concludes this study and outlines the future scope.

Related Work
In this section, we give a brief literature review on anomalous trajectory detection, probability density estimation with the flow-based model, and the route choice model.

Anomalous Trajectory Detection.
Trajectory anomaly detection methods can be divided into statistic-based, cluster-based, grid-based, classification-based, and deep learning methods.

Statistic-Based Method.
With the help of popular routes, Zhu et al. [8] proposed a time-dependent popular routes-based trajectory outlier detection method (outlier detection is also referred to as anomaly detection in this study), considering both spatial and temporal anomalies simultaneously. Liu et al. [9] found that top drivers were better at finding the fastest route than ordinary drivers. However, these algorithms mainly focus on the geometric information of trajectories and neglect semantic information. Thus, Qin et al. [10] proposed the ProbDetect model, which transforms the anomaly detection problem into a route choice problem considering semantic information such as trip time and fare, as well as traffic variability. In addition, Kong et al. [11] employed a statistical model to calculate the distribution of travel time, driving distance, and travel cost, making it possible to explore the reason for outlier trajectories.

Cluster-Based Method.
Knorr and Ng [12,13] and Knorr et al. [14] applied the distance-based algorithm to detect trajectory anomalies in multidimensional multiscale space. However, "few" anomalies may be averaged over the whole trajectories' dataset. Based on subtrajectories, Lee et al. [15,16] proposed TRAOD, which includes two phases: partitioning a trajectory into a set of line segments and then detecting outlying line segments. In addition, Bu et al. [17] and Yu et al. [18] built local clusters and neighbor-based trajectory outliers, respectively, to discover the outlier of trajectory streams. In contrast to the above distance-based mechanism, Wang et al. [19] proposed an anomalous trajectory detection method based on edit distance and hierarchical clustering, which was employed to determine the number of clusters.

Grid-Based Method.
Ge et al. [20] proposed an algorithm Top-Eye, where the continuous data space was discretized into grids. Later, Ge et al. [21] applied travel route evidence and driving distance evidence to the trip fraud detection system, finding that the fraudulent drivers do earn more income than drivers without scams. Additionally, Kong et al. [22] proposed the LoTAD method to mine anomalous regions, where the center of each region and the region's boundary were determined by the improved k-means method and the Voronoi graph while conforming to traffic conditions. Wang et al. [23] proposed a detection method based on the difference and intersection set (DIS) distance.
Inspired by the isolation mechanism [24] that the anomalous trajectories are susceptible to being isolated, iBAT [25] was proposed to identify the anomalous trajectories. Based on iBAT, Chen et al. [26] and Sun et al. [27] proposed the online model iBOAT and application, respectively.

Classification Method.
Motion-alert algorithm [28,29] leveraged the support vector machine (SVM) to learn the features of the extracted pattern motif, which are a sequence of motion features related to time and location. Similarly, Piciarelli and Foresti [30] employed a single-class SVM, and then, Piciarelli et al. [31] introduced the clustering method to improve the robustness.

Deep Learning.
With the development of neural networks, some researchers have employed deep learning methods, such as recurrent neural networks (RNN) [32,33], to discover anomalies. Generative models have been widely employed to discover anomalous trajectories. The GM-VSAE method [34] and the IGMM-GAN method [4], for instance, can be applied to uncover outliers in the latent space. In addition, Dias et al. [35] proposed a method using a normalizing flow (NF) scheme, which discovers anomalies more effectively than the GMM method. However, it might be unreasonable to directly flatten multidimensional trajectory data into one dimension and impractical to evaluate the model on bus data, since buses might take bypass routes as well.

Probability Density Estimation with the Flow-Based Model.
Many tasks in machine learning, including anomalous trajectory detection, can be formulated as learning and manipulating density distributions. The problem of modeling a probability distribution from a set of given samples is a major goal of probabilistic unsupervised learning or generative modeling [36]. A generative model ought to assign low likelihoods to samples from different distributions, which makes it attractive for anomaly detection.
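The idea that a density model assigns low likelihood to out-of-distribution samples can be illustrated with a deliberately simple sketch. It fits a one-dimensional Gaussian rather than a flow, and the trip lengths below are made-up values used only for illustration; the function names are hypothetical.

```python
import math
import statistics


def fit_gaussian(samples):
    """Estimate the mean and standard deviation from normal training samples."""
    return statistics.fmean(samples), statistics.stdev(samples)


def log_likelihood(x, mean, std):
    """Log density of x under N(mean, std^2)."""
    z = (x - mean) / std
    return -0.5 * z * z - math.log(std) - 0.5 * math.log(2.0 * math.pi)


# Hypothetical trip lengths (km) for one OD pair; the values are invented.
train = [5.0, 5.2, 4.9, 5.1, 5.3, 4.8, 5.0]
mean, std = fit_gaussian(train)

typical_ll = log_likelihood(5.1, mean, std)  # a trip close to the training data
detour_ll = log_likelihood(9.0, mean, std)   # a much longer trip
```

A flow-based model plays the same role as `log_likelihood` here, but over high-dimensional sequential data instead of a single scalar.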
Despite its wide application nowadays, the autoregressive architecture was overlooked in previous work [49] due to the huge cost of high dimensionality. To alleviate this problem, several approaches have been proposed, including the recurrent mechanism, the masked mechanism, and the coupling layer. The recurrent mechanism [50] shares parameters by jointly using an RNN. The main drawback of the RNN-based mechanism is that the internal state of the RNN is calculated sequentially, with higher computational complexity than the other two mechanisms. The masked mechanism [51] uses a feedforward neural network and removes appropriate connections by multiplying by a binary matrix. With the masked mechanism, MAF is efficient to evaluate but slower to invert, indicating that MAF is well suited to density estimation. Coupling layers [42] split the input into two disjoint partitions: the first part stays the same as the input and the second part undergoes an affine transformation. They are equally fast at evaluating the probability density and at inverting for sampling (sampling from the model and evaluating the model's density are the two mutually inverse operations of a normalizing flow model).
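The coupling layer described above can be sketched in a few lines. This is a minimal illustration, not the architecture of [42]: `toy_scale_shift` is a hypothetical stand-in for the neural network that would normally produce the log-scale and shift from the untouched half of the input.

```python
import math


def coupling_forward(x, scale_shift):
    """Affine coupling: keep the first part, affinely transform the second."""
    k = len(x) // 2
    x1, x2 = x[:k], x[k:]
    log_s, t = scale_shift(x1, len(x2))  # both depend on x1 only
    y2 = [b * math.exp(ls) + tt for b, ls, tt in zip(x2, log_s, t)]
    log_det = sum(log_s)  # log|det J| of the forward map
    return x1 + y2, log_det


def coupling_inverse(y, scale_shift):
    """Invert the coupling; costs the same single conditioner call as forward."""
    k = len(y) // 2
    y1, y2 = y[:k], y[k:]
    log_s, t = scale_shift(y1, len(y2))
    x2 = [(b - tt) * math.exp(-ls) for b, ls, tt in zip(y2, log_s, t)]
    return y1 + x2


def toy_scale_shift(x1, n_out):
    """Hypothetical conditioner standing in for a neural network."""
    total = sum(x1)
    return [0.1 * total] * n_out, [total - 1.0] * n_out


x = [0.5, -1.0, 2.0, 0.3]
y, log_det = coupling_forward(x, toy_scale_shift)
x_back = coupling_inverse(y, toy_scale_shift)
```

Because the first part passes through unchanged, the inverse can recompute the same scale and shift in one pass, which is why evaluation and sampling cost the same for this mechanism.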

Route Choice Model.
Route choice modeling is essential to evaluate travelers' choice perception within a set of alternative routes, helping analyze and understand travelers' behavior [52]. Generally, a route choice model has two stages [53]: first is the choice set of possible alternative routes and second is the choice probability of a path, given the route choice set [54,55]. Route choice models mainly include logit structures [56], GEV structures [57], and non-GEV structures [58].

Problem Formulation
Problem formulation is as follows.

Problem Definition and Framework.
The problem definition and framework are described in the following sections. According to [59], the goal of anomaly detection is to find data patterns that differ from what is expected. In our work, such unexpected patterns are trajectories that differ sufficiently from the majority, which is considered normal; detours are one example. The definitions used in this anomalous trajectory detection problem are as follows.

Definition 1 (trajectory).
A raw trajectory $T = \langle q_1, q_2, \ldots, q_n \rangle$ is a sequence of records, and each record $q_i$ is represented by $(\mathrm{lon}_i, \mathrm{lat}_i)$, a pair of longitude and latitude coordinates. $q_1$ and $q_n$ are the origin and destination of the trajectory $T$, respectively.
Definition 2 (anomaly). An anomaly is data that differ so much from the other data in the dataset that the deviation is assumed to be caused not by random factors but by a completely different mechanism [60].
Definition 3 (anomalous trajectory). Given an OD pair, we assume that there is a set of trajectories whose pick-up and drop-off locations are the origin and destination, respectively. Let $D = \{T_1, T_2, \ldots, T_M\}$ denote this set of trajectories; an anomalous trajectory is one that is significantly different from the majority in this OD pair.
Definition 4 (unintentionally anomalous trajectory). The unintentionally anomalous trajectory is an anomalous trajectory in a spatiotemporal pattern, which is caused by travelers' unintentional detours to avoid traffic congestion.

Overview of the Methodology.
To achieve the goal of anomalous trajectory detection, we propose the MAFRC-ATD model, and the methodology is shown in Figure 2.
In detail, we first preprocess the raw trajectories with the following procedures: (1) removing the invalid trajectories, (2) discretizing the traffic network into equal-sized grids (that is, grid mapping), (3) extracting all traces crossing the same OD cell pairs, and (4) partitioning trajectories into subtrajectories.
Based on the subtrajectories, we employ MAFRC-ATD to discover anomalies. The anomaly score of each trajectory can be computed through the likelihood of subtrajectories and the choice probability of the corresponding route.
Finally, according to the anomaly score, we determine whether each trajectory is anomalous and, if so, which type of anomalous trajectory it belongs to.
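Steps (1) and (3) of the preprocessing can be sketched as follows. This is an illustrative sketch only: the helper names are hypothetical, a trajectory is assumed to be a plain list, and "invalid" is reduced here to "too few records", with the minimum length left as a parameter.

```python
from collections import defaultdict


def filter_valid(trajectories, min_points=20):
    """Drop trajectories that have fewer than min_points records."""
    return [t for t in trajectories if len(t) >= min_points]


def group_by_od(grid_trajectories):
    """Group grid-cell sequences by their (origin cell, destination cell)."""
    groups = defaultdict(list)
    for cells in grid_trajectories:
        groups[(cells[0], cells[-1])].append(cells)
    return dict(groups)


# Toy grid sequences; a small min_points keeps the example short.
trajs = [[1, 2, 3], [1, 5, 3], [4, 2]]
valid = filter_valid(trajs, min_points=3)
by_od = group_by_od(valid)
```

Grouping by the first and last cell is one simple way to collect all traces that share an OD cell pair before they are compared with each other.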

Data Preprocessing.
As stated in Section 1, data preprocessing mainly includes two phases: grid mapping and partitioning trajectories into subtrajectories. As shown in Figures 3(a) and 3(b), grid mapping splits the city traffic network into small grids. In detail, for given $N_{lon}$ and $N_{lat}$, the urban traffic network can be split into $N_{lon} \times N_{lat}$ equal-sized grids. Consequently, each taxi trajectory $T_k = \langle q_{k,1}, q_{k,2}, \ldots, q_{k,i}, \ldots, q_{k,n} \rangle$ can be represented as a sequence of traversed grids $T_k = \langle t_1, t_2, \ldots, t_i, \ldots, t_n \rangle$, where $t_i = g(q_{k,i}, N_{lon}, N_{lat})$ and $g$ is a grid mapping function.
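One possible form of the grid mapping function $g$ is sketched below. The bounding box `bbox` of the study area and the row-major cell numbering are assumptions made for illustration; the text above only fixes the inputs $q_{k,i}$, $N_{lon}$, and $N_{lat}$.

```python
def grid_index(lon, lat, bbox, n_lon, n_lat):
    """Map one GPS point to a cell id in an n_lon x n_lat partition of bbox.

    bbox = (lon_min, lat_min, lon_max, lat_max) is an assumed study-area box.
    Cells are numbered row by row; points outside bbox are clamped to the edge.
    """
    lon_min, lat_min, lon_max, lat_max = bbox
    col = int((lon - lon_min) / (lon_max - lon_min) * n_lon)
    row = int((lat - lat_min) / (lat_max - lat_min) * n_lat)
    col = min(max(col, 0), n_lon - 1)
    row = min(max(row, 0), n_lat - 1)
    return row * n_lon + col


def to_grid_sequence(points, bbox, n_lon, n_lat):
    """Replace every (lon, lat) record by its cell id, keeping the order."""
    return [grid_index(lon, lat, bbox, n_lon, n_lat) for lon, lat in points]


# Toy coordinates on a 10 x 10 grid over the box [0, 10] x [0, 10].
bbox = (0.0, 0.0, 10.0, 10.0)
cells = to_grid_sequence([(0.5, 0.5), (1.5, 0.5), (1.5, 1.5)], bbox, 10, 10)
```

Each record maps to exactly one cell, so the grid sequence has the same length $n$ as the raw trajectory.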

Partitioning Trajectories.
As discussed by Lee et al. [15], the difference between anomalies and normal trajectories might be averaged out over the whole trajectory. To alleviate this problem, trajectories are segmented into subtrajectories. A trajectory $T_k = \langle t_1, t_2, \ldots, t_i, \ldots, t_n \rangle$ can be partitioned into equal $d$-length subtrajectories as follows: $\Omega(T_k): \{s_1, s_2, \ldots, s_t, \ldots, s_l\}$, where $\Omega(T_k)$ is a set of continuous subtrajectories and the last subtrajectory $s_l$ can be completed with the last $d$ GPS points. Thus, we can obtain the set of subtrajectories $\Omega(T)$ for this OD.
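The partitioning step can be sketched as below. The handling of the tail follows one reading of the text, namely that a short final piece is replaced by the last $d$ points of the trajectory (so it overlaps its predecessor); this reading, and the function name, are assumptions.

```python
def partition(cells, d):
    """Split a grid sequence into consecutive d-length subtrajectories.

    If the length is not a multiple of d, the last piece is the final d cells
    of the sequence. Sequences shorter than d yield no subtrajectory.
    """
    if len(cells) < d:
        return []
    subs = [cells[i:i + d] for i in range(0, len(cells) - d + 1, d)]
    if len(cells) % d != 0:
        subs.append(cells[-d:])
    return subs


subs = partition(list(range(7)), 3)
```

Every subtrajectory produced this way has exactly $d$ entries, which is what a fixed-dimension density model needs as input.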

Masked Autoregressive Flow.
As stated in Section 1, the anomalous trajectory detection problem can be formulated as a probability density estimation problem. NF ought to assign low likelihoods to data from different distributions, making it possible to discover anomalies through low-likelihood predictions. MAF is applied to detect anomalous trajectories as follows.
NF pushes an initial density (such as a normal distribution) through a series of transformations to produce a richer distribution with more expressive power than the constituent components of these transformations [51]. Assuming that $s$ is a $d$-length subtrajectory and $u$ is the $d$-dimensional random variable of the base distribution $p_u(u)$, $s$ can be represented through the transformation $F$:
$$s = F(u),$$
where $u \sim p_u(u)$, and the transformation $F$ must be differentiable and invertible. Then, the density of $s$, i.e., $p(s)$, can be calculated as follows:
$$p(s) = p_u\left(F^{-1}(s)\right)\left|\det J_{F^{-1}}(s)\right|, \tag{3}$$
where the inverse of the differentiable transformation, $F^{-1}$, and the Jacobian determinant $\left|\det J_{F^{-1}}(s)\right|$ can be implemented by MAF.
Owing to the spatiotemporal autoregressive character of $s$, $p(s)$ can be decomposed into a product of conditional densities using the chain rule of probability:
$$p(s) = \prod_{h=1}^{d} p\left(s_h \mid s_{1:h-1}\right).$$
Assume that the conditional distributions $p(s_h \mid s_{1:h-1})$ are parameterized as single Gaussians:
$$p\left(s_h \mid s_{1:h-1}\right) = \mathcal{N}\left(s_h \mid \mu_h, (\exp \alpha_h)^2\right),$$
where $\mu_h$ and $\alpha_h$ are functions of $s_{1:h-1}$ only. Equivalently, $s_h = u_h \exp(\alpha_h) + \mu_h$, where $u_h \sim N(0, 1)$. Thus, the inverse of the transformation $F$ is easy to access as follows:
$$u_h = (s_h - \mu_h)\exp(-\alpha_h), \tag{6}$$
where the absolute determinant of the Jacobian matrix is
$$\left|\det J_{F^{-1}}(s)\right| = \exp\left(-\sum_{h=1}^{d} \alpha_h\right). \tag{8}$$
Therefore, $p(s)$ can be evaluated through (3), where the inverse transformation and its Jacobian determinant can be calculated by (6) and (8).
After calculating the likelihood of subtrajectories, the likelihood of the whole trajectory can be calculated as follows. For any trajectory $T_k$, the likelihood $p(s_t)$ of each corresponding subtrajectory can be obtained via the aforementioned process, and the likelihood of trajectory $T_k$ is defined as the mean value of the $p(s_t)$, i.e.,
$$p(T_k) = \frac{1}{l}\sum_{t=1}^{l} p(s_t),$$
where $t = 1, 2, \ldots, l$.
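The density evaluation of a single autoregressive Gaussian layer can be sketched as follows, assuming the standard MAF parameterization in which $s_h = u_h \exp(\alpha_h) + \mu_h$ with a standard normal base. The conditioner here is a masked linear map rather than a trained neural network: only weights with column index below the row index are used, so that $\mu_h$ and $\alpha_h$ depend on $s_1, \ldots, s_{h-1}$ only. The weight values and function names are hypothetical.

```python
import math


def maf_log_density(s, w_mu, w_alpha):
    """log p(s) for one autoregressive Gaussian layer with base N(0, I)."""
    log_p = 0.0
    for h in range(len(s)):
        # Masking: row h only reads the entries s[0..h-1].
        mu = sum(w_mu[h][j] * s[j] for j in range(h))
        alpha = sum(w_alpha[h][j] * s[j] for j in range(h))
        u = (s[h] - mu) * math.exp(-alpha)  # inverse transformation
        log_p += -0.5 * u * u - 0.5 * math.log(2.0 * math.pi)  # log N(u; 0, 1)
        log_p -= alpha  # contribution to log|det J| of the inverse map
    return log_p


def maf_forward(u, w_mu, w_alpha):
    """Compute s = F(u) one dimension at a time (the slow direction)."""
    s = []
    for h in range(len(u)):
        mu = sum(w_mu[h][j] * s[j] for j in range(h))
        alpha = sum(w_alpha[h][j] * s[j] for j in range(h))
        s.append(u[h] * math.exp(alpha) + mu)
    return s


def trajectory_likelihood(subtrajectories, w_mu, w_alpha):
    """Mean of p(s_t) over the subtrajectories of one trajectory."""
    dens = [math.exp(maf_log_density(s, w_mu, w_alpha)) for s in subtrajectories]
    return sum(dens) / len(dens)


# Hypothetical 2-dimensional example with hand-picked weights.
w_mu = [[0.0, 0.0], [0.5, 0.0]]
w_alpha = [[0.0, 0.0], [0.2, 0.0]]
u = [0.3, -1.2]
s = maf_forward(u, w_mu, w_alpha)
log_p = maf_log_density(s, w_mu, w_alpha)
expected = -0.5 * (0.3 ** 2 + 1.2 ** 2) - math.log(2.0 * math.pi) - 0.06
```

In the density direction every $\mu_h$ and $\alpha_h$ can be computed from the observed $s$ at once, whereas `maf_forward` must proceed sequentially; this asymmetry is why the masked mechanism suits density estimation. Real subtrajectories would first need a numeric encoding, which this sketch leaves open.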

Route Choice Analysis.
As mentioned above, we detect anomalous trajectories with the MAF method, mostly using geometrical and sequential information. However, in the real world, anomalous trajectories may occur due to many factors, such as travel time and traffic conditions. As described in Section 1, detours can be unintentionally generated by drivers who intend to avoid traffic congestion and reduce trip time during peak hours. It is therefore crucial to explore the route choice motivation behind anomalous behaviors. However, modeling the motivation behind the choice behavior could be problematic due to the complicated traffic conditions of the road network. The impact of traffic control regulation at an intersection on the Network Macroscopic Fundamental Diagram (NMFD) [62,63], for example, could also influence the aggregate traffic conditions of the road network. In this study, following Qin et al. [10], we adopt a random utility model based on a multivariate Gaussian model to explore the motivation behind the random choice behaviors.
Assume that there is a set of trajectories $D = \{T_1, T_2, \ldots, T_M\}$ for each OD pair, and that $C = \{C_1, C_2, \ldots, C_m\}$ is the set of choice routes, i.e., all distinct routes that the trajectories of this OD pass through. For any route $C_j$ in the route choice set $C$, $L_{C_j}$ and $T_{C_j}$ are its travel length and trip time, and the generalized cost is
$$G_{C_j} = L_{C_j} + c\,T_{C_j},$$
where $c$ is the velocity equivalent coefficient, which can be interpreted as the trip length value for a unit of trip time. The choice probability of the route $C_j$ can be defined in a vectorized form:
$$p_{C_j} = P(Z \le 0),$$
where $Z$ is the difference of generalized cost, i.e., $Z = (G_{C_j} - G_{C_1}, \ldots, G_{C_j} - G_{C_m})^T$ with the term for $C_j$ itself omitted, and $Z$ and $0$ are both $\mathbb{R}^{(m-1) \times 1}$ vectors. Thus, $p_{C_j}$ can be calculated by the integral of the probability density function for $Z$:
$$p_{C_j} = \int_{Z \le 0} \frac{1}{\sqrt{(2\pi)^{m-1}\left|\Sigma_Z\right|}} \exp\left(-\frac{1}{2}(Z - \mu_Z)^T \Sigma_Z^{-1} (Z - \mu_Z)\right) dZ,$$
where $\mu_Z$ and $\Sigma_Z$ are the mean vector and covariance matrix of $Z$. To calculate $\mu_Z$ and $\Sigma_Z$, $Z$ is further decomposed so that $\mu_Z$ and $\Sigma_Z$ can be derived from $\mu_U$ and $\Sigma_U$. The entries of $\Sigma_U$ are expressed through $\sigma^2_{C_{i \cap j}}$, the variance of $C_{i \cap j}$, which reflects the overlapped portion of two routes $C_i$ and $C_j$, where $C_i$ and $C_j$ are arbitrary routes in the route set $C$, for $i, j = 1, 2, \ldots, m$.
In this way, the choice probability of the route C j can be obtained by equation (12) by substituting equations (15) and (17) into equation (14).
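The meaning of the choice probability can be illustrated with a small Monte Carlo sketch. Two simplifications are assumed here and are not part of the method above: the generalized cost is taken in the additive form "length plus $c$ times trip time", and route costs are drawn as independent Gaussians, which ignores the covariance between overlapping routes. The route figures and the standard deviation are invented; only $c = 8.91$ m/s comes from the case study.

```python
import random


def generalized_cost(length_m, time_s, c):
    """Assumed additive form: length plus c times trip time (c in m/s)."""
    return length_m + c * time_s


def choice_probabilities(mean_costs, std_costs, n_draws=20000, seed=0):
    """Monte Carlo estimate of P(route j has the lowest generalized cost).

    Costs are sampled as independent Gaussians (a simplifying assumption).
    """
    rng = random.Random(seed)
    wins = [0] * len(mean_costs)
    for _ in range(n_draws):
        draws = [rng.gauss(m, s) for m, s in zip(mean_costs, std_costs)]
        wins[draws.index(min(draws))] += 1
    return [w / n_draws for w in wins]


# Three hypothetical routes given as (length in m, trip time in s).
costs = [
    generalized_cost(5000.0, 600.0, 8.91),
    generalized_cost(6500.0, 650.0, 8.91),
    generalized_cost(9000.0, 1200.0, 8.91),
]
probs = choice_probabilities(costs, [800.0, 800.0, 800.0])
```

A route whose generalized cost is rarely the lowest receives a small choice probability, which is the signal the anomaly score uses to separate plausible detours from implausible ones.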

Anomaly Score.
As described in Section 1 and Section 3.1, the MAFRC-ATD model integrates data-driven learning and the route choice model. The anomaly score is therefore designed to combine the two aspects as follows. The likelihood of a given trajectory and the route choice probability of the given trajectory are calculated in Sections 3.3 and 3.4. For a given OD pair, let $C_j$ be the route corresponding to trajectory $T_k$, and let $p(T_k)$ and $p_{C_j}$ be the likelihood of $T_k$ and the choice probability of route $C_j$, respectively. The anomaly score $E(T_k)$ is defined by (18) in terms of these quantities, where $m$ is the number of all routes for this OD pair and $\xi$ is the coefficient of unintentional anomaly. In (18), $\mathrm{sigmoid}(\cdot)$ is the sigmoid function, i.e., $\mathrm{sigmoid}(x) = 1/(1 + e^{-x})$, which is monotonically increasing with outputs in the range 0 to 1. Obviously, $p_{C_j}$ and $p(T_k)$ of normal trajectories are higher than those of anomalous trajectories.
Assuming that threshold δ is applied to decide whether the trajectory is anomalous, there are three cases.
(1) When $E(T_k) \ge 1$, i.e., $p_{C_j} - (1/m)\xi \le 0$, $T_k$ is likely to be an intentionally anomalous trajectory.
(2) When $\delta \le E(T_k) < 1$, $T_k$ can be categorized as an unintentionally anomalous trajectory.
(3) When $0 < E(T_k) < \delta$, $T_k$ can be regarded as a normal trajectory.
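The three cases translate directly into a small decision function. The sketch assumes the anomaly score has already been computed and is positive; the function name and the label strings are illustrative.

```python
def classify(score, delta):
    """Label a trajectory from its anomaly score E and the threshold delta."""
    if score >= 1.0:
        return "intentionally anomalous"
    if score >= delta:
        return "unintentionally anomalous"
    return "normal"


labels = [classify(e, delta=0.02) for e in (1.3, 0.5, 0.01)]
```

Checking the upper bound first keeps the three cases mutually exclusive, provided that $\delta < 1$.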

Case Study
In this section, we will evaluate our MAFRC-ATD model using the real-world dataset and present the results as well as discussions of experiments.

Dataset Preprocessing.
Our experiments are conducted on the real-world taxi trajectory dataset provided by the Kaggle ECML-PKDD 2015 competition (https://www.kaggle.com/c/pkdd-15-predict-taxi-service-trajectory-i/data). In detail, the dataset is generated from 442 taxis running in the city of Porto from 01/07/2013 to 30/06/2014. There are 1710670 raw trajectories in total, and each taxi reports its location every 15 seconds.
Out of approximately 1.7 million trajectories in the original dataset, we remove the trajectories with fewer than 20 GPS points. After filtering, we split the city traffic network space into 100 m × 100 m grids, i.e., $g_l = g_w = 100$ m. Consequently, GPS points can be represented with grids, and each trajectory can be denoted by a sequence of grids. Subsequently, we extract 4 OD pairs, denoted as $OD_1$ to $OD_4$, respectively. In this study, we segment a trajectory into subtrajectories with 20 GPS points, that is, $d = 20$. Since no labeled dataset is available, we manually label anomalies in this paper.
Additionally, inspired by Liu et al. [34], to alleviate the problem of unbalanced learning in anomaly detection, we select several detour paths from which anomalous trajectories are generated. The detour paths, called generated routes, traverse real-world links and are implemented in ArcGIS and Google Maps. Generated anomalous trajectories are obtained by sampling from the GPS points of the generated routes. The generated anomalous trajectories are injected into the original dataset for data augmentation. The related information of the OD pairs is shown in Table 1, where O and D are represented as (longitude, latitude), and each OD pair is shown in Figure 4. Among all testing trajectories, the coral lines represent the original historical trajectories in the testing dataset, whereas the other colors represent the generated routes from which the generated anomalous trajectories are sampled.

Baseline.
In this study, we will evaluate our model compared with the following methods.
iBAT [25]: iBAT exploits an isolation mechanism, i.e., anomalies are easy to separate from the majority of the trajectories since anomalies are "few and different." A higher anomaly score means that a trajectory is more likely to be anomalous.
ATDC [23]: ATDC proposes a distance metric, namely, the DIS distance, to evaluate the similarity between any two trajectories. Based on the DIS distance, ATDC calculates the anomaly score of a trajectory and then determines whether it is an anomalous trajectory.

Evaluation Criteria.
In this study, we employ the area under the receiver operating characteristic curve (AU-ROC) as the evaluation metric. To evaluate the performance of the anomaly detection method, detection rate (DR) and false alarm rate (FAR) (in this study, we refer to true positive rate (TPR) and false positive rate (FPR) as detection rate and false alarm rate respectively) are two important metrics. As shown in the confusion matrix Table 2, trajectories are labeled either as anomalous (i.e., positive) or normal (i.e., negative).
A trajectory will be classified as one of four possible outcomes: (1) true positive (TP): an anomalous trajectory is correctly predicted as anomalous; (2) false positive (FP): a normal trajectory is incorrectly predicted as anomalous; (3) true negative (TN): a normal trajectory is correctly predicted as normal; and (4) false negative (FN): an anomalous trajectory is incorrectly predicted as normal. The DR is defined as the proportion of anomalous trajectories that are predicted as anomalous, i.e., $DR = TP/(TP + FN)$. The FAR is defined as the proportion of normal trajectories that are predicted as anomalous, i.e., $FAR = FP/(FP + TN)$. A better anomaly detection method will have a higher DR and a lower FAR. The ROC curve shows the detection rate (Y-axis) against the false alarm rate (X-axis) at various thresholds, which depicts the tradeoff between false alarms and accurate classifications. The AU-ROC value is the area under the ROC curve, and a higher AU-ROC indicates a better anomaly detection method.
To explore the influence of anomalous trajectory detection on different OD pairs instead of a given OD pair, we conduct experiments on the 4 OD pairs; the results are shown in Table 3. The results indicate that MAFRC-ATD achieves the best results over the 4 OD pairs and has the potential to suit different situations. The experiments indicate that the MAFRC-ATD model could effectively capture the autoregressive characteristics of trajectories and model the probability density of subtrajectories. The weaker results of iBAT and ATDC might arise because those methods are based on distance or isolation mechanisms, mainly focusing on the geometric information of the trajectory and neglecting its sequential information. iBAT is well adapted to identifying various detours but may not be suitable for situations where anomalies are intertwined with normal trajectories ($OD_4$). ATDC performs poorly when local anomalous trajectories exist (such as in $OD_2$ and $OD_3$), but it is applicable to intertwined situations.
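The evaluation metrics can be computed with a few lines of code. The sketch below uses the standard definitions of detection rate (true positive rate) and false alarm rate (false positive rate), and computes AU-ROC through its rank interpretation: the probability that a randomly chosen anomaly receives a higher score than a randomly chosen normal trajectory, with ties counted as one half. The labels and scores are toy values.

```python
def detection_and_false_alarm_rate(labels, scores, threshold):
    """Return (DR, FAR); labels use 1 = anomalous and 0 = normal.

    A trajectory is predicted anomalous when its score is >= threshold.
    """
    pairs = list(zip(labels, scores))
    tp = sum(1 for y, s in pairs if y == 1 and s >= threshold)
    fn = sum(1 for y, s in pairs if y == 1 and s < threshold)
    fp = sum(1 for y, s in pairs if y == 0 and s >= threshold)
    tn = sum(1 for y, s in pairs if y == 0 and s < threshold)
    return tp / (tp + fn), fp / (fp + tn)


def au_roc(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


labels = [1, 1, 0, 0, 0]
scores = [0.9, 0.4, 0.5, 0.2, 0.1]
dr, far = detection_and_false_alarm_rate(labels, scores, threshold=0.45)
auc = au_roc(labels, scores)
```

The pairwise form is quadratic in the number of trajectories, which is acceptable for a per-OD evaluation of modest size; a sort-based computation would be preferable for large test sets.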

Unintentionally Anomalous Trajectory Analysis.
In addition to the performance of anomalous trajectories detection, discovering unintentionally anomalous trajectories is also essential. Route choice analysis is employed to discover the unintentional anomalies as follows.
Taking $OD_1$ as an example, there are 10 routes for this OD pair, denoted as $r_1$ to $r_{10}$, respectively, i.e., $m = 10$ and $C = \{r_j\}$, for $j = 1, 2, \ldots, 10$. The trajectory set as well as the link information for each route regarding $OD_1$ is shown in Table 4. Among all routes, $r_1$ contains the largest number of trajectories. For the sake of convenience, "(. . .)" in $r_1$ represents the trajectories of $OD_1$ other than those contained in $r_2$-$r_{10}$. ?? shows each route in $OD_1$.
In this study, we set the velocity equivalent coefficient $c$ to 8.91 m/s according to the mean speed of historical trips. For any given link in $OD_1$, the generalized cost of each route is calculated as the mean value of the generalized costs of all trajectories corresponding to this route.
When $\delta$ is set to 0.02 and $\xi$ is set to 1.5, experiments show that the trajectories corresponding to $C_n = \{C_1, C_2, C_3\}$ are normal, whereas the trajectories corresponding to $C_{an} = \{C_j\}$, for $j = 4, 5, \ldots, 10$, are anomalous. Among all anomalous trajectories, the trajectory corresponding to $C_{un} = \{r_6\}$, i.e., the 82nd trajectory, is unintentionally anomalous, whereas the trajectories corresponding to $C_{an} \setminus \{r_6\}$ are intentionally anomalous. We find that the 82nd trajectory occurred during the evening peak hours on a Tuesday. The choice probability of the 82nd trajectory is large enough, i.e., greater than $(1/m)\xi$, to make the anomaly score less than 1 although it is greater than $\delta$. Hence, we have sufficient confidence that the route of the 82nd trajectory (i.e., $r_6$) has strong potential to be chosen for this trip. This result indicates that the 82nd trajectory could have been unintentionally generated by a driver intending to avoid traffic congestion and reduce trip time during peak hours.
The results agree well with the route choice probability result, as shown in Figure 6.
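As a quick check of the threshold used in this analysis, and taking the first case of the anomaly score rule at face value ($E(T_k) \ge 1$ exactly when $p_{C_j} - (1/m)\xi \le 0$), the reported settings $m = 10$ and $\xi = 1.5$ give the following choice probability cutoff.

```latex
\[
\frac{\xi}{m} = \frac{1.5}{10} = 0.15 .
\]
```

Under this reading, a route needs a choice probability above 0.15, i.e., 1.5 times the uniform share $1/m = 0.1$, for its trajectories to avoid the intentionally anomalous case.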

Conclusion
Anomalous trajectory detection is crucial for preserving the legitimate interests of passengers and discovering the pattern of anomalous driving behaviors. In this study, we propose an unsupervised detection framework, namely, MAFRC-ATD, to detect anomalous trajectories. Compared to traditional methods, MAFRC-ATD offers the following:
(1) Taking the spatiotemporal information of trajectory data into consideration, the MAFRC-ATD model exploits the autoregressive character of trajectories and effectively captures their spatiotemporal information, making it possible to assign lower likelihoods to anomalous trajectories.
(2) Incorporating the route choice model to discover the unintentional anomalies due to traffic congestion, route choice analysis is employed to further explore the motivation behind random choice behaviors, such as traffic congestion.
Moreover, multi-OD experiments are also conducted to evaluate the generalization properties of the MAFRC-ATD model. Experiments on the real-world dataset demonstrate the effectiveness of our method over the baseline methods. Furthermore, testing the proposed method with more real-world datasets is crucial to evaluate the applicability and robustness of the method, which is left for our future study.
In the future, we will extend our model to the online setting with time-series methods, such as the self-attention mechanism. In addition, an ensembling-based NF model could perform better in alleviating the imbalanced learning problem.

Conflicts of Interest
The authors declare no conflicts of interest.