Semantic Trajectory Frequent Pattern Mining Method with Fuzzy Stay Time Constraint

In a security system, transforming the large number of collected target trajectories into semantic trajectories of smaller volume and higher quality and mining their frequent patterns helps to analyze target behavior patterns, identify hazard sources, and strengthen the internal prevention and control of the security system. To address the limitations, in practical application scenarios, of semantic trajectory frequent pattern mining methods defined by precise stay time, a fuzzy semantic trajectory frequent pattern mining method is proposed. Firstly, the membership function of fuzzy stay time is defined, so that the stay time of the target at each stay point is fuzzified and the fuzzy semantic trajectory is obtained. Then, a fuzzy semantic trajectory frequent pattern mining algorithm, FST-FPM (fuzzy semantic trajectory frequent pattern mining), is proposed. The FST-FPM algorithm is experimentally verified on the Geolife public dataset and a self-collected RFID positioning dataset. The experimental results show that the FST-FPM algorithm can mine frequent patterns of fuzzy semantic trajectories on both datasets, and that its running time is reduced by more than 10% compared with the classical PrefixSpan algorithm, the PrefixSpan-x algorithm, and the LFFT2 algorithm.


Introduction
In security systems, improving the protection capability of key facilities has always been a research hotspot [1]. Studying the moving track of a target and its stay time at key points in the protection area plays an important role in mining the behavior characteristics of the target and evaluating its threat degree. The moving track of the target in the protection area can be obtained through RFID positioning technology [2,3]. Analyzing the target trajectory and mining abnormal trajectories helps to analyze the target behavior pattern [4,5]. Mining the frequent patterns of the target's moving track is of great significance for analyzing the target's intention, identifying hazard sources, and identifying security vulnerabilities. Transforming the original target trajectory data into semantic trajectories of smaller volume and higher quality is conducive to mining target behavior patterns [6].

Related Work.
As semantic trajectory frequent pattern mining has become a research hotspot, scholars have proposed various methods to mine frequent patterns in different scenarios. Cai et al. [7] constructed original trajectories from photos with geographical location tags, enriched them through semantic annotation, and proposed a density-based similarity measurement method to mine semantic trajectory frequent patterns, yielding more detailed spatial frequent patterns. Lior et al. [8] proposed a new multi-objective mining algorithm that uses a small amount of memory to quickly mine a given itemset by mining the support of multiple prespecified itemsets in the FP-Tree. Davashi et al. [9] proposed an efficient method for mining uncertain frequent patterns based on a new upper bound that constrains the expected support. By tightening this upper bound and pruning the infrequent binomial sets and their supersets, the mining effect is improved. Han et al. [10] proposed PrefixSpan, a frequent pattern mining algorithm based on projection databases. By constructing projection databases and recursively mining the frequent subsets in them, the algorithm needs to scan the database only once and generates no candidate itemsets, so it has higher mining efficiency. Many scholars have improved the PrefixSpan algorithm to mine frequent patterns more effectively. Tahereh et al. [11] proposed an improved PrefixSpan algorithm that, when constructing the projection database, adds a filter to each prefix to delete sub-patterns and outputs only the largest frequent patterns. It also sets a minimum pattern length threshold to filter out short patterns in the alarm data, which gives the algorithm high performance. Xue et al. [12] proposed the PrefixSpan-x algorithm, which uses AC automata to optimize the search for frequent sequences and deletes the frequent sequences that do not meet the minimum length threshold when constructing the projection database, reducing the time and space complexity of the algorithm.
Wu et al. [13] proposed LFFT2, a list-based multiple fuzzy frequent itemset mining algorithm. The algorithm is based on type-2 fuzzy set theory and can retrieve multiple fuzzy frequent itemsets. It is superior to the traditional Apriori-based method in terms of execution time and the number of nodes in the search space. Chen et al. [14] proposed a fuzzy association rule mining method with type-2 membership functions. It first transforms the quantitative values in the transactions into type-2 fuzzy values. Then, according to the predefined number of splitting points, these are reduced to type-1 fuzzy values. Finally, fuzzy association rules are derived from these fuzzy values. Lin et al. [15] proposed an incremental multiple fuzzy frequent itemset mining algorithm based on transaction insertion (IMF-INS) to efficiently update the multiple fuzzy frequent itemsets of quantitative datasets. IMF-INS uses the fuzzy FUP concept to divide the converted linguistic terms into four cases, and in each case the discovered information is updated by a designed procedure. In addition, a fuzzy list structure is used to reduce candidate generation and to avoid multiple database scans. Experimental results show that the algorithm performs well in runtime, memory consumption, number of determined patterns, and scalability, and that IMF-INS outperforms the state-of-the-art batch mining approach.
Li et al. [16] proposed a single-user trajectory frequent pattern mining method based on location semantics. The semantic trajectory is obtained by reverse geocoding, and the optimal candidate frequent location itemset is obtained. Then, the iterative calculation of frequent long itemsets is transformed into rule operations on hierarchical sets by using the intersection of spatiotemporal sequences and a divide-and-conquer merging method, and the supersets and subsets of frequent sequences are obtained. Liu et al. [17] analyzed 627 articles on Pythagorean fuzzy sets (PFS) indexed in Web of Science from 2013 to 2020, providing a comprehensive analysis for researchers in this field and a preliminary understanding of PFS.

Our Contribution.
In the target behavior analysis scenario, semantic trajectory frequent pattern mining methods defined by precise stay time have serious limitations. On the one hand, target behaviors whose stay times fall in the same time range can represent the same pattern; on the other hand, using the precise stay time imposes constraints that are too strong and yields too few frequent patterns. To solve this problem, this paper proposes a fuzzy semantic trajectory frequent pattern mining method based on the moving trajectory and stay time of the target in the security environment. Firstly, based on fuzzy set theory, the stay time is fuzzified and the fuzzy semantic trajectory is obtained. Then, a fuzzy semantic trajectory frequent pattern mining algorithm, FST-FPM, is proposed. Finally, the effectiveness of the FST-FPM algorithm is verified on real datasets.

Organization.
The rest of this paper is organized as follows. Section 2 gives the problem definition and formal description, and introduces the definition of fuzzy semantic trajectory frequent pattern mining. Section 3 presents the fuzzy semantic trajectory frequent pattern mining method, including the definition of the fuzzy stay time membership function, the stay point extraction method, and the fuzzy semantic trajectory construction method. Section 4 describes the FST-FPM algorithm, with its detailed steps and pseudocode. Section 5 presents the experimental results and analyzes them. Section 6 concludes the paper and summarizes its work and contributions.

Problem Definition and Formal Description
After the stay points are extracted from the original trajectory set traj, it is transformed into the semantic trajectory ST, and fuzzy stay time processing is introduced to obtain the semantic trajectory with fuzzy stay time constraint, FST. Then, the FST-FPM algorithm proposed in this paper is used for frequent pattern mining. Specifically, the stay time of personnel in a specific area is fuzzified into several levels through fuzzy set theory, and the membership s and support P of the fuzzy stay time are calculated. Then, given a minimum support threshold σ and a fuzzy stay time membership threshold ρ, all frequent patterns in the FST with support P > σ and fuzzy stay time membership s > ρ are found, which yields the semantic trajectory frequent patterns with fuzzy stay time constraint, FTFP.
In this paper, the related concepts used in mining semantic trajectory frequent patterns with fuzzy stay time constraints are defined as follows.
Definition 1. The target track is denoted as traj, and its form is as follows:
$$traj = \{oid, [Lat, Lon, Date, Time], \ldots\},$$
where oid is the person ID, Lat is the decimal latitude, Lon is the decimal longitude, Date is the date, and Time is the time when the person arrives at the point.
Definition 2. If the stay time of the target in a certain area is greater than or equal to the threshold ω, the area is called a stay point and is denoted as R, $R = (loc, t_s, t_e)$, where loc represents the geographical area number, $t_s$ represents the time of entering the geographical area, and $t_e$ represents the time of leaving the geographical area.
Definition 3. The time difference between the target entering and leaving a certain area is called the stay time in the area, which is denoted as Δt, $\Delta t = t_e - t_s$.
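Definitions 2 and 3 can be illustrated with a minimal Python sketch. The class name `StayPoint`, the helper `is_stay_point`, and the choice of minutes as the unit are assumptions made for illustration; the times are those of the first stay point in the Table 2 example (area 1, 09:00 to 11:30), the date is illustrative, and the 5-minute threshold is the stay time threshold used in the experiments.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class StayPoint:
    """A stay point R = (loc, t_s, t_e) as in Definition 2 (hypothetical class name)."""
    loc: int        # geographical area number
    t_s: datetime   # time of entering the area
    t_e: datetime   # time of leaving the area

    def stay_time(self) -> float:
        """Stay time Δt = t_e - t_s, expressed in minutes (Definition 3)."""
        return (self.t_e - self.t_s).total_seconds() / 60.0


def is_stay_point(candidate: StayPoint, omega: float) -> bool:
    """Definition 2: the area is a stay point when Δt >= ω (ω in minutes)."""
    return candidate.stay_time() >= omega


# Target stays in area 1 from 09:00 to 11:30 (the date is illustrative).
r = StayPoint(loc=1,
              t_s=datetime(2021, 4, 1, 9, 0),
              t_e=datetime(2021, 4, 1, 11, 30))
print(r.stay_time())          # 150.0
print(is_stay_point(r, 5.0))  # True
```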
Definition 4. A sequence consisting of n stay points is called the target semantic trajectory, which is denoted as ST, and its data form is
$$ST = \{oid, R_1, R_2, \ldots, R_n\},$$
where oid represents the personnel ID and $R_i$ represents the ith stay point, $i = 1, 2, \ldots, n$.
Definition 6. Let the stay time fuzzy sets be $A_1, A_2, \ldots, A_p$. The membership degrees of the stay time Δt on the fuzzy sets $A_j$ form a p-ary set, denoted as S(Δt), whose form is
$$S(\Delta t) = \{\mu_{A_1}(\Delta t), \mu_{A_2}(\Delta t), \ldots, \mu_{A_p}(\Delta t)\},$$
where $\mu_{A_j}(\Delta t)$ represents the degree to which Δt belongs to the fuzzy set $A_j$, $j = 1, 2, \ldots, p$. The stay time of the target is fuzzified into p levels, and the fuzzy set $A_j$ represents the jth fuzzy stay time level. The closer $\mu_{A_j}(\Delta t)$ is to 0, the lower the degree to which Δt belongs to $A_j$; the closer $\mu_{A_j}(\Delta t)$ is to 1, the higher the degree to which Δt belongs to $A_j$.
In the fuzzy semantic trajectory FST of Definition 7, given by equation (4), oid represents the target number and $S_i(\Delta t_i)$ represents the fuzzy stay time membership set of the ith stay point $R_i$, $i = 1, 2, \ldots, n$. Let $K = |FST|$; the trajectory is then called a K-term fuzzy semantic trajectory.

Definition 8. Given a fuzzy set $A_j$, a minimum support threshold σ, and a fuzzy stay time membership threshold ρ, if the fuzzy semantic trajectory FST has support $P \ge \sigma$ and fuzzy stay time membership $\mu_{A_j}(\Delta t_i) \ge \rho$, then this FST is called a K-term fuzzy semantic trajectory frequent pattern and is denoted as FTFP, where $K = |FST|$ and $i = 1, 2, \ldots, n$.
According to the above definitions, semantic trajectory frequent pattern mining under the fuzzy stay time constraint is the process of adding the fuzzy stay time membership to the semantic trajectory, which turns it into a semantic trajectory with fuzzy stay time constraint, and then mining the frequent patterns of these trajectories.

Fuzzy Semantic Trace Frequent Pattern Mining Method
According to the definition of the fuzzy semantic trajectory, a method for mining frequent patterns from fuzzy semantic trajectories is proposed. Firstly, the target semantic trajectory ST is obtained by extracting the stay points of the target trajectory. Then, the stay time is fuzzified, and the fuzzy membership degree set S(Δt) and the support P of the target semantic trajectory are calculated. The fuzzy semantic trajectory FST is obtained by adding the fuzzy stay time membership set S(Δt) to the target semantic trajectory. Finally, the proposed fuzzy semantic trajectory frequent pattern mining algorithm is used to mine all fuzzy semantic trajectory frequent patterns FTFP from the fuzzy semantic trajectory dataset.

Calculation Method of Membership Degree of Fuzzy Stay Time.
In the scenario of personnel behavior pattern analysis, effective frequent patterns cannot be mined if the precise stay time of personnel is used. Therefore, the stay time of personnel is fuzzified by fuzzy set theory, and it can be fuzzified into p levels according to expert experience. To obtain the membership degree of fuzzy stay time defined in Definition 6, the membership function of the fuzzy stay time at each level needs to be defined. In this paper, the membership function of fuzzy stay time is divided into three levels: small, medium, and large. The membership function of small fuzzy stay time is defined by formula (5), that of medium fuzzy stay time by formula (6), and that of large fuzzy stay time by formula (7).
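As an illustration of how small, medium, and large membership functions of this kind can be evaluated, the following Python sketch assumes the common piecewise-linear (ramp and trapezoid) shapes. The shapes themselves, the function names `mu_small`, `mu_medium`, and `mu_large`, and the breakpoints used for the medium and large levels are assumptions for illustration and are not taken from formulas (5) to (7); only the breakpoints 10 and 40 minutes for the short level appear in the text.

```python
def mu_small(dt: float, dt1: float, dt2: float) -> float:
    """Assumed 'small' shape: 1 up to dt1, linear descent to 0 at dt2."""
    if dt <= dt1:
        return 1.0
    if dt >= dt2:
        return 0.0
    return (dt2 - dt) / (dt2 - dt1)


def mu_medium(dt: float, a: float, b: float, c: float, d: float) -> float:
    """Assumed 'medium' shape: trapezoid rising on [a, b], 1 on [b, c], falling on [c, d]."""
    if dt <= a or dt >= d:
        return 0.0
    if dt < b:
        return (dt - a) / (b - a)
    if dt <= c:
        return 1.0
    return (d - dt) / (d - c)


def mu_large(dt: float, dt1: float, dt2: float) -> float:
    """Assumed 'large' shape: 0 up to dt1, linear ascent to 1 at dt2."""
    if dt <= dt1:
        return 0.0
    if dt >= dt2:
        return 1.0
    return (dt - dt1) / (dt2 - dt1)


# Stay time of 25 minutes with the short-level breakpoints 10 and 40 minutes;
# the medium and large breakpoints below are hypothetical.
print(mu_small(25, 10, 40))            # 0.5
print(mu_medium(25, 10, 40, 60, 90))   # 0.5
print(mu_large(25, 60, 90))            # 0.0
```

Under these assumed shapes a 25-minute stay belongs equally to the small and medium levels, which is the kind of overlap that a precise stay time constraint cannot express.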

Fuzzy Semantic Trajectory Construction Method.
This section illustrates the fuzzy semantic trajectory construction method. The form of the target track collected in the security system is given in Definition 1. The information of one sampling point includes the target number, the coordinates of the sampling point, and the date and time of the sampling point, for example, traj = {01, [3, 7, 20210401, 12:53:12], [3, 8, ...], ...}. Given a fuzzy set with a short stay time, according to formula (5), the parameters are set to $\Delta t_1 = 10$ and $\Delta t_2 = 40$, in minutes. Given a fuzzy set with a long stay time, the parameter is set according to formula (7), also in minutes.
Thus, the fuzzy stay time membership set of the target semantic trajectory is calculated. Adding the fuzzy stay time membership to the target semantic trajectory yields the fuzzy semantic trajectory defined in Definition 7.
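The construction step can be sketched in Python as follows. This is a minimal illustration rather than the implementation used in the experiments: the function names `falling`, `rising`, and `build_fst`, the linear ramp shapes, the breakpoints 60 and 90 minutes for the long level, and the two stay points are all assumptions; only the short-level parameters 10 and 40 minutes come from the text.

```python
from typing import Callable, Dict, List, Tuple

# (loc, t_s, t_e): area number plus entry and exit times in minutes since midnight
StayPoint = Tuple[int, float, float]
# One membership function per fuzzy stay time level, keyed by level name
Levels = Dict[str, Callable[[float], float]]


def falling(lo: float, hi: float) -> Callable[[float], float]:
    """Assumed 'short' shape: 1 up to lo, linear descent to 0 at hi."""
    return lambda dt: min(1.0, max(0.0, (hi - dt) / (hi - lo)))


def rising(lo: float, hi: float) -> Callable[[float], float]:
    """Assumed 'long' shape: 0 up to lo, linear ascent to 1 at hi."""
    return lambda dt: min(1.0, max(0.0, (dt - lo) / (hi - lo)))


def build_fst(oid: str, stay_points: List[StayPoint], levels: Levels):
    """Attach the membership set S(Δt) to every stay point of one semantic trajectory."""
    fst = []
    for loc, t_s, t_e in stay_points:
        dt = t_e - t_s  # stay time Δt of Definition 3
        fst.append((loc, {name: mu(dt) for name, mu in levels.items()}))
    return oid, fst


levels = {"short": falling(10, 40), "long": rising(60, 90)}
# Hypothetical trajectory: 25 minutes in area 1, then 75 minutes in area 4.
oid, fst = build_fst("01", [(1, 540, 565), (4, 600, 675)], levels)
print(fst)
# [(1, {'short': 0.5, 'long': 0.0}), (4, {'short': 0.0, 'long': 0.5})]
```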

FST-FPM Algorithm Description
This section details the steps of the FST-FPM (fuzzy semantic trajectory frequent pattern mining) algorithm. The main idea of the FST-FPM algorithm is as follows. After the semantic trajectory database is scanned once, all frequent itemsets of length 1 are obtained according to the given minimum support threshold and fuzzy stay time membership threshold, and each frequent sequence of length 1 is then used as a prefix to build its projection database. After the projection database is scanned, all of its frequent itemsets are obtained and appended to the prefix of the projection database, which gives the frequent sequences of length 2. Finally, projection databases are generated recursively and their frequent subsets are mined continuously until no new projection database can be constructed. The characteristics of the FST-FPM algorithm are as follows: the support of an itemset is increased by 1 only when the support of the itemset is greater than the given minimum support threshold and its fuzzy stay time membership is greater than the given threshold, and the itemsets that do not meet the minimum support threshold and the fuzzy stay time membership threshold are deleted to reduce the construction of projection databases. Algorithm 2 gives the complete FST-FPM pseudocode.
Algorithm 2 takes as input the fuzzy semantic trajectory database FD, the minimum support threshold σ, the fuzzy stay time membership threshold ρ, and the fuzzy stay time membership function, and outputs all frequent sequences that meet the conditions, FTFP. Lines 1 and 2 initialize the variables. Line 3 scans the fuzzy semantic trajectory database FD. Lines 4 and 5 calculate the support P and the fuzzy stay time membership of each semantic trajectory. Lines 6 to 8 delete the semantic trajectories that do not meet the minimum support threshold and fuzzy stay time membership threshold. Lines 9 to 16 recursively generate projection databases and continuously mine frequent subsets until no projection database can be constructed. Line 17 returns all the fuzzy semantic trajectory frequent patterns FTFP that have been mined. Here, $Q_i$ is used to store frequent sequences, where i is the number of iterations, $i = 0, 1, \ldots, n$; $C_w$ denotes the projection database of each frequent itemset T, $w = 1, 2, \ldots, n$; $T_k$ denotes the frequent itemsets mined each time, $k = 1, 2, \ldots, n$; and $T_i$ denotes a frequent sequence formed by combining the frequent itemsets $T_k$ with their prefixes.

ALGORITHM 1: Stay point extraction.
(4) Record start time $t_{si}$
(5) Judge the area $loc_i$ where $C_i$ is located
(6) if $loc_i \ne loc_{i-1}$
(7)     Record end time $t_{ei}$
(8)     if $t_{ei} - t_{si} > z$
(9)         Output stay point R
(10)     end if
(11) end if
(12) end for
(13) end for
(14) end for
(15) return R
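The surviving steps of Algorithm 1 can be sketched in Python for a single track. This is a sketch under stated assumptions, not the authors' code: the function name `extract_stay_points`, the `area_of` callback that maps coordinates to an area number, the use of the first sample in the new area as the end time of the previous visit, and the closing of the last visit at the final sample are all choices made for illustration.

```python
from typing import Callable, List, Tuple

Sample = Tuple[float, float, float]    # (x, y, t) with t in minutes
StayPoint = Tuple[int, float, float]   # (loc, t_s, t_e)


def extract_stay_points(samples: List[Sample],
                        area_of: Callable[[float, float], int],
                        z: float) -> List[StayPoint]:
    """Emit a stay point for every visit to an area that lasts longer than z minutes."""
    stay_points: List[StayPoint] = []
    if not samples:
        return stay_points
    cur_loc = area_of(samples[0][0], samples[0][1])
    t_s = samples[0][2]                     # record start time
    for x, y, t in samples[1:]:
        loc = area_of(x, y)                 # judge the area of the current sample
        if loc != cur_loc:                  # area changed: the previous visit ends
            if t - t_s > z:                 # keep it only if the stay exceeds z
                stay_points.append((cur_loc, t_s, t))
            cur_loc, t_s = loc, t           # start timing the new area
    # Assumption: the last visit is closed at the time of the final sample.
    t_last = samples[-1][2]
    if t_last - t_s > z:
        stay_points.append((cur_loc, t_s, t_last))
    return stay_points


# Hypothetical layout: area 1 is x < 5, area 2 is everything else.
area = lambda x, y: 1 if x < 5 else 2
track = [(3, 7, 0), (3, 8, 4), (4, 8, 9), (6, 8, 12), (7, 8, 14), (2, 2, 30)]
print(extract_stay_points(track, area, 5))   # [(1, 0, 12), (2, 12, 30)]
```

Raising the threshold to 15 minutes in this example drops the 12-minute visit to area 1 and keeps only the 18-minute visit to area 2.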
ALGORITHM 2: FST-FPM.
Input: fuzzy semantic trajectory database FD, minimum support threshold σ, fuzzy stay time membership threshold ρ, fuzzy stay time membership function
Output: fuzzy semantic trajectory frequent pattern set FTFP
(1) initialize variables i = 1; w, k = 1, 2, ..., n
(2) FTFP = ∅
(3) scan the fuzzy semantic trajectory database FD
(4) calculate the support P of each sequence
(5) calculate the fuzzy stay time membership S(Δt) of each sequence according to the time membership function
(6) if $P < \sigma$ or $\mu_{A_j}(\Delta t) < \rho$
(7)     the sequence is below the minimum support threshold or the fuzzy stay time membership threshold
(8)     define the sequence as an infrequent itemset and delete it
(9) create a new sequence database $Q_i$ and add the frequent itemsets with length i to the database
(10) for each frequent itemset T in the sequence database $Q_i$
(11)     construct the projection database $C_w$ of frequent itemset T
(12)     get the frequent itemset $T_k$ of $C_w$
(13)     combine T and $T_k$ into frequent sequences $T_i$ with length i
(14)     FTFP = FTFP ∪ $T_i$
(15)     if projection database $C_w$ is not empty
(16)         repeat steps 8-12, i = i + 1
(17)     else output frequent sequence FTFP
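The prefix-projection idea behind Algorithm 2 can be illustrated with a compact Python sketch. It is an approximation under explicit assumptions and not the FST-FPM implementation: each stay point is reduced to the (area, level) labels whose membership is at least ρ, support is taken as the fraction of trajectories containing a pattern as a subsequence, a pattern is kept when its support is at least σ (as in Definition 8), and the function name `fst_fpm_sketch` and the four-trajectory database are hypothetical.

```python
from typing import Dict, List, Tuple

Item = Tuple[int, str]                            # (area number, fuzzy level name)
FuzzyPoint = Tuple[int, Dict[str, float]]         # (area number, {level: membership})


def fst_fpm_sketch(db: List[List[FuzzyPoint]], sigma: float, rho: float):
    """Mine sequential patterns of (area, level) items by recursive prefix projection."""
    # Membership constraint: keep only the labels with membership >= rho.
    seqs = [[{(loc, lvl) for lvl, mu in s.items() if mu >= rho} for loc, s in traj]
            for traj in db]
    n = len(db)
    patterns: Dict[Tuple[Item, ...], float] = {}

    def mine(prefix: Tuple[Item, ...], projected: List[Tuple[int, int]]) -> None:
        # projected holds (sequence index, start position) pairs, i.e. the projection database
        counts: Dict[Item, int] = {}
        for sid, start in projected:
            seen = set()
            for element in seqs[sid][start:]:
                seen |= element
            for item in seen:                     # count each item once per trajectory
                counts[item] = counts.get(item, 0) + 1
        for item, count in sorted(counts.items()):
            if count / n < sigma:                 # support constraint
                continue
            new_prefix = prefix + (item,)
            patterns[new_prefix] = count / n
            new_projected = []
            for sid, start in projected:
                for pos in range(start, len(seqs[sid])):
                    if item in seqs[sid][pos]:    # project after the first occurrence
                        new_projected.append((sid, pos + 1))
                        break
            mine(new_prefix, new_projected)

    mine((), [(i, 0) for i in range(n)])
    return patterns


db = [
    [(1, {"short": 1.0}), (4, {"long": 0.8}), (5, {"short": 0.5})],
    [(1, {"short": 0.9}), (4, {"long": 0.6})],
    [(1, {"short": 0.3}), (4, {"long": 1.0})],
    [(2, {"short": 1.0}), (5, {"short": 0.7})],
]
result = fst_fpm_sketch(db, sigma=0.5, rho=0.4)
for pattern, support in sorted(result.items()):
    print(pattern, support)
```

With σ = 0.5 and ρ = 0.4, the third trajectory's short stay in area 1 (membership 0.3) is filtered out, so the pattern "short stay in area 1, then long stay in area 4" has support 0.5; lowering ρ to 0.2 admits that stay and raises the support to 0.75.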

Introduction to Data Set and Experimental Environment.
The experiments were conducted using the public dataset [18] of the Geolife project of Microsoft Research Asia (hereinafter referred to as the Geolife dataset) and a real target trajectory dataset collected from the RFID positioning subsystem of an integrated security system (hereinafter referred to as the RFID positioning dataset). The Geolife dataset includes trajectory data of 182 people over a five-year period. In the experiment, 50 people, 2611200 sampling points, and 1518 tracks were selected. According to the track distribution, the area was divided into 6 areas, numbered 1-6. After the dataset was denoised, 2293 stay points were obtained through Algorithm 1.
The RFID positioning dataset includes 10 targets, 5488 sampling points, and 916 semantic tracks. After the dataset was denoised, 2744 stay points were obtained through Algorithm 1. The positioning area is divided into 6 areas, numbered 1-6. The semantic trajectory form is given in Definition 4.
An overview of the collated datasets is shown in Table 1. The semantic track data form is shown in Table 2, in which each row lists one stay point in a track of a target. For example, the first three rows of Table 2 record the first track of target 1, indicating that target 1 stays in area 1 from 09:00 to 11:30, in area 4 from 13:00 to 18:00, and in area 5 from 19:00 to 19:40.
The experiments were conducted on a computer with a Core i5-6500 processor (3.20 GHz dual core) and 8 GB of memory. The operating system was Windows 10, and the program was written in Python 3.8.0 with PyCharm.

Data Preprocessing.
After the original track traj is converted into the semantic track ST, the membership degree of fuzzy stay time is added to it. Firstly, the stay time of each stay point is calculated according to Definition 3. Then, the fuzzy stay time membership function defined in this paper is used to calculate the fuzzy stay time membership degree of each stay point. The stay time levels are set according to the application scenario, with Δt in minutes. The resulting membership function of fuzzy stay time is shown in Figure 1.
Five fuzzy stay time membership degrees of each stay point are calculated through the fuzzy stay time membership function. Table 3 lists the fuzzy stay time membership degrees of some stay points for the short stay time level. In this way, the fuzzy semantic trajectory FST defined in Definition 7 is obtained.

Running Time Comparison.
Previous studies have shown that algorithms improved from the PrefixSpan algorithm achieve better operating efficiency and frequent pattern mining results than other algorithms of the same type. Therefore, this section compares the running time of the FST-FPM algorithm with the PrefixSpan [10], PrefixSpan-x [12], and LFFT2 [13] algorithms on the two datasets in the same experimental environment. The running time of each algorithm on the two datasets is observed under different minimum support thresholds, namely 20%, 30%, 40%, and 50%, as shown in Figures 2 and 3. Figure 2 compares the running time of the FST-FPM algorithm with the PrefixSpan, PrefixSpan-x, and LFFT2 algorithms on the two datasets when the fuzzy stay time membership threshold is 0.2. Figure 3 shows the running time of the FST-FPM algorithm on the two datasets when the fuzzy stay time membership threshold is 0.2, 0.4, 0.6, 0.8, and 1.0, respectively. The minimum length threshold of frequent sequences in the PrefixSpan-x algorithm is set to 1, and the stay time threshold of the FST-FPM algorithm is set to 5 minutes.
Figure 2(a) compares the running time of the FST-FPM algorithm with the PrefixSpan, PrefixSpan-x, and LFFT2 algorithms on the Geolife dataset when the fuzzy stay time membership threshold is 0.2 and the minimum support threshold is 20%, 30%, 40%, and 50%, respectively. It can be seen that the running time of each algorithm decreases gradually as the minimum support threshold increases, because the number of frequent fuzzy patterns decreases gradually as the minimum support threshold increases.
Figure 2(b) compares the running time of the FST-FPM algorithm with the PrefixSpan, PrefixSpan-x, and LFFT2 algorithms on the RFID positioning dataset when the fuzzy stay time membership threshold is 0.2 and the minimum support threshold is 20%, 30%, 40%, and 50%, respectively. It can be seen that the running time of each algorithm decreases gradually as the minimum support threshold increases. When the minimum support threshold is 20%, the running efficiency of the FST-FPM algorithm is significantly better than that of the PrefixSpan algorithm and slightly better than that of the PrefixSpan-x and LFFT2 algorithms.
It can be seen from Figure 2 that the FST-FPM algorithm proposed in this paper has better mining efficiency when more patterns are mined. When the minimum support threshold is 20%, the running time of the FST-FPM algorithm is 54% less than that of the PrefixSpan algorithm, 23% less than that of the LFFT2 algorithm, and 10% less than that of the PrefixSpan-x algorithm.
Figure 3(a) shows how the running time of the FST-FPM algorithm changes with the minimum support threshold on the Geolife dataset when the fuzzy stay time membership threshold is 0.2, 0.4, 0.6, 0.8, and 1.0. It can be seen that as the minimum support threshold increases, the running time of the FST-FPM algorithm under every fuzzy stay time membership threshold gradually decreases, and the higher the fuzzy stay time membership threshold, the shorter the running time. This is because a higher fuzzy stay time membership threshold imposes a stronger constraint on frequent itemsets, so fewer projection databases are constructed and the operating efficiency is higher. Figure 3(b) shows how the running time of the FST-FPM algorithm changes with the minimum support threshold on the RFID positioning dataset when the fuzzy stay time membership threshold is 0.2, 0.4, 0.6, 0.8, and 1.0. It can be seen that as the minimum support threshold increases, the running time of the FST-FPM algorithm under every fuzzy stay time membership threshold gradually decreases, and the higher the fuzzy stay time membership threshold, the shorter the running time. When the minimum support threshold is greater than 20%, there is little difference in the running time, because the number of mined patterns is similar.
It can be seen from Figure 3 that the running time of the FST-FPM algorithm gradually decreases as the fuzzy stay time membership threshold increases. When the fuzzy stay time membership threshold increases by 0.2, the running time of the FST-FPM algorithm decreases by about 15%.

Mining Fuzzy Semantic Trace Frequent Patterns.
The FST-FPM algorithm proposed in this paper is used to mine frequent patterns of fuzzy semantic trajectories. To explore the impact of the minimum support threshold and the fuzzy stay time membership threshold on mining, the number of patterns mined under different minimum support thresholds and fuzzy stay time membership thresholds is compared. With the fuzzy stay time membership threshold of FST-FPM held fixed, different minimum support thresholds are set for multiple groups of comparative experiments, and the numbers mined for the five fuzzy stay time cases under different minimum support thresholds are compared. When the fuzzy stay time membership threshold of the FST-FPM algorithm is set to 0.4, 0.6, 0.8, and 1.0, respectively, the mining quantities are as shown in Figures 4 and 5. The four panels of Figure 4 show the mining quantity on the Geolife dataset under different minimum support thresholds when the fuzzy stay time membership threshold is 0.4, 0.6, 0.8, and 1.0, respectively.
It can be seen from the four panels of Figure 4 that as the minimum support threshold increases, the number of mined patterns gradually decreases. When the minimum support threshold is 20% or 30%, more patterns can be mined. When the minimum support threshold is greater than 30%, the number of mined patterns decreases sharply. When the minimum support threshold is 50%, only a small number can be mined. The patterns mined from the Geolife dataset are mainly those with short and medium stay time. It can also be seen from Figure 4 that as the fuzzy stay time membership threshold increases, the number of mined patterns with the same stay time level also decreases.
The four panels of Figure 5 show the mining quantity on the RFID positioning dataset under different minimum support thresholds when the fuzzy stay time membership threshold is 0.4, 0.6, 0.8, and 1.0, respectively.
It can be seen from the four panels of Figure 5 that as the minimum support threshold increases, the number of mined patterns gradually decreases. When the minimum support threshold is 20% or 30%, more patterns can be mined. When the minimum support threshold is 40% or 50%, almost nothing can be mined. This is mainly due to the characteristics of the RFID dataset, which consists mainly of target tracks with medium and long stay time whose support is not high, so that few patterns meet the conditions when the minimum support threshold reaches 40%. It can also be seen from Figure 5 that as the fuzzy stay time membership threshold increases, the number of mined patterns with the same stay time level decreases in equal proportion. When the minimum support threshold is 20% or 30%, the patterns mined from the RFID dataset are mainly those with medium or long stay time.
It can be seen from Figures 4 and 5 that as the minimum support threshold increases, the number of patterns of each fuzzy stay time type is reduced in equal proportion and the total number is greatly reduced. This is because a higher minimum support threshold places a higher requirement on the frequency of patterns, which reduces the number that meets the requirement.

Conclusions
In this paper, a fuzzy semantic trajectory frequent pattern mining method is proposed for the moving trajectory data and stay time of targets in a security environment. Based on fuzzy set theory, the stay time of the target is divided into five grades, and the membership degree calculation method of the fuzzy stay time is defined. The FST-FPM algorithm is proposed and tested on the Geolife public dataset and the RFID positioning dataset of an integrated security system. Experimental results show that the FST-FPM algorithm can effectively mine frequent patterns with fuzzy semantics, and that its running time is reduced by 54% compared with the PrefixSpan algorithm, 23% compared with the LFFT2 algorithm, and 10% compared with the PrefixSpan-x algorithm. The results indicate that the FST-FPM algorithm can help analyze the behavior characteristics of targets in a security environment and plays an important role in evaluating the threat degree of targets.

Data Availability
The data used to support the findings of this study can be downloaded from https://www.microsoft.com/en-us/download/details.aspx?id=52367.

Conflicts of Interest
The authors declare that they have no conflicts of interest.