Trajectory Mining-Based City-Level Mobility Model for 5G NB-IoT Networks

Due to the large coverage of 5G NB-IoT networks, a more realistic mobility model for macroscopic scenes would greatly facilitate the development of optimal radio resource management algorithms. However, models devised for random motion scenes are no longer applicable in such circumstances. Therefore, in this paper, a city-level mobility model is proposed based on feature mining of real vehicle trajectories in the city of Shenzhen. The proposed model treats the temporal and spatial sequences of a motion trajectory separately to reduce their mutual influence. Simulation results show that it better represents node motion under the physical constraints of the city layout, with a high degree of fit to the real trajectories in terms of self-similarity, hotspots, and long-tail features.


Introduction
To meet the demands of new streaming media services and the IoT, the 5th generation wireless system (5G) network faces growing performance challenges, such as exponential growth of traffic consumption, higher network transmission rates [1], and capacity for more network access devices [2]. Currently, 5G technology focuses on three scenarios: enhanced mobile broadband (eMBB), massive machine-type communication (mMTC), and ultra-reliable low-latency communication (URLLC) [3]. Each scenario has different performance requirements: eMBB requires high data rates, mMTC requires high connection density for large-scale deployments, and URLLC requires ultralow latency [4]. The emergence of 5G technology makes it possible to realize the IoT and the Internet of Vehicles, which will create an intelligent world with the Internet of Everything. Deploying a large-scale IoT network requires considering the capacity of the network nodes, network congestion, load balancing, transmission rate, and reliability [5]. These problems pose major challenges for 5G network and IoT deployment, and accounting for the influence of mobile network nodes on the network is key to solving them. The transmission performance of a mobile opportunistic network depends largely on the mobility of the network nodes [6], and the impact of user mobility on 5G small cell networks grows as the cell coverage radius decreases [7]. Therefore, when deploying 5G networks and the large-scale IoT, the mobility of network nodes must be fully considered [8]. For example, network design must address how to provide stable, high-speed transmission while vehicles and people are moving, and how to select and seamlessly switch to the nearest cell while network nodes are moving.
Commonly, researchers create a mobility model to simulate the movement of network nodes, use it to simulate the operating state of the network and evaluate its performance, and thereby gradually improve network design and deployment [9].
Over the past decade, mobility models have been studied extensively, and they can be divided into two main categories [10]. The first is the synthetic model, which is based on statistical assumptions and empirical observation of node movement; it simulates the movement of mobile network nodes in real space without requiring real trajectory data. Over time, the speed and direction of a network node change within reasonable bounds. This kind of model has been widely applied to performance evaluation of self-organizing network protocols. The second is the trace model, which is built from actual trajectory data and the features extracted from them. The trace model relies heavily on the authenticity of the trajectory data and the effectiveness of feature extraction, and it aims to match the real trajectory data as closely as possible. In the synthetic model, the movement of network nodes is random; for example, the speed, direction, and destination of network nodes are random. In the trace model, the movement of network nodes shows different regularities according to the node type or the trajectory data. On the whole, the synthetic model is relatively simple to build, but it is limited to random movement and does not consider the restrictions of the environment [11]. The trace model, in contrast, is difficult to create because of its complex behavior logic and low robustness. Because 5G cellular networks have a smaller coverage radius, more access devices, and more complex node mobility, the traditional mobility models are increasingly unable to meet the performance requirements of simulation [12,13].
Therefore, we use urban operating vehicle data as the raw trajectories and their trajectory characteristics as the key elements for constructing a city-level mobility model, which can contribute to future 5G IoT deployment and network performance evaluation. First, we create a random spatial position model based on the spatial movement characteristics of the raw trajectories to generate the position coordinates of the trajectory points. Then, we create a random vehicle running time model based on the time series characteristics of the raw trajectories to generate the nonstrict time series of the position points. Finally, the trajectories generated by three models (the proposed model, the random walk model, and the random waypoint model) are compared with the raw trajectories on three trajectory characteristics. The proposed model has a high degree of fit in terms of self-similarity, hotspots, and long-tail features.
The main contribution of this paper is a city-level mobility model based on feature mining of real vehicle trajectories. Compared with the random waypoint model and the random walk model, the proposed model has a higher degree of fit in self-similarity, hotspots, and long-tail features. Therefore, the random vehicle mobility model can be used to simulate node movement in a 5G NB-IoT network across the whole city.
The rest of the paper is organized as follows. In Section 2, we introduce the related work of the mobility models. In Section 3, we create the vehicle random mobility model according to the features extracted from real data. In Section 4, we evaluate the performance of various mobility models. Finally, we conclude the paper.

Related Work
The mobility of nodes in a wireless network is regarded as a main obstacle to network control and management. To address it, much research has focused on how to describe node mobility, and many well-known mobility models have been developed, such as the random walk mobility model [14,15], the random waypoint model [16], and the random direction model [17]. These models are classical and simple, but they are generally not used directly in practical applications because they are limited when applied to wireless network protocols. Therefore, more complex mobility models have been proposed, as follows.
Some researchers design mobility models based on time series characteristics. Haas [18] proposed a boundless simulation area model, in which the node movement area has no boundary: when a node reaches the edge, it continues to move and reappears on the other side of the area. Camp et al. [10] proposed a probabilistic version of the random walk mobility model, which utilizes three states to determine the moving direction. Hong et al. [19] proposed a mobility vector model, in which node motion is represented by two-dimensional vector components and which incorporates other basic frameworks, such as the location-dependent model, targeting model, and group motion model. Liang and Haas [14] proposed a Gauss-Markov model, in which the nth movement is calculated from the speed value and direction value of the (n − 1)th movement and a random variable. Similarly, Bettstetter [20] proposed a smooth random mobility model that also changes the speed and direction of the node smoothly. The user mobility model based on the preferred speed and direction makes the node's movement both purposeful and random.
Moreover, some researchers model multiple nodes and compose them into groups. Hong et al. [15] proposed an exponential correlated random mobility model. Musolesi et al. [21] proposed a mobility model founded on social network theory. A weighted graph represents the social relationship network, and the weights measure the strength of the interaction between individuals. Individuals tend to get acquainted and become friends with others close to their geographic locations, and the model also simulates this phenomenon. Lu et al. [22] proposed an architecture design for smart agriculture that exploits SWIPT and studied an energy efficiency optimization scheme in which subcarrier pairing and power allocation are jointly optimized.
Bai et al. [23] proposed a Manhattan mobility model, a geographically restricted mobility model that originated from the modeling of Manhattan neighborhoods and is used to model movement in urban areas. Similarly, a city section mobility model simulates the movement patterns of pedestrians on highways or streets in urban areas. Markoulidakis et al. [24] put forward the city area, area zone, and street unit mobility models combined with traffic theory. The metropolitan mobility model, national mobility model, and international mobility model proposed by Lam et al. [25] are parts of a hierarchical mobility model, and the three models focus on different regional levels. In the obstacle mobility model [26], obstacles such as buildings are modeled and used to restrict the movement of nodes. Davies [27] proposed an in-place mobility model, in which the simulation area is divided into many small areas and different groups of nodes are active in different small areas. The model handles multiple groups with different targets; the nodes in a group have similar targets and are confined to a limited area. The group movement can use the reference point group mobility model to calculate the next position of the group. In these models, researchers concentrated on modeling space and time series to improve the similarity with the real environment.
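To make the time series family of synthetic models concrete, the following is a minimal Python sketch of a Gauss-Markov style update in the spirit of the model of Liang and Haas [14] described above: each new speed and direction is a weighted mix of the previous value, a long-run mean, and Gaussian noise. The function name, the memory parameter alpha, the noise scales, and the clamping of speed at zero are illustrative assumptions, not values or choices taken from this paper; only the default mean speed of 5.3 m/s is borrowed from the evaluation settings.

```python
import math
import random

def gauss_markov_track(steps, alpha=0.75, mean_speed=5.3, mean_dir=0.0,
                       sigma_speed=1.0, sigma_dir=0.5, dt=1.0, seed=0):
    """Return a list of (x, y) positions from a Gauss-Markov style update.

    alpha in [0, 1] is the memory level: 1 keeps the previous value,
    0 makes every step an independent draw around the mean.
    All default values are hypothetical and only serve the illustration.
    """
    rng = random.Random(seed)
    speed, direction = mean_speed, mean_dir
    x, y = 0.0, 0.0
    track = [(x, y)]
    noise_scale = math.sqrt(1.0 - alpha * alpha)
    for _ in range(steps):
        # New value = memory of previous value + pull toward mean + scaled noise.
        speed = (alpha * speed + (1.0 - alpha) * mean_speed
                 + noise_scale * rng.gauss(0.0, sigma_speed))
        direction = (alpha * direction + (1.0 - alpha) * mean_dir
                     + noise_scale * rng.gauss(0.0, sigma_dir))
        speed = max(speed, 0.0)  # assumption: a node never moves backward
        x += speed * dt * math.cos(direction)
        y += speed * dt * math.sin(direction)
        track.append((x, y))
    return track
```

With alpha close to 1 the track is almost a straight line, and with alpha close to 0 it approaches an uncorrelated random walk, which is why such models are smoother than the purely random models but still ignore the city layout.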
However, the existing random mobility models cannot solve the deployment problems brought about by the era of 5G and the IoT. Liu et al. [28] utilize NOMA to guarantee QoS for the satellite IIoT. Ning et al. [29] mine raw trajectories from operating vehicles to optimize the 5G network. In this study, we propose a new random model based on real-world data collection. When compared against the raw data, the proposed model has a higher degree of fit than the random walk model in terms of self-similarity, hotspots, and long-tail features. The specific method is presented as follows.

Proposed Random Vehicle Mobility Model
In this section, we address the defects of the synthetic model and make full use of the available real data. Movement features extracted from real data are used to optimize the existing mobility model, and a random mobility model for vehicles serving people in cities is proposed.
The random vehicle mobility model focuses on the movement characteristics that operating vehicles exhibit in urban road space and in running time. Therefore, according to the spatiotemporal characteristics of the moving trajectory, this paper creates a random space position model and a random running time model separately and then combines the results of the two random models to get the final result.
In the random space position model, ω, R1, and R2 are generated randomly: R1 and R2 are drawn from a given range, and ω ranges from 0 to 2π. A graphical representation of two position points generated by the random space position model is shown in Figure 1. Both P1 and P2 are generated from random angles and random distances, but their random distances are kept within a range. All position points are calculated from the initial point (0, 0) using random angles and random distances. Taking a previously obtained hotspot as the initial point, a motion function with a smaller linear velocity is applied again to create a new position point, so that the generated position points are randomly distributed around that initial point. If two position points are far apart, new intermediate position points can be added between them according to the given simulated vehicle speed and time interval. Algorithm 1 generates a random number of position points, and Algorithm 2 generates the random position points.
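The position generation described above can be sketched in Python as follows. This is not a reproduction of Algorithm 1 or Algorithm 2, which are not shown here; it is a minimal illustration under stated assumptions: a point is placed at a random angle ω in [0, 2π] and a random distance from an initial point, a coarse draw from (0, 0) picks a hotspot, finer draws scatter points around that hotspot, and intermediate points are added whenever two consecutive points are farther apart than the vehicle could travel in one time interval. The function names and all numeric ranges are hypothetical.

```python
import math
import random

def random_point(origin, r_min, r_max, rng):
    """Draw a point at a random angle and a random distance from origin."""
    omega = rng.uniform(0.0, 2.0 * math.pi)  # random angle in [0, 2*pi]
    r = rng.uniform(r_min, r_max)            # random distance from a given range
    return (origin[0] + r * math.cos(omega), origin[1] + r * math.sin(omega))

def densify(p, q, speed, interval):
    """Return evenly spaced points from p (exclusive) to q (inclusive)
    so that no hop is longer than speed * interval."""
    max_hop = speed * interval
    n = max(1, math.ceil(math.dist(p, q) / max_hop))  # hops needed
    return [(p[0] + (q[0] - p[0]) * k / n, p[1] + (q[1] - p[1]) * k / n)
            for k in range(1, n + 1)]

def generate_track(n_hotspots, pts_per_hotspot, seed=0,
                   hotspot_range=(2000.0, 8000.0),  # hypothetical, metres
                   local_range=(0.0, 200.0),        # hypothetical, metres
                   speed=5.3, interval=30.0):       # hypothetical m/s and s
    rng = random.Random(seed)
    track = [(0.0, 0.0)]
    for _ in range(n_hotspots):
        # Coarse draw: a hotspot located relative to the initial point (0, 0).
        hotspot = random_point((0.0, 0.0), *hotspot_range, rng)
        for _ in range(pts_per_hotspot):
            # Fine draw: points scattered around the hotspot.
            nxt = random_point(hotspot, *local_range, rng)
            track.extend(densify(track[-1], nxt, speed, interval))
    return track
```

Keeping the coarse and fine draws separate is what lets points concentrate around a few locations instead of spreading uniformly, which is the behavior the hotspot evaluation later looks for.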

Performance Evaluation
In this section, we test the performance of our proposed mobility model based on moving vehicles in cities and compare it with the movement trajectories of real vehicles, the random walk model, and the random waypoint model. Section 3 introduces the moving track data of real vehicles. The random walk model is a classic mobility model based on random direction and speed. The random waypoint model adds the concept of node pause time: the node pauses before it changes its speed and direction. In this simulation, the random vehicle mobility model is the one labeled as the simulation model in the figures, and the parameters of the random walk model, random waypoint model, and random vehicle mobility model are set as follows.
For the random walk model, the end time is 86400 s, the speed mean is 5.3 m/s, the speed delta is 5.3, and the travel time is 30 s. For the random waypoint model, the end time is 86400 s, the speed mean is 5.3 m/s, the speed delta is 5.3, pause time

Performance of Self-Similarity with Hurst Exponent.
We first study the self-similarity of our proposed mobility model based on moving vehicles in cities. The Hurst exponent is used to evaluate the self-similarity of mobility models, and the rescaled range (R/S) method is a well-established way to estimate it. In our simulation, we compare the trajectories generated by the proposed model, the raw trajectories, and the trajectories generated by the random walk model and the random waypoint model. We select 100 nodes and carry out horizontal, longitudinal, and distance self-similarity calculations for the movement of each node. As the number of nodes increases, we calculate the average value of the Hurst exponent over the nodes. Figures 3-6 show the Hurst exponent of the trajectory data for the raw data, the random walk model, the random waypoint model, and the random vehicle mobility model.
The figures show that, as the number of nodes increases, the average Hurst exponent of each set of trajectory data flattens out. In Figure 3, the Hurst exponents for horizontal (x), longitudinal (y), and distance are between 0.6 and 0.8. In Figure 4, the Hurst exponents of x and y are less than 0.6, and that of distance is greater than 0.8. In Figure 5, the Hurst exponents of x and y are between 0.6 and 0.8, and that of distance is greater than 0.8. For the data generated by our proposed model, however, the Hurst exponents of x, y, and distance are all between 0.6 and 0.8 in Figure 6, the same range as in Figure 3. This means that the random vehicle mobility model outperforms the random walk model in the selection of direction and distance in cities.
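For readers unfamiliar with the R/S method used above, the following Python sketch estimates a Hurst exponent as the slope of log(R/S) against log(window size). It is a textbook-style illustration, not the implementation used for the figures: the window schedule (doubling from a hypothetical min_window), the plain least-squares fit, and the absence of any small-sample bias correction are all assumptions, and the text does not specify whether coordinates or their increments are fed to the estimator.

```python
import math

def rescaled_range(chunk):
    """R/S of one chunk: range of the cumulative mean deviations
    divided by the (population) standard deviation of the chunk."""
    mean = sum(chunk) / len(chunk)
    cum, lo, hi, sq = 0.0, 0.0, 0.0, 0.0
    for v in chunk:
        cum += v - mean
        lo, hi = min(lo, cum), max(hi, cum)
        sq += (v - mean) ** 2
    std = math.sqrt(sq / len(chunk))
    return (hi - lo) / std if std > 0 else None  # skip constant chunks

def hurst_rs(series, min_window=8):
    """Estimate the Hurst exponent of a numeric series.

    Needs at least 4 * min_window samples so that two window sizes
    are available for the regression.
    """
    xs, ys = [], []
    n = min_window
    while n <= len(series) // 2:
        values = [rescaled_range(series[i:i + n])
                  for i in range(0, len(series) - n + 1, n)]
        values = [v for v in values if v is not None]
        if values:
            xs.append(math.log(n))
            ys.append(math.log(sum(values) / len(values)))
        n *= 2
    # Least-squares slope of log(R/S) versus log(n).
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den
```

An exponent near 0.5 indicates an uncorrelated series, while values above 0.5 indicate persistence; applying `hurst_rs` separately to the x series, the y series, and the distance series of one node would mirror the three quantities compared in this subsection.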

Performance of Hotspots with Density-Based Clustering.
The main purpose of clustering is to detect the hotspot characteristics of node movement trajectories. Density-based spatial clustering of applications with noise (DBSCAN) groups regions of sufficient density into clusters and can find clusters of arbitrary shape in a noisy spatial database, which makes it suitable for the moving track data of vehicles running in cities. In this section, we apply the DBSCAN clustering method to detect moving hotspots and evaluate the performance of our proposed model. The raw data are latitude and longitude information obtained by GPS, while the trajectory data generated by the models are two-dimensional coordinate points. For each of the raw data, the random walk model, the random waypoint model, and the random vehicle mobility model, we select the moving trajectory of one node and perform DBSCAN clustering. The clustering results for the four nodes are shown in Figures 7-10, respectively. Table 1 shows the parameter selection of DBSCAN clustering. In Figure 7, it can be seen that hotspots are formed in the movement of vehicles in reality, while the node movement trajectory generated by the random walk model does not form hotspots in Figure 8. The moving track points of nodes generated by the random waypoint model are all outliers in Figure 9. However, in Figure 10, the node movement trajectory generated by the proposed model forms hotspots similar to those of real vehicles. The number of movement points generated by the random vehicle mobility model in a day is also much smaller than that of the random walk model. This means that the movement of nodes generated by the random vehicle mobility model is closer to the real movement trajectory.
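The hotspot detection step can be illustrated with a small, dependency-free DBSCAN written in Python. This is a sketch of the standard algorithm rather than the code used for Figures 7-10: it uses Euclidean distance on planar coordinates (GPS latitude and longitude would first need a projection or a geodesic distance, which is an assumption left out here), a brute-force neighbor search, and placeholder values for eps and min_pts, since the Table 1 parameters are not reproduced.

```python
import math

NOISE = -1

def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or NOISE (-1).

    A point is a core point if at least min_pts points (itself included)
    lie within distance eps of it.
    """
    def neighbors(i):
        return [j for j, q in enumerate(points)
                if math.dist(points[i], q) <= eps]

    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE        # may later be claimed as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(seeds)
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue             # already assigned, do not expand again
            labels[j] = cluster
            nbrs = neighbors(j)
            if len(nbrs) >= min_pts:  # j is a core point: keep expanding
                queue.extend(nbrs)
    return labels
```

Under this reading, a trajectory "forms hotspots" when `dbscan` returns one or more non-negative cluster ids, and a trajectory whose points are all labeled `NOISE` corresponds to the all-outlier outcome reported for the random waypoint model.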

Performance of Long-Tails with Cumulative Distribution Function.
The long-tail feature is used to detect the short distance movement characteristics of the node. In this simulation, we chose three different average node speeds and applied the cumulative distance distribution function to each node. Table 2 shows the range of the different speeds. The cumulative distance distributions of the raw data, the random walk model, the random waypoint model, and the proposed model are plotted in Figures 11-14. In all figures, the X axis represents the distance from a position point generated by the node to the center point, and the Y axis represents the cumulative distribution of distance. In Figures 11-13, most of the cumulative distribution curves of the raw data and the two models are steep at first and then gentle. For the random walk model, each node's cumulative distribution curve is divided by a turning point into two segments whose slopes do not differ much. However, in Figures 11 and 14, the raw data and the trajectory data created by our proposed model do not have this characteristic. This means that our proposed model is more flexible and realistic over short distances.
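The cumulative distance distribution plotted on these axes can be computed with a few lines of Python. The sketch below is an empirical CDF of the distance from each position point to a center point; because the text does not define the center point, the function falls back to the centroid of the points when no center is given, and that default, like the function name, is an assumption.

```python
import math

def distance_cdf(points, center=None):
    """Return (distance, cumulative fraction) pairs sorted by distance.

    center defaults to the centroid of the points (an assumption).
    """
    if center is None:
        center = (sum(p[0] for p in points) / len(points),
                  sum(p[1] for p in points) / len(points))
    dists = sorted(math.dist(p, center) for p in points)
    n = len(dists)
    # The k-th smallest distance has cumulative fraction (k + 1) / n.
    return [(d, (k + 1) / n) for k, d in enumerate(dists)]
```

A long-tailed node shows a curve that climbs steeply over short distances and then approaches 1 slowly, because most points lie close to the center and only a few lie far away.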

Conclusion
In this paper, we propose a new mobility model based on operating vehicles in cities. We design the spatial sequence and the time sequence of the moving trajectory separately, which reduces the interaction between them. First, based on the short distance movement characteristics of real vehicle tracks in the city, we design a trajectory with a nonstrict time series, so that trajectories of any shape can be obtained. Second, based on the real running time of vehicles in the city, we attach the corresponding time series to this nonstrict time series trajectory. Finally, the simulation results show that the proposed random vehicle mobility model is effective in generating vehicle movement trajectories. Compared with the random waypoint model and the random walk model, the proposed model has a high degree of fit in terms of self-similarity, hotspots, and long-tail features. Therefore, the random vehicle mobility model can be used to represent the node motion features of 5G NB-IoT networks across the whole city.

Data Availability
Data is available on request from the corresponding authors.

Conflicts of Interest
The authors declare that they have no conflicts of interest.