Research on the Edge Resource Allocation and Load Balancing Algorithm Based on Vehicle Trajectory

Edge computing enables the Internet of Vehicles (IoV) to meet performance requirements such as low latency and high computational load for in-vehicle services. However, vehicle movement is random and vehicles are unevenly distributed, which causes problems such as unbalanced load across edge servers and low edge resource utilization. Therefore, in this article, based on vehicle trajectories, an edge resource allocation algorithm and a load balancing algorithm are used to obtain the load prediction value of each edge server and then calculate the optimal quantity of edge resources, in order to reduce resource idleness as much as possible. The experiments demonstrate that applying the edge resource allocation algorithm and load balancing algorithm based on vehicle trajectory significantly reduces the blocking rate of edge resource requests by vehicles and improves the benefits of the overall IoV edge system.


Introduction
The Internet of vehicles (IoV) has emerged as a fundamental technique that substantially pushes forward the development of intelligent transportation [1]. Automotive applications that significantly benefit people's daily activities are increasingly abundant, such as accurate localization, green travel priority plan query, driving judgment, and entertainment information. This may lead to a considerable amount of data and degraded service quality. With the growing number of built-in sensors and actuators, vehicles cannot keep up with the increasing volume of information received from the surrounding environment. In general, IoV services are mostly delay-sensitive, while vehicles are not equipped with enough computing resources to meet real-time requirements, which leads to high traffic demands from vehicles to the cloud center. However, sending requests to the distant cloud may cause congested backhaul links and unstable connections [2]. As one of the key enabling technologies of the 5G era, edge computing (EC) can reduce the cloud computing load and data processing delay by deploying sufficient computing resources and adequate storage capacity in the proximity of vehicles. With EC, IoV services can be offloaded to edge nodes (ENs) rather than the remote cloud center, and the transmission latency caused by the cloud is reduced [3]. However, how many edge resources to deploy remains an open question, because resources deployed at the edge are costly and limited. Therefore, designing an efficient edge resource allocation and selection algorithm helps improve the success rate of vehicles' requests for edge resources. The authors in [4] propose a proactive resource allocation approach, in which all the cells that are one hop away from the current cell are provisioned with resources.
The authors in [5] propose to allocate resources to a subset of neighboring cells, in which each cell is marked with a weight value that indicates the handover probability. In [6,7], a load balancing algorithm between different cells is proposed. This algorithm uses queueing theory and iterative methods to find the task redirection flow from an overloaded edge server to an underloaded edge server, alleviating the unbalanced loads among edge servers and the low resource utilization caused by the randomness of vehicle driving. In order to incorporate the spatial information of Quality of Service (QoS) into data prediction and recommendation, the study presented in [8] puts forward a location-aware nonnegative tensor factorization technique, in which a location stamp is attached to each piece of QoS data. The location-aware personalized collaborative filtering (CF) method proposed in [9] leverages the locations of both users and web services when selecting similar neighbors for the target user or service. The authors in [10] propose an optimization framework based on stochastic traffic analysis that minimizes the cost of resource provisioning while ensuring that the service blocking probability stays below a predefined threshold, in order to improve the utilization rate of edge resources. In [11], a citywide model is proposed for estimating, in real time, the travel time of any path in a city (represented as a sequence of connected road segments), based on the GPS trajectories of vehicles received in current time slots and over a period of history, as well as map data sources. In [12,13], the authors show that each edge server can retrieve data from other edge servers to serve users with a low latency guarantee when another edge server in the same area maintains relatively adequate resources. The authors in [14] propose a selection algorithm between vehicles and edge nodes based on mobility information, available server resources, and the QoS conditions of the service.
The algorithm sorts the edge servers around the vehicle by distance and service delay, in order to reduce the response delay of the IoV edge system. According to [15], a powerful way to reduce the completion time of a request in the mobile edge environment is to offload its tasks to nearby cloudlets, which consist of clusters of computers. The authors in [16] propose an edge resource allocation algorithm based on improved vehicle trajectory prediction, in which a traffic statistics method is used to obtain the load prediction value of the edge server; the optimal number of edge resources is then calculated on the premise of reducing resource idleness as much as possible. The authors in [17] study the task offloading problem from a matching perspective, aiming to optimize the total network delay, and propose pricing-based one-to-one and one-to-many matching algorithms for task offloading. Three load sharing schemes, namely, no sharing, random sharing, and least loaded sharing, are discussed in [18]; they exploit the collaboration between clustered servers to different degrees, and the simulation experiment is conducted using queueing theory.
However, these methods do not integrate the calculation of the minimum edge resources with the load balancing among edge nodes. When QoS guarantees are incorporated into the allocation and selection of IoV edge resources, it is challenging to achieve a trade-off between reducing the expenditure on edge servers and maximizing the success rate of connected vehicles' requests to edge servers. This article is the first attempt to adjust the optimal edge resources in each edge node. The main contributions are summarized as follows:
(i) An optimization framework is developed to minimize the cost of configuring edge resources by computing the minimum amount of edge resources in each cell and in each period, while the service blocking probability is guaranteed to be smaller than a predefined threshold.
(ii) A novel method is proposed for adjusting the load balance of different edge servers and distributing the minimum amount of resources among these edge servers, in order to improve the overall benefits of the IoV edge system.
(iii) An algorithm is developed for connected vehicles to select edge nodes, and a set of experiments is performed to evaluate the validity and efficiency of the edge resource allocation and selection method based on vehicle trajectory prediction.

Model of the Edge Computation System for IoVs
In order to show the operational process in IoVs, this article assumes the model of the edge computation system for IoV shown in Figure 1. The model is a three-layer architecture, comprising a user layer (data generation), an edge layer (data filtering and decision-making), and a cloud layer (data fusion, analysis, and macroadjustment). In Figure 1, the vehicle moves among different adjacent cells. In each cell, a number of edge devices are configured to serve the passing vehicles covered by the cell, such as roadside units (RSUs, distributed on both sides of the road) and edge servers. The cloud computing center monitors and adjusts all the operational processes of edge services in all the cells. When a vehicle is about to leave cell A for cell B, the wireless connection between the vehicle and an RSU in cell A is interrupted once the connection to cell B is established.

Trajectory Prediction Based on Historical Trajectory Data
Assume a working day is divided into a sequence of periods at equal intervals, where T is the total number of periods. In the interactive environment between the actual vehicle and the edge system, GPS trajectory data can be obtained from the real-time navigation and positioning system, and the future trajectory is predicted from the fine-grained GPS trajectory data, in order to obtain the movement prediction of the vehicle between different cells. The geometric method is used to predict in advance the location of the vehicle at the beginning of the next period, based on the vehicle's location, speed, and direction.
Since the number of vehicles covered by a cell changes across periods, in order to facilitate the comparison between the experimental prediction results and the actual results, the fleet distribution vector is defined as $S_n(t)$ $(0 < n \le N;\ 0 < t \le T)$ and the vehicle-mounted distribution vector is denoted by $SF_n(t)$ $(0 < n \le N;\ 0 < t \le T)$. They are introduced in equations (1a) and (1b) and represent, respectively, the actual number of vehicles and the total onboard capacity of the vehicles covered by cell n at the beginning of period t, usually determined by the GPS positioning system. Note that N denotes the total number of cells into which the area is divided.
In this article, based on the trajectory prediction of GPS trajectory data, the fleet mobility prediction matrix $H(t)$ and the vehicle-mounted mobility prediction matrix $HF(t)$ are obtained, as shown in equations (2a) and (2b), where $H_{i,j}(t)$ is the number of vehicles transferring from cell i to cell j in period t, and $HF_{i,j}(t)$ represents the vehicle-mounted capacity moving from cell i to cell j in period t, which is convenient for distributing requests among vehicles, RSUs, and the Cloudlet. The vehicle-mounted mobility prediction matrix has the form
$$HF(t) := \begin{pmatrix} HF_{1,1}(t) & HF_{1,2}(t) & \cdots & HF_{1,N}(t) \\ \vdots & \vdots & \ddots & \vdots \\ HF_{N,1}(t) & HF_{N,2}(t) & \cdots & HF_{N,N}(t) \end{pmatrix}$$
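The geometric prediction step can be illustrated with a straight-line extrapolation. The following Python sketch is only an illustration under stated assumptions: planar coordinates in meters, a heading measured counterclockwise from the +x axis, and a hypothetical square grid of cells; none of the function names or parameters come from the article.

```python
import math
from collections import Counter

def predict_position(x, y, speed, heading_deg, dt):
    """Extrapolate a position along a straight line for dt seconds (assumed motion model)."""
    theta = math.radians(heading_deg)
    return x + speed * dt * math.cos(theta), y + speed * dt * math.sin(theta)

def cell_of(x, y, cell_size, cols):
    """Map a position to a cell index on a hypothetical square grid with `cols` columns."""
    return int(y // cell_size) * cols + int(x // cell_size)

def mobility_counts(vehicles, dt, cell_size, cols):
    """Count predicted transfers; counts[(i, j)] plays the role of H_{i,j}(t)."""
    counts = Counter()
    for x, y, speed, heading in vehicles:
        i = cell_of(x, y, cell_size, cols)
        nx, ny = predict_position(x, y, speed, heading, dt)
        j = cell_of(nx, ny, cell_size, cols)
        counts[(i, j)] += 1
    return counts

# Three illustrative vehicles: (x, y, speed in m/s, heading in degrees).
vehicles = [(100.0, 100.0, 10.0, 0.0), (900.0, 100.0, 5.0, 0.0), (100.0, 900.0, 0.0, 0.0)]
H = mobility_counts(vehicles, dt=60.0, cell_size=1000.0, cols=2)
```

Weighting each transfer by the vehicle's onboard capacity instead of counting it as 1 would give the corresponding entries of the vehicle-mounted matrix.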

Analysis of Vehicle Arriving and Leaving
Based on the fleet mobility matrix obtained in Section 3, in order to improve the success rate of vehicles requesting edge resources in practice, it is necessary to estimate the time point at which vehicle v moves from cell n to the neighboring cell $n'$ (i.e., the time when the vehicle reaches the cell boundary). Since it is almost impossible to estimate the vehicle's travel time accurately, this article uses the probability distribution function of the arrival time, which describes the arrival time of the vehicle in probabilistic terms [14]. Similarly, the probability distribution function of the departure time is used to describe the departure time of the vehicle.

Analysis of Vehicle Arriving Flow and Leaving Flow.
$P^{arr}_{n,t}$ and $P^{dep}_{n,t}$ represent the probability distribution of the time at which a vehicle enters cell n and the probability distribution of the time at which it leaves cell n in period t, respectively. In this article, the normal distribution is used to represent the probability distribution of the arrival time and departure time, as shown in Figure 2. In fact, based on the fleet mobility matrix H (cf. Section 3), most of the predicted vehicles do arrive at cell n in period t, while a small number of vehicles do not arrive as predicted by the vehicle trajectory prediction model.
Assuming that the arrival times of all the vehicles arriving in cell n in period t follow the same normal distribution, the expected number of vehicles arriving in cell n during period t is denoted by $\lambda_n(t)$. The computation is shown in the following equation, where each integral represents the probability of each vehicle arriving at cell n in period t, and $M = \sum_{i=1, i \neq n}^{N} H_{i,n}(t)$. Similarly, the expected number of vehicles leaving cell n during period t is denoted by $\mu_n(t)$, where each integral represents the probability of each vehicle leaving cell n in period t, and $M = \sum_{j=1, j \neq n}^{N} H_{n,j}(t)$. As vehicles arrive and leave, the number of vehicles covered by a cell changes. The expected number of arriving or departing vehicles cannot be used directly to calculate the optimal number of edge resources for each cell: because the expected average does not account for all the possible values, if the actual number of vehicles arriving in cell n exceeds the expected average, some vehicles may fail to obtain the edge resources of cell n.
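Because all M vehicles are assumed to share one normal arrival-time distribution, the sum of per-vehicle integrals reduces to M times the probability mass that the distribution places on one period. The Python sketch below illustrates this reduction; the function names, the period bounds `start` and `end`, and the numeric values are illustrative assumptions rather than quantities from the article.

```python
import math

def normal_cdf(x, mean, std):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))

def expected_arrivals(m, mean, std, start, end):
    """Expected number of the m predicted vehicles that arrive within [start, end]."""
    probability = normal_cdf(end, mean, std) - normal_cdf(start, mean, std)
    return m * probability

# 100 predicted vehicles, arrival time ~ N(5, 1) minutes, period covering minutes 4 to 6.
lam = expected_arrivals(100, mean=5.0, std=1.0, start=4.0, end=6.0)
```

The same function applies to the leaving flow when the departure-time distribution and the count of vehicles predicted to leave are supplied instead.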

Dynamic Analysis of Traffic Flow.
Due to complex traffic conditions, the different driving habits of drivers, and subjective human factors, vehicles are inevitably highly mobile. As a result, it is difficult to calculate accurately the number of resources that the edge devices need to provide, and a large gap may exist between the estimated number of vehicles covered by a cell and the actual number. In order to analyze the flows of vehicles arriving at and leaving cell n in period t, the vehicle arriving and leaving flows are modeled as Poisson processes, whose parameters are $\lambda_n(t)$ and $\mu_n(t)$, respectively.

Vehicle Arriving Flow.
A random variable $D_n(t)$ is defined in period t to indicate the number of vehicles arriving at cell n, where $\bar{B}_n(t)$ represents the maximum possible number of vehicles arriving at cell n during period t, which is usually the upper bound of $D_n(t)$.
The calculation formula is shown in the following equation:

Vehicle Leaving Flow.
A random variable $U_n(t)$ is defined in period t to indicate the number of vehicles leaving cell n while accounting for all the possible values, as shown in the following equation, such that $0 \le x \le B_n(t)$, where $B_n(t)$ represents the maximum possible number of vehicles leaving cell n during period t, which is usually the upper bound of $U_n(t)$. The calculation formula is shown in the following equation:

Vehicle Composite Flow.
A random variable $R_n(t)$ is defined to consider both the arriving flow and the leaving flow; it represents the difference between the number of arriving vehicles and the number of leaving vehicles in period t in cell n (cf. equation (9)). A positive value of $R_n(t)$ indicates that the required number of edge resources is greater than the number of edge resources to be released.
In order to consider all the possible values, a joint probability distribution function of $R_n(t)$ is defined to represent the probability distribution of the difference between two independent streams. It is expressed in the following equation, where $0 \le x \le \bar{B}_n(t)$.
Within a short period of time in cell n, the arriving flow can be considered independent of the leaving flow. Therefore, equation (11) can be derived from equation (10).
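When the arriving and leaving flows are independent Poisson variables, the distribution of their difference can be obtained by summing the products of the two probability mass functions over all pairs with the same difference. The Python sketch below illustrates this under the assumption that both flows are truncated at a common bound; the function names and the numeric rates are illustrative and do not come from the article.

```python
import math

def poisson_pmf(k, rate):
    """P(X = k) for X ~ Poisson(rate)."""
    return math.exp(-rate) * rate ** k / math.factorial(k)

def net_flow_pmf(k, lam, mu, bound):
    """P(D - U = k) for independent D ~ Poisson(lam), U ~ Poisson(mu), both truncated at bound."""
    total = 0.0
    for u in range(bound + 1):
        d = u + k  # number of arrivals that yields a difference of k
        if 0 <= d <= bound:
            total += poisson_pmf(d, lam) * poisson_pmf(u, mu)
    return total

# Illustrative rates: 3 expected arrivals and 2 expected departures in the period.
bound = 40
pmf = {k: net_flow_pmf(k, 3.0, 2.0, bound) for k in range(-bound, bound + 1)}
mean_net_flow = sum(k * p for k, p in pmf.items())
```

With a sufficiently large bound the truncated probabilities sum to almost one, and the mean of the net flow approaches the difference of the two rates.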

Traffic Flow Prediction Model
Based on the prediction of the future trajectory of the vehicle, forecasting based on historical traffic flow can improve the success rate of vehicles requesting edge services. In this article, according to the traffic statistics of the vehicle trajectory data, the historical traffic flow of the vehicles exhibits regular patterns, such as the periodic similarity and correlation of the traffic flow. The experimental part of the traffic flow statistics is presented in Section 10.2. Although short-term traffic flow is nonlinear, uncertain, and random, which makes the traffic flow difficult to predict, its periodic similarity, correlation, and self-organization are convenient for forecasting the traffic volume.
[Figure 2 panels: probability distribution function of arrival time; probability distribution function of departure time]

Periodic Similarity of the Traffic Flow.
Due to the daily routines of travelers, the overall traffic flow distribution exhibits regular patterns. In addition, the traffic flow of some road sections is highly similar in two different periods.

Correlation of the Traffic Flow.
The correlation of the traffic flow in time is reflected by the fact that the traffic flow value of a cell is related not only to the traffic flow of the previous period, but also to that of the next period. The traffic flow values of a cell in several consecutive periods show a high similarity.

Self-Organization of the Traffic Flow.
All the vehicles follow the same principle in driving, and they all hope to reach the destination smoothly and quickly in order to maximize the group benefits. This phenomenon is referred to as self-organization.

Traffic Flow Statistics.
The traffic volume of each cell in period t is counted according to the cells in which a vehicle appears at the beginning of the three sub-periods $t_1$, $t_2$, and $t_3$ of t. If a vehicle appears in the same cell at the beginning of all three sub-periods, only one vehicle is added to that cell. Otherwise, one vehicle is added to each cell in which the vehicle appears. Using this method, the number of vehicles in each cell and in each period is obtained.
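The counting rule can be sketched in a few lines of Python: each vehicle contributes once to every distinct cell in which it is observed at the three sampling instants. The data layout (a mapping from a vehicle identifier to three cell indices) and the names used here are assumptions made for illustration.

```python
from collections import Counter

def count_traffic(samples):
    """samples maps a vehicle id to the cells observed at the start of t1, t2, and t3."""
    volume = Counter()
    for cells in samples.values():
        for cell in set(cells):  # a cell is counted once per vehicle, even if seen three times
            volume[cell] += 1
    return volume

samples = {
    "a": (1, 1, 1),  # stays in cell 1: counted once
    "b": (1, 2, 2),  # seen in cells 1 and 2
    "c": (3, 4, 5),  # seen in three different cells
}
volume = count_traffic(samples)
```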

Prediction of the Traffic Flow.
Based on the characteristics of the traffic volume, the traffic volume value in period t of cell n is related, to a certain extent, to the statistical average of the traffic volume over the corresponding consecutive periods of the last week in cell n, the statistical average over the corresponding consecutive periods of the previous day in cell n, and the statistical average over the most recent consecutive periods. Since a linear model is easy to build, fast, and stable, the prediction of the traffic volume is solved by the linear model $f_{t,n}(x) = w_n^T x_{t,n} + b_n$ (equation (12)), where the input is a vector $x_{t,n} = (x_{t,n,1}, x_{t,n,2}, x_{t,n,3})^T$, $w_n = (w_{n1}, w_{n2}, w_{n3})^T$ represents three weight values, $D_1(i, n)$ denotes the traffic volume at period i in cell n in the last week, $D_2(i, n)$ is the traffic volume at period i in cell n yesterday, $D_3(i, n)$ represents the traffic volume at period i in cell n today, $b_n$ is a constant, k represents the number of selected periods, and the output $f_{t,n}(x)$ denotes the estimated traffic volume in cell n during period t.
In order to instantiate the linear model of traffic flow prediction, the dataset is constructed from the traffic flow statistics of the vehicle GPS trajectory data used in this article. The optimal values of $w_n$ and $b_n$ are then obtained using the least squares method. Finally, the generated model is applied to forecast the traffic volume of cell n in the next period based on the existing traffic volume records. In order to simplify the calculation, the dataset is represented by a matrix $X_n$ of size $U \times 4$ and a vector $y_n$ of size $U \times 1$, as shown in equations (13a) and (13b), with the vector $\hat{w}_n = (w_n; b_n)$ absorbing $w_n$ and $b_n$. The first three columns of each row in $X_n$ are $x_{t,n,1}$, $x_{t,n,2}$, and $x_{t,n,3}$ in different periods in cell n, and $y_{i,n}$ in $y_n$ corresponds to the actual traffic flow at period i in cell n:
$$y_n = (y_{1,n}, y_{2,n}, \ldots, y_{T,n})^T \quad (13b)$$
The weight vector w and the optimal value b for each cell are obtained using equation (14) and then substituted into equation (12) in order to predict the traffic flow of the cell in the next period, so as to improve the load prediction accuracy of the edge device.
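A least squares fit of a three-feature linear model with a constant term can be sketched in plain Python by solving the normal equations. This is one standard way to compute a least squares solution; whether it matches equation (14) term by term is an assumption, and all names and sample values below are illustrative.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - known) / m[r][r]
    return x

def fit_linear(features, targets):
    """Return (w1, w2, w3, b) minimizing the squared error, via (X^T X) w = X^T y."""
    rows = [list(f) + [1.0] for f in features]  # constant column absorbs b
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, targets)) for i in range(k)]
    return solve(xtx, xty)

def predict(params, x):
    """Evaluate w . x + b, where the last entry of params is b."""
    return sum(w * v for w, v in zip(params, x)) + params[-1]

# Synthetic rows: (last-week average, previous-day average, recent average).
features = [(10, 12, 11), (20, 18, 25), (30, 33, 28), (15, 10, 20), (40, 35, 45), (25, 30, 22)]
targets = [0.5 * a + 0.3 * b + 0.2 * c + 4.0 for a, b, c in features]
params = fit_linear(features, targets)
```

Because the synthetic targets are generated from an exact linear rule, the fit recovers the generating weights; real traffic counts would leave a nonzero residual.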

Resource Allocation Model Based on QoS
Based on the traffic flow prediction model, the QoS model is proposed so that passing vehicles achieve a required success rate when requesting edge resources, according to actual application requirements. This section uses the fleet distribution vector $S(t)$, the fleet mobility matrix $H(t)$, the vehicle arrival rate $\lambda_n(t)$, and the vehicle departure rate $\mu_n(t)$ as input for cell n in period t, in order to predict the number of covered vehicles and to allocate the edge resources in each cell at the end of period t or the beginning of period t + 1.

Prediction of the Vehicle Traffic Flow.
Assuming that cell n contains $C_n(t - 1)$ available edge resources at the beginning of period t, the probability of a request being blocked when a vehicle enters cell n from other cells is expressed in the following equation, where $\phi$ represents a blocking event. Some of the vehicles are in cell n at the beginning of period t and stay in cell n during the entire period. They do not change the edge resources of the cell, because their edge resource requests have already been provisioned at the beginning of period t; these requests have been served by cell n and keep their edge resources at the beginning of period t + 1. In the worst case, if all the entering vehicles arrive before any vehicle leaves cell n in period t, configuring edge resources directly according to the result of equation (9) can easily increase the blocking rate of edge resource requests.

Algorithm Improvement Based on the QoS Model.
The estimated value of $C_n(t - 1)$ should have been determined at the beginning of period t, which has already passed. In addition, the accuracy of the vehicle transfer prediction for the next period t + 1 based on the vehicle trajectory records in period t is low. Therefore, the minimum number of edge resources of cell n at the end of period t or the beginning of period t + 1 is predicted based on the predicted transfers of vehicles in period t and the distribution of the fleet at the beginning of period t. The model based on QoS thus aims at computing the minimum number of edge resources of cell n at the end of period t, which leads to equation (16), where $\epsilon$ is the user-defined service blocking rate, which represents the estimated blocking rate of vehicles requesting edge resources.

Optimal Resource Allocation.
Because it is impractical to evaluate candidate values one by one when the data volume is large, this article uses a grouping and binary search algorithm (a two-stage algorithm) to calculate the minimum number of edge resources in period t and in cell n. The corresponding pseudocode is shown in Algorithm 1.
Using Algorithm 1, the optimal value of edge resources divided at the beginning of the next period of each cell is estimated. In addition, the optimal resources are allocated to each edge server according to the adjustment of load balance among different edge servers and the allocation of request redirection, in order to increase the success rate of the vehicles requesting edge resources.
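The binary search stage can be illustrated with a simplified blocking model. The Python sketch below assumes that the demand in a cell is a single Poisson variable and that a request is blocked when the demand exceeds the provisioned amount; this is a stand-in for the article's blocking probability, the grouping stage is omitted, and all names and values are illustrative.

```python
import math

def poisson_tail(c, rate):
    """P(X > c) for X ~ Poisson(rate): an assumed stand-in for the blocking probability."""
    cdf = sum(math.exp(-rate) * rate ** k / math.factorial(k) for k in range(c + 1))
    return max(0.0, 1.0 - cdf)

def min_resources(rate, epsilon, upper):
    """Smallest C in [0, upper] whose blocking probability is at most epsilon."""
    low, high = 0, upper
    while low < high:
        mid = (low + high) // 2
        if poisson_tail(mid, rate) <= epsilon:
            high = mid      # mid is feasible: look for a smaller value
        else:
            low = mid + 1   # mid is infeasible: the answer is larger
    return low

# Illustrative values: expected demand of 10 resources and a 5% blocking threshold.
c_opt = min_resources(rate=10.0, epsilon=0.05, upper=100)
```

Binary search is valid here because the blocking probability never increases as more resources are provisioned, so the feasible values form a contiguous upper range.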

Model Improvement Based on Traffic Flow Prediction.
The result $f_{t,n}$ is first obtained from the traffic flow prediction model in cell n and in period t using equation (12). It is then compared with the optimal value of edge resources $C_{opti}$ derived from Algorithm 1, in order to improve the prediction accuracy.

Request Allocation for the Edge System in IoV
Based on the nature of the dynamic connection between the vehicle and the edge node, this section proposes a request allocation model between the on-board terminals, edge servers and Cloudlet, so that the average processing time of each edge server in the cell is as equal as possible, in order to improve the performance of the edge system in IoV.

Model of Request Allocation for the Edge System in IoV.
Assuming that the total amount of messages received by cell n in period t is $\lambda = \lambda_e + \lambda_v + \lambda_c$, it is distributed to edge servers, on-board units (OBUs), and the Cloudlet. These three levels of equipment then process the requests sent by the vehicle and send the results back to the vehicle. The distribution table is presented in Table 1.
Due to the uncertainty of user demand and fluctuations in the number of requests received by each RSU in different periods, and based on the studies presented in [15,16], the number of arriving vehicles follows a Poisson process. Therefore, this article assumes that the request flow arriving at each RSU from vehicles follows a Poisson process with an arrival rate of $\lambda_i$. This article also assumes that N, R, and M represent the number of Cloudlets in the targeted area, the number of edge servers covered by each cell, and the expected number of vehicles covered by each cell at the beginning of period t + 1, respectively. The main terms of this section are presented in Table 2.
In addition, this article first analyzes the network performance in one cell as an example and then continuously adjusts the request allocation algorithm according to the experimental results. Afterwards, the algorithm is applied to all the other cells. The allocation is illustrated in Figure 3: the Cloudlet in cell n accepts the requests sent by vehicles that fail to be transmitted to an edge server or an on-board unit, which alleviates the urgent demand of vehicular requests.

The Service Queue Model in IoV.
In the edge system of IoV, the total response time is the sum $d_{up} + d_{wait} + d_{pro} + d_{down}$, where $d_{up}$ represents the uploading delay for a request from a vehicle to an RSU, $d_{wait}$ denotes the queueing delay for a request at the edge server wired to the RSU, $d_{pro}$ is the processing delay of the request at the edge server, and $d_{down}$ represents the feedback delay of returning the calculation result to the vehicle.
The edge system in IoV can be modeled as a queueing network. The idle probability in the queueing system with l homogeneous servers is expressed in equation (18), where $\lambda$ represents the request flow waiting for a server to process, and $\mu$ denotes the fixed service rate, so that the service intensity satisfies $\rho = \lambda/(l\mu) < 1$, with the average queueing delay expressed in equation (19).
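For a queue with Poisson arrivals, exponential service times, and l homogeneous servers, the idle probability and the average queueing delay have standard textbook closed forms. The Python sketch below implements those standard M/M/l results; it is assumed, not confirmed by the text, that equations (18) and (19) take this form, and the function names are illustrative.

```python
import math

def mml_idle_probability(lam, mu, servers):
    """Probability that an M/M/l system is empty, with offered load a = lam / mu."""
    a = lam / mu
    rho = a / servers
    if rho >= 1.0:
        raise ValueError("service intensity must be below 1 for a stable queue")
    head = sum(a ** k / math.factorial(k) for k in range(servers))
    tail = a ** servers / (math.factorial(servers) * (1.0 - rho))
    return 1.0 / (head + tail)

def mml_queueing_delay(lam, mu, servers):
    """Average time a request waits in the queue before service starts."""
    a = lam / mu
    rho = a / servers
    p0 = mml_idle_probability(lam, mu, servers)
    queue_length = p0 * a ** servers * rho / (math.factorial(servers) * (1.0 - rho) ** 2)
    return queue_length / lam  # Little's law: waiting time = queue length / arrival rate

# Illustrative case: 3 requests per second, 2 servers, each serving 2 requests per second.
p0 = mml_idle_probability(3.0, 2.0, 2)
wq = mml_queueing_delay(3.0, 2.0, 2)
```

With a single server the expressions reduce to the familiar M/M/1 values, which gives a quick sanity check on the implementation.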

Queueing Model at Cloudlet.
The Cloudlet in cell n can be modeled as an M/M/b queue, with b homogeneous servers and a fixed service rate $\mu_c$. Based on the previous assumption, the service intensity at the Cloudlet in cell n is $\rho_c = \lambda_c/(b\mu_c)$, where $\lambda_c$ represents the number of requests to the Cloudlet per unit interval, and the average computation delay per request is $E(d_{pro}) = 1/\mu_c$. A network delay $d_{v \to Cloudlet}$ is incurred when a unit request sent by a vehicle is transferred from the RSU to the Cloudlet, and the feedback delay is ignored. The average response delay $d^{c}_{stol}$ of a unit request at the Cloudlet can then be computed from these quantities.
This article assumes that each vehicle has an independent computational capacity with which to share the processing of requests. Given the requests arriving at $RSU_i$ with arrival rate $\lambda_i$, the request flow in the waiting queue of the moving on-board units covered by $RSU_i$ can be considered a subprocess of the request flow arriving at $RSU_i$, with an arrival rate $\lambda^{v}_{i} \le \lambda_v$. When a vehicle comes into the wireless communication range of $RSU_i$, it picks up the request at the front of the waiting queue and finishes processing it before leaving the communication range of $RSU_i$. Based on [10], the request flow in the waiting queue of the moving on-board units can be modeled as a Markov chain, and the model can be considered an M/M/1 queue. The service intensity of the queue is the ratio $\lambda^{v}_{i}/\mu^{v}_{i}$, where the total vehicle capacity is $\mu^{v}_{i} = \sum_{j=1}^{a} \mu_j$ with a vehicles covered by $RSU_i$. The average response delay $d^{v}_{stol}$ of a unit request on the on-board units is related only to $\lambda^{v}_{i}$ and $\mu^{v}_{i}$, apart from the delay $d_{v \to v}$ incurred when a unit request is transferred from one vehicle to another.
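For the on-board units, the M/M/1 view gives a compact delay estimate: the standard mean time in an M/M/1 system is the reciprocal of the gap between the service rate and the arrival rate. The Python sketch below adds the vehicle-to-vehicle transfer delay to that value; treating the result as the article's on-board response delay is an assumption, and the function and parameter names are illustrative.

```python
def onboard_response_delay(arrival_rate, vehicle_rates, transfer_delay):
    """Mean M/M/1 time in system plus an assumed fixed vehicle-to-vehicle transfer delay."""
    capacity = sum(vehicle_rates)  # total capacity: sum of the service rates of the covered vehicles
    if arrival_rate >= capacity:
        raise ValueError("arrival rate must be below the total vehicle capacity")
    return 1.0 / (capacity - arrival_rate) + transfer_delay

# Three vehicles with service rates 1.0, 1.5, and 0.5 requests per second.
delay = onboard_response_delay(2.0, [1.0, 1.5, 0.5], transfer_delay=0.1)
```

The guard on the arrival rate reflects the stability condition of the queue: once arrivals reach the total capacity, the mean delay is unbounded.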

Allocation of Requests Based on Average Delay Minimization.
Due to the uncertainty of the trajectory prediction method based on historical trajectories, the base stations to which the vehicle may be connected at the beginning of the next period form a cluster, in order to alleviate the error caused by the vehicle entering different cells with different probabilities in the next period. Assuming that a vehicle submits requests $\lambda_i$, $i \in M$, in period t, the average response delay is then calculated using equation (23). This indicates that the request $\lambda_i$ sent by the vehicle can be divided into three different requests. Therefore, the optimal edge resource allocation problem in IoV is defined as follows. Under the premise that the minimal amount of edge resources $C_{min}$ in cell n and $\sum_{i=1}^{RS} b_i$ edge servers are deployed, a certain volume of resources is allocated to each edge server so that the average response delay in cell n is minimized. The optimization formula is shown in equation (24), where $M = \sum_{i=1}^{N} H_{i,c}(t - 1)$ represents the expected number of vehicles covered by cell n in period t, and $MF = \sum_{i=1}^{N} HF_{i,c}(t - 1)$ denotes the expected total vehicular capacity covered by cell n in period t.
Since multiple variables in the optimal allocation problem are tightly coupled, it is complicated to solve the response time, request allocation, and redirection problems jointly. Therefore, in this article, two subprocesses, request allocation optimization and request redirection, are used to approximate the optimal allocation problem.

Load Balancing of the Edge Computing System in IoV
It can be deduced that equation (24) is a nonlinear optimization problem, which is difficult to solve promptly within period t. Therefore, the general idea of the load balancing of the edge computing system in IoV is as follows.
(1) According to the requests received in the next period t + 1 in cell n, the amount of requests allocated to on-board units and the amount of requests allocated to edge servers are calculated using the incremental cost method.
(2) The amount of redirected requests at each RSU in the next period t + 1 in cell n is computed, so that the average response delays of all the clusters in cell n are close to $d^{ave}_{c,t}$.
(3) According to its actual trajectory, the vehicle selects one edge server from the optimal edge server set, which is configured with the optimal number of edge resources, and offloads to it, or offloads to the Cloudlet if that offloading fails.
(4) The actual blocking rate of vehicles' requests for edge resources is calculated, and the minimum number of servers required in the Cloudlet is determined.

Minimization of the Average Response Delay.
Based on the previous assumption, cell n contains RS RSU clusters and the total capacity of the M vehicles expected to be covered in period t. Therefore, the total requests generated in cell n at the beginning of period t add up to $\lambda = \sum_{i=1}^{M} \lambda_i$. These requests are allocated through the incremental cost method with the following objective. According to equation (25), the condition that the requests allocated to the on-board units can be completed within a period is satisfied. In addition, the vehicle offloads the rest of the requests to the Cloudlet if the offloading to an edge server fails, using the incremental cost method expressed in the following equation. Here, w is the number of requests offloaded to the on-board units, and the remaining $\lambda - w$ is the number of requests offloaded to edge servers.
According to the calculation result, the allocation plan is stored in each RSU in cell n and serves as a reference for the edge resource requests that vehicles send along their actual trajectories. The remaining requests $\lambda - w$ are allocated to $RSU_i$, $i \in RS$, according to the historical GPS trajectories. In other words, the vehicle is most likely to submit a resource request to the optimal RSU.

Load Balance among Edge Servers.
The number of requests offloaded from vehicles to $RSU_i$, $i \in RS$, is unstable across periods due to the randomness of driving, so idle and busy states are inevitable. Therefore, it is assumed that the RSUs can access each other, and an RSU may redirect part of its requests to another RSU. The average response delay of each RSU in cell n is defined in the following equation. In order to obtain the number of requests redirected from overloaded clusters to underloaded clusters among $RSU_i$, $i \in RS$, the transmission delay caused by request redirection is ignored. The expected minimum average response delay D in cell n is first calculated, and the number of requests redirected from an overloaded RSU to an underloaded RSU is then determined, so that the average response time in cell n is close to D. Finally, the value of D is adjusted according to the difference between the total outgoing requests of the overloaded clusters and the total incoming requests of the underloaded clusters. The initial value of D is estimated by $D = (T_{max} + T_{min})/2$, where $T_{max} = \max\{t_{stol}(\lambda^{e}_{i}) \mid i \in RS\}$ and $T_{min} = \min\{t_{stol}(\lambda^{e}_{i}) \mid i \in RS\}$. All the RSU clusters are then partitioned into two disjoint sets: the set of overloaded clusters $V_o = \{i \mid t_{stol}(\lambda^{e}_{i}) > D\}$ and the set of underloaded clusters $V_u = \{i \mid t_{stol}(\lambda^{e}_{i}) < D\}$. The load balancing problem in IoV is then defined in the following equation. Here, $\phi_i$ is the number of requests redirected out of $RSU_i$, $i \in V_o$, $\phi_j$ represents the number of requests redirected into $RSU_j$, $j \in V_u$, and $\delta$ denotes a given threshold, which indicates the allowable difference between the average response delay after redirection and the value of D. The values of $\phi_i$, $i \in V_o$, and $\phi_j$, $j \in V_u$, are approximately calculated using the iterative algorithm.
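The iterative adjustment of D can be sketched with a bisection loop. The Python sketch below assumes, purely for illustration, that each RSU behaves as an M/M/1 server with its own arrival and service rate, and it stops when the requests shed by overloaded RSUs match the requests absorbed by underloaded ones; the tolerance is applied to that request surplus rather than to the delay, and all names and values are hypothetical.

```python
def mm1_delay(lam, mu):
    """Mean response time of one RSU under an assumed M/M/1 model."""
    return 1.0 / (mu - lam)

def shed_amounts(lams, mus, d):
    """Positive entries: requests an RSU must shed to reach delay d; negative: requests it can absorb."""
    return [lam - (mu - 1.0 / d) for lam, mu in zip(lams, mus)]

def balance(lams, mus, tol=1e-6, max_iter=200):
    times = [mm1_delay(lam, mu) for lam, mu in zip(lams, mus)]
    low, high = min(times), max(times)
    d = (low + high) / 2.0  # initial target D = (T_max + T_min) / 2
    for _ in range(max_iter):
        surplus = sum(shed_amounts(lams, mus, d))
        if abs(surplus) <= tol:
            break
        if surplus > 0:
            low = d   # more requests shed than absorbed: raise the target delay
        else:
            high = d  # spare absorbing capacity: lower the target delay
        d = (low + high) / 2.0
    return d, shed_amounts(lams, mus, d)

# Three RSUs with equal service rates and unequal loads.
d, phi = balance([8.0, 2.0, 5.0], [10.0, 10.0, 10.0])
```

In this example the first RSU is overloaded, the second is underloaded, and the third already sits at the balanced delay, so its redirected amount is close to zero.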

The Request Redirection between Edge Servers.
Based on the previous results of request redirection, the problem of determining the number of requests redirected from $RSUs_i$, $i \in V_o$, to $RSUs_j$, $j \in V_u$, is transformed into a minimum-cost maximum-flow problem in terms of network delay. Thus, the optimization objective is to minimize the total transmission delay incurred by the redirected requests, where $r(i, j)$ represents the number of requests redirected from $RSUs_i$, $i \in RS$, to $RSUs_j$, $j \in RS$, and $c_{ij}$ is the transmission delay between clusters, with $c_{ij} = c\,d_{ij}$. Here, $c$ is the proportional coefficient between transmission delay and distance, and $d_{ij}$ is the Euclidean distance between $RSUs_i$ and $RSUs_j$. This article uses the Vogel algorithm to calculate a suitable value of $r(i, j)$ within the limited period $t$, based on equation (29), and adjusts the volume of edge resources at each RSU according to the optimal value of edge resources $C_{opti}$ in cell $n$.
Under the precondition that the amount of deployed edge resources remains unchanged, the virtual amount of resources of each edge server can be adjusted according to the redirection result. In particular, when the maximum resource amount of an edge server cannot satisfy the requests issued by the surrounding vehicles, part of the requests can be redirected to another edge server. This increases the success rate with which passing vehicles offload to an RSU within the limited wireless communication range.
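As an illustration of the redirection step, the following Python sketch applies Vogel's approximation method to a small balanced transportation problem. It is a textbook version of the heuristic under stated assumptions, not necessarily the exact variant used in this article: the function name `vogel`, the supply and demand figures, and the cost matrix are hypothetical, `cost[i][j]` stands in for $c_{ij} = c\,d_{ij}$, and the total number of outgoing requests is assumed to equal the total number of incoming requests.

```python
def vogel(supply, demand, cost):
    """Vogel's approximation for a balanced transportation problem.

    supply[i]: requests leaving overloaded RSU i (phi_i, i in V_o)
    demand[j]: requests accepted by unloaded RSU j (phi_j, j in V_u)
    cost[i][j]: per-request transmission delay from i to j
    Returns r, where r[i][j] is the number of requests sent from i to j.
    """
    supply, demand = list(supply), list(demand)
    if sum(supply) != sum(demand):
        raise ValueError("this sketch assumes a balanced problem")
    m, n = len(supply), len(demand)
    r = [[0] * n for _ in range(m)]
    rows, cols = set(range(m)), set(range(n))

    def penalty(costs):
        # Gap between the two cheapest options; if only one option
        # remains, its own cost is used as the penalty.
        s = sorted(costs)
        return s[1] - s[0] if len(s) > 1 else s[0]

    while rows and cols:
        candidates = []
        for i in sorted(rows):
            candidates.append((penalty([cost[i][j] for j in cols]), "row", i))
        for j in sorted(cols):
            candidates.append((penalty([cost[i][j] for i in rows]), "col", j))
        # Serve the row or column with the largest penalty first.
        _, kind, k = max(candidates, key=lambda c: c[0])
        if kind == "row":
            i = k
            j = min(sorted(cols), key=lambda jj: cost[i][jj])
        else:
            j = k
            i = min(sorted(rows), key=lambda ii: cost[ii][j])
        q = min(supply[i], demand[j])
        r[i][j] += q
        supply[i] -= q
        demand[j] -= q
        if supply[i] == 0:
            rows.discard(i)
        if demand[j] == 0:
            cols.discard(j)
    return r


# Illustrative numbers only: two overloaded and three unloaded RSUs.
supply = [30, 20]
demand = [10, 25, 15]
cost = [[2, 3, 1],
        [5, 4, 8]]
r = vogel(supply, demand, cost)
total_delay = sum(cost[i][j] * r[i][j] for i in range(2) for j in range(3))
print(r, total_delay)
```

Vogel's method is a heuristic: it yields a feasible plan quickly, which suits a limited period $t$, but the plan is not guaranteed to be optimal in general.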

Dynamic Selection of Edge Nodes by Vehicles
For each edge server, the amount of available resources changes dynamically across periods $t$, as resources are continuously occupied and released while the vehicles move. For each connected vehicle, the selection of an edge server mainly considers the following factors: the amount of available resources of the edge server, which represents the maximum number of requests that can be offloaded to it; the distance between the vehicle and the edge server, which usually affects the transmission delay of a request; and whether the response delay of the vehicular requests processed by the edge server and by the on-board unit of the vehicle itself can be met within a limited time. The last factor is usually jointly determined by the number of requests sent by the vehicle and the available resources of the edge server.
Therefore, it is assumed that vehicle $v$ (with vehicular capacity $\mu_v$) decides to offload its requests (of amount $\lambda_q$) to edge server $e$ (with available resources $\mu_e$) at a given time point. The delay return function $R(v, e)$ represents the delay result when vehicle $v$ selects edge server $e$. The constraint condition indicates that, in the actual offloading process during driving, the response delay at the on-board unit is slightly higher than that at the selected edge server. The main idea of dynamic RSU selection for a vehicle is to first determine the list of selectable RSUs within a radius of $r_d$, sorted by distance in ascending order. In each iteration, the delay result $R(v, e_i)$ is then checked against the condition $R(v, e_i) < T_{threshold}$, where $T_{threshold}$ usually represents the length of period $t$. Finally, the vehicle confirms the offload to the selected RSU or to the Cloudlet. The pseudocode of Algorithm 2 for the dynamic selection of vehicle $v$ in period $t$ is also provided.
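The selection procedure described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function `select_edge_server`, the RSU identifiers and coordinates, and the `delay_fn` callback are hypothetical. In particular, `delay_fn` stands in for the delay return function $R(v, e)$, and the fixed delay table in the example is illustrative only.

```python
import math


def select_edge_server(vehicle_pos, rsus, r_d, delay_fn, t_threshold):
    """Pick the nearest RSU within r_d whose delay meets the threshold.

    rsus maps a (hypothetical) RSU id to its (x, y) position.
    delay_fn(rsu_id, distance) is a stand-in for R(v, e).
    Returns the chosen RSU id, or None to fall back to the Cloudlet.
    """
    in_range = []
    for rsu_id, pos in rsus.items():
        dist = math.dist(vehicle_pos, pos)
        if dist <= r_d:
            in_range.append((dist, rsu_id))
    # Try the candidates from nearest to farthest.
    for dist, rsu_id in sorted(in_range):
        if delay_fn(rsu_id, dist) < t_threshold:
            return rsu_id
    return None  # no RSU qualifies: offload to the Cloudlet


# Illustrative setup: e3 lies outside the radius, e1 is too slow.
rsus = {"e1": (0.0, 1.0), "e2": (0.0, 2.0), "e3": (0.0, 10.0)}
delay_table = {"e1": 0.9, "e2": 0.2, "e3": 0.1}


def table_delay(rsu_id, dist):
    return delay_table[rsu_id]


choice = select_edge_server((0.0, 0.0), rsus, 5.0, table_delay, 0.5)
fallback = select_edge_server((0.0, 0.0), rsus, 5.0, table_delay, 0.1)
print(choice, fallback)
```

Iterating over a finite sorted list guarantees termination even when no RSU satisfies the delay condition, in which case the request goes to the Cloudlet.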

Experimental Evaluation
This article uses vehicle historical trajectory data to run experiments on and analyze the proposed edge resource allocation and selection method, and verifies its efficiency based on the proposed vehicle trajectory prediction.

Visualization of Vehicle Historical Trajectory Data.
In this article, the trajectory data are extracted from the vehicle trajectory data in Chengdu in November 2016 provided by the Didi platform. Each trajectory record in the dataset is a four-tuple (id, Tstamp, longitude, latitude), and one GPS positioning record is collected every 3 s. Here, id represents the order ID and Tstamp denotes the timestamp. The records of order trajectories are first preprocessed, and the boundary is then determined. The final trajectory map is presented in Figure 4, which shows the trajectory of each vehicle from the beginning of the order to its end. Each line represents the trajectory of a vehicle, and the dots mark intervals of approximately 1 min.

By observing a part of the records, it is deduced that the vehicle trajectories have some common characteristics. For instance, a slow vehicle is more likely to change direction and turn around, while a fast vehicle is more likely to maintain a relatively straight driving state. Based on these trajectory characteristics, geometric methods are used to predict the future trajectory with reference to the historical trajectory of the vehicle. It can be seen from Figure 5 that the traffic flow presents the characteristics of cycle similarity, correlation, and self-organization. It can be deduced that the traffic flow of one period of the day is related to the traffic flow of the same period of the previous week and of the same period of the previous day. Therefore, to a certain extent, the future traffic flow can conveniently be predicted from the historical traffic flow data.

ALGORITHM 2: Dynamic selection for vehicles to edge servers.
Input: Distance threshold $r_d$
Output: Offloading decision $x_d$
(1) Initialize $x_d = 0$, $i = 1$
(2) Get the connectable RSU set $RSUs = \{e_1, e_2, \ldots, e_L\}$ according to $r_d$, sorted by distance in ascending order as $S_R = \{e_1, e_2, \ldots, e_L\}$, where $L$ is the size of $S_R$
(3) while $x_d = 0$ and $i \le L$ do
(4)   select $e_i \in S_R$
(5)   if $R(v, e_i) < T_{threshold}$ then
(6)     $x_d = i$, break
(7)   else $i = i + 1$
(8)   end if
(9) end while
(10) if $x_d = 0$ then
(11)   Offload to Cloud in cell $n$
(12) end if

Experimental Results Based on the Traffic Flow Prediction Model.
The traffic flow prediction method proposed in 5.3 is used to predict the traffic flow in the last three weeks of November, and the prediction is compared with the actual traffic flow statistics. In order to verify the accuracy of the developed traffic flow prediction model, the mean absolute percentage error (MAPE) (cf. equation (31)) and the root mean square error (RMSE) (cf. equation (32)) are used to measure the error of the prediction model. The obtained results are shown in Table 3.
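For reference, the two error measures can be computed as in the short Python sketch below. It assumes the standard definitions of MAPE and RMSE, which may differ in notation from equations (31) and (32); the function names and the sample values are hypothetical and are not taken from Table 3.

```python
import math


def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Assumes every actual value is non-zero.
    """
    n = len(actual)
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / n


def rmse(actual, predicted):
    """Root mean square error, in the same unit as the inputs."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)


# Illustrative traffic counts only.
actual = [100, 200, 400]
predicted = [110, 180, 400]
print(mape(actual, predicted), rmse(actual, predicted))
```

MAPE expresses the error relative to the actual flow, so it is comparable across periods with different traffic volumes, whereas RMSE penalizes large absolute deviations more heavily.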

Experimental Evaluation Method of Edge Server Load.
In order to verify the efficiency of the proposed edge resource allocation method (Method-3) based on vehicle historical trajectory data, two benchmark methods are used for comparison: the complete configuration method and the motion estimation method. The motion estimation method predicts, based on the vehicle's current motion state (speed, direction, and position), the cluster in which the vehicle is most likely to appear in the next period, and configures edge resources for that cluster in advance.

Experimental Process.
This experiment first preprocesses the historical trajectory data of vehicle orders from the Chengdu Didi platform in November 2016. The preprocessing includes normalizing the data values to simplify the input, deleting orders of too short a duration, and filling in missing intermediate values. The total number of periods in time and the total number of cells in space are then calculated. Afterwards, based on the vehicle density distribution obtained from the GPS trajectory data of the vehicles, RSU base stations are randomly set up at different locations in different cells. This article assumes that the network delay between the RSU and the vehicle is proportional to the physical distance, as in [7]. Due to the randomness of the distance between the RSU and the vehicle, a network delay is assigned according to the normal distribution: $0.05 \le N(0.05, 0.02) \le 0.25$. The two benchmark methods and the proposed method are used to allocate the optimal edge resources on each edge server, and the three methods are then tested using the dynamic selection algorithm. Similarly, the network delay among other nodes is assigned according to the normal distribution, except that the Euclidean distance between each pair of RSUs is used as the transmission delay reference in the solution of request redirection. Detailed information about the dataset and the parameter settings is provided in Table 4. The experimental environment is presented in Table 5.
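One way to draw delays that satisfy $0.05 \le N(0.05, 0.02) \le 0.25$ is rejection sampling, sketched below in Python. This is an assumption about how the bounded distribution could be realized, not a description of the authors' implementation: the function name `sample_network_delay` is hypothetical, and the second parameter 0.02 is interpreted here as the standard deviation.

```python
import random


def sample_network_delay(rng, mean=0.05, std=0.02, low=0.05, high=0.25):
    """Draw from N(mean, std) and retry until the value is in [low, high].

    std is treated as a standard deviation (an assumption).
    """
    while True:
        x = rng.gauss(mean, std)
        if low <= x <= high:
            return x


# A seeded generator keeps the illustrative run reproducible.
rng = random.Random(0)
delays = [sample_network_delay(rng) for _ in range(1000)]
print(min(delays), max(delays))
```

Because the lower bound coincides with the mean, roughly half of the raw draws are rejected, so the resulting delays follow a half-normal shape starting at 0.05.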
For each RSU cluster, the expected number of covered vehicles and the total expected vehicle capacity in the next period $t + 1$ are first calculated, based on the GPS historical trajectory data. The number of requests of each passing vehicle in the next period $t + 1$ is then randomly generated according to a normal distribution. Afterwards, the optimal value of edge resources in each RSU is predicted and adjusted using the request allocation algorithm and the load balancing algorithm. Finally, the actual success rate of request offloading and the blocking rate are calculated using the dynamic selection of surrounding RSUs, based on the actual vehicular trajectory.

Experimental Results.
The experimental comparison between the proposed method and the benchmark methods, with 1169 edge servers, is shown in Figure 6. The complete configuration method achieves a 100% success rate of vehicle requests for edge resources. The method based on motion estimation achieves 83.86%, and the proposed method with ϵ of 0.01, 0.05, and 0.1 achieves success rates of 90.97%, 88.85%, and 87.61%, respectively. It can be seen from Figure 6 that the complete configuration method reaches a rate of 100%; however, this rate is achieved at the cost of more edge resources.
In order to study the relationship between the actual success rate and ϵ, experiments on continuous values of ϵ for different numbers of edge servers are performed. The obtained results are shown in Figure 7. It can be seen that, as ϵ decreases, all the values slowly increase, and the actual success rate of edge resource requests remains above 90% when ϵ ≤ 0.02. In addition, increasing the number of edge servers alleviates the blocking of vehicle requests for edge resources to a certain extent, which makes the experimental results more even.
In order to further study the performance of the proposed edge resource allocation method, its stability is observed from the three perspectives of time, area, and number of vehicles.
In order to observe the experimental performance in different periods, vehicle trajectory data from the 17th (with 1169 edge servers) are considered, and the actual request success rate in different periods is observed. The experimental results are shown in Figure 8, where the horizontal line represents the average for the whole day, which is 90.97%. It can be seen that the actual request success rate is low during the period 17:00-19:00 and high during the period 23:00-03:00 (+1). This is because the larger number of vehicles and the complex traffic conditions during the evening peak hours lower the efficiency of actual resource allocation. Moreover, the proposed method performs better than the reference method of motion estimation in terms of edge resource allocation. In addition, the actual edge resource request success rate gradually increases as ϵ decreases.
In order to observe the impact of the number of vehicles on the utilization performance of edge resources, subsets with different numbers of vehicles are randomly selected from the trajectory dataset of the 17th. The edge resources are allocated to the different subsets under different values of ϵ with 1169 servers. The obtained results are shown in Figure 9.
[Figure: Relationship between actual success rate and ϵ, with curves for ϵ = 0.01 to 0.1]

Figure 9 demonstrates that, as the user's predefined blocking rate increases, the actual success rate for the different numbers of vehicles gradually decreases. When ϵ = 0.01, the experimental results for the different numbers of vehicles are the most similar. In addition, as ϵ increases, the difference between the actual edge resource request success rates for different numbers of vehicles gradually increases, and the experimental results become more unstable. This is mainly due to the high mobility of driving behavior, which tends to leave edge resources idle and to block actual resource requests. As the value of ϵ decreases, the deviation caused by high mobility is better alleviated.
In order to evaluate the performance of the proposed edge configuration method in different cells, 5 cells are randomly selected on the 17th, and the relation between the actual request success rate and ϵ is assessed. The experimental results are shown in Figure 10.
It can be seen from Figure 10 that the actual request success rates calculated for different cells are similar. In addition, they increase as the user's predefined blocking rate decreases. The remaining differences are mainly due to the difference in traffic flow between cells. For cells with less traffic flow, such as cell 33, the actual resource request success rate varies greatly. In contrast, the traffic flow of cell 2 is relatively large, so that its result is relatively stable and less affected by changes in the user's predefined blocking rate.
In order to study the influence of the number of edge servers on the utilization performance of edge resources, the proposed algorithm is implemented for different numbers of edge servers. The obtained results are shown in Figure 11. Figure 11 shows that the success rates of edge resource requests calculated for different numbers of edge servers are similar. In addition, they increase with the number of edge servers. The proposed method is better than the motion estimation method in terms of the actual success rate of edge resource requests. This is because, as the number of edge servers increases, the deployment becomes relatively balanced, which alleviates, to a certain extent, the prediction error caused by the high mobility of driving.

Based on the experimental results obtained from the three perspectives of the user's predefined blocking rate, the number of vehicles, and the region, the proposed method makes it possible to obtain the minimum number of cloud servers in different cells and to relieve the urgent demand of requests that fail to be offloaded to edge devices.
In order to study the relationship between the number of servers in Cloudlet and the different edge resource allocation methods, experiments on continuous values of ϵ under different numbers of edge servers are performed. The obtained results are shown in Figure 12. It can be seen that, as the value of ϵ decreases, the number of cloud servers slowly increases, and the average value becomes close to 11. The value obtained with the motion estimation method is almost 12.
In order to evaluate the influence of the number of vehicles on the minimum number of servers in Cloudlet, subsets with different numbers of vehicles are randomly selected from the trajectory dataset of the 17th, and edge resource allocation is performed on the different subsets under different values of ϵ and under the motion estimation method, with 1169 servers. The obtained results are shown in Figure 13. Figure 13 shows that, as the number of vehicles increases, the minimum number of servers that need to be deployed in Cloudlet gradually increases, and the gap between the different values of ϵ gradually widens. This is mainly because, for a given value of ϵ, the more vehicles there are, the more of their requests are blocked, and the more cloud servers are required.

The main conclusions are summarized based on the experimental results. (1) The motion estimation method seemingly fails to be applicable in practice, whereas the method proposed in this article not only reduces the redundant cost of resource allocation but also approaches the result of the complete configuration method.

Conclusion
In this article, starting from the demand of connected vehicles for edge resources and the resource allocation benefits of the entire edge system, an edge resource allocation and selection method based on vehicle trajectory prediction is proposed. The proposed algorithm fully considers the high mobility of vehicles and the characteristics of historical traffic flow. It allocates the minimum amount of edge resources and deploys the minimum number of servers in Cloudlet on the premise that the user's predefined blocking rate is not exceeded. The experimental results demonstrate that the proposed algorithm achieves high edge resource utilization and high prediction accuracy. Future work should include upgrading the mobility estimation strategy and the request allocation method among the on-board unit, the edge server, and the cloud. The mobility estimation strategy would be upgraded by replacing the single cell in which a connected vehicle is most likely to appear with k (k > 2) cells and weighting these cells. Another possible future goal is to improve the success rate of edge resource requests under the proposed method through the dynamic allocation of edge server clusters, according to the driving conditions of connected vehicles and the load conditions of edge servers.

Data Availability
Gaia Data Open is only open to universities and scientific research institutions in China. The dataset is for scientific research use only, and its dissemination to or use by others is strictly prohibited. The data have been desensitized. Please contact gaia@didiglobal.com using your school or research institution email for further information.

Conflicts of Interest
The authors declare that they have no conflicts of interest.