Distributed Cooperative Driving Strategy for Connected Automated Vehicles at Unsignalized Intersections Based on Monte Carlo Method



Introduction
Connected automated vehicles (CAVs) are vital components of the new generation of transportation systems [1][2][3], and CAV-based traffic control is an effective way to improve safety, efficiency, and energy consumption [4]. With the help of V2X technology, CAVs can share their real-time operational data and communicate with roadside infrastructure to better coordinate their overall movement in the intelligent connected environment [5,6].
The optimization of trajectories for CAVs has been recognized as a practical approach to enhance the overall efficiency of urban traffic systems [7]. Over the past decade, extensive research has been conducted on trajectory control for CAVs. A variety of control strategies have been developed to optimize the trajectories of CAVs, including adaptive cruise control (ACC) [8], cooperative adaptive cruise control (CACC) [9,10], model predictive control (MPC) [11][12][13], and deep reinforcement learning (DRL) control [14][15][16][17].
Intersections are the main bottlenecks for urban traffic [18], and congestion there causes great socioeconomic losses and increases travel delays significantly [19,20]. As an indispensable part of traffic control, intersection management will change from traditional traffic-light control to unsignalized autonomous intersection management (AIM) for better coordination [21][22][23]. The main task of AIM is to control CAVs cooperatively to pass through the conflict areas of intersections safely and efficiently [24]. In recent years, researchers have found that the most critical factor of cooperative driving at unsignalized intersections is the passing order of CAVs [25,26], and there are two main types of cooperative driving strategies to determine the passing order: reservation-based and planning-based [24].
Reservation-based strategies use heuristic rules to allocate the right-of-way for CAVs in a short period [27]. Dresner and Stone proposed an AIM strategy that allocates space resources to vehicles on a first-come, first-served (FCFS) basis [28,29]. Choi et al. extended reservation-based cooperative control to multilane intersections [30]. Malikopoulos et al. proposed an optimal decentralized energy control framework for CAVs [31]. Zhang and Cassandras extended the framework further to include all possible turns and considered a joint energy-time optimal solution [32]. However, reservation-based strategies mainly follow the FCFS approach, and their performance is unsatisfactory in many cases [24].
Planning-based strategies aim to find a globally optimal solution for CAVs by enumerating all possible passing orders [27], and most scholars formulate the problem as a mixed-integer linear programming optimization problem for minimizing the total traffic delay of the intersection [33,34]. Li and Wang proposed the tree search method, the equivalent goal of which is to find the leaf node (passing order) corresponding to the optimal solution [25]. Xu et al. proposed a Monte Carlo tree search (MCTS)-based strategy to find a well-performing passing order within a limited planning time [35]. Zhang and Cassandras designed a dynamic resequencing scheme to optimize the passing order [36]. Apart from these methods, graph theory has been employed to determine the optimal passing order for multiple CAVs [37,38]. However, a significant portion of the existing literature concentrates primarily on the feasibility of a conflict-free passing order solution or on deriving an optimal passing order through specific methods, often overlooking the aspect of computational complexity. With more vehicles, fewer passing orders are explored within the planning time, which brings difficulties for practical applications [39][40][41].
To address the above problem, we propose a new distributed cooperative driving strategy to maintain a good balance between performance and computation. The key idea is to utilize the constrained planning time to investigate nodes that have the potential to yield the optimal solution. To this end, the MCTS algorithm incorporating heuristic rules is used to accelerate the search process, and the root parallelization of MCTS combined with the majority voting rule is applied to implement the distributed cooperation and explore more leaf nodes. We also present a task-area partition framework for task decomposition matched with the strategy.
Note that Xu et al. [35] pioneered a centralized MCTS-based cooperative driving strategy at unsignalized intersections in which a roadside controller gathers information from all incoming CAVs to calculate a well-performing passing order. Inspired by that work, this paper aims to elucidate further how to explore and evaluate more passing orders in a distributed way, thereby augmenting the solution's effectiveness. The main contribution is introducing a distributed cooperative strategy, which integrates root parallelization MCTS and the majority voting rule, into the proposed driving task-area partition framework to determine a nearly global-optimal passing order within the constraints of limited planning time.
The paper is organized as follows: Section 2 details the problem, Section 3 presents the new strategy, Section 4 introduces the details of simulation implementations, and Section 5 validates the effectiveness of the proposed strategy via the results of simulation experiments. Finally, Section 6 concludes the paper.

Problem Statement
The unsignalized six-lane intersection shown in Figure 1 involves three key concepts: (i) intersection physical area (IPA), (ii) control area (CA), and (iii) communication range (CR). The area within the circle of radius R_2 is the IPA in which collisions might happen, CR is within the circle of radius R_1, and CA is within the CR but outside the IPA. The communication range is a logical concept wherein vehicles function as independent agents that communicate with each other and control themselves in real time. The intersection can be occupied simultaneously by multiple vehicles for better travel efficiency.
To simplify the problem, we make the following assumptions:
(i) All vehicles are CAVs equipped with V2X communication devices and can share their real-time operational data (position, velocity, etc.).
(ii) Lane changing is prohibited after entering the control area to ensure vehicle safety.
(iii) There are no communication time delays or packet losses.
(iv) Vehicles move at a constant speed when passing through the intersection physical area.
Because all vehicles have satisfactory lane-keeping ability, we focus only on longitudinal vehicle control. As shown in Figure 2, the innermost (leftmost) lane in the entrance direction allows vehicles to turn left or go straight, the middle lane is for going straight only, and the outermost (rightmost) lane allows vehicles to turn right or go straight.
The complexity of the unsignalized intersection control stems from the conflicting natures of various traffic movements, which typically have three conflicting modes: crossing, converging, and diverging. The three modes delineate the conflict relationships among traffic movements at intersection areas, exit lanes, and entrance lanes to prevent conflicting vehicles from passing through intersections at the same time. Figure 2 also shows the spatial distribution of the conflict points. According to the geometry of the intersection, the conflict points are divided into 64 crossing ones, eight converging ones, and eight diverging ones.
After entering CA, each CAV is treated as an independent agent. Each CAV calculates a passing order from the collected driving data, and all agents then decide the final uniform passing order from these calculated passing orders. Then, agents perform corresponding trajectory planning and adjustments to avoid collisions in IPA. The distributed cooperative strategy aims to minimize traffic delays while making the slightest acceleration adjustments at the unsignalized intersection. So, we have the following evaluation function:

Journal of Advanced Transportation
where i is the ith CAV entering CA, S is the set of all CAVs in CA currently, t_min,i represents the travel time spent by CAV i from entering CA to passing through IPA at the maximum speed, t_actual,i denotes the actual travel time of CAV i, D is the intersection's total traffic delay, a_i represents the number of acceleration adjustments required for CAV i to pass through the intersection safely (calculated for each 1 m/s^2 change in acceleration), A is the total number of required acceleration adjustments for all CAVs, and α_1 and α_2 are the weighting coefficients.
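As a minimal sketch of the evaluation function, the symbol definitions above suggest a weighted sum α_1·D + α_2·A with D = Σ (t_actual,i − t_min,i) and A = Σ a_i over the CAVs in S. This form is inferred from the definitions rather than copied from the paper's equations, and the names `CavRecord` and `objective` are hypothetical. The default weights are the values α_1 = 0.7 and α_2 = 0.3 reported later in the simulation section.

```python
from dataclasses import dataclass


@dataclass
class CavRecord:
    t_min: float      # travel time at maximum speed, in seconds
    t_actual: float   # actual travel time, in seconds
    adjustments: int  # a_i: number of 1 m/s^2 acceleration adjustments


def objective(cavs, alpha1=0.7, alpha2=0.3):
    """Return (weighted objective, total delay D, total adjustments A).

    ASSUMPTION: the objective is alpha1 * D + alpha2 * A.
    """
    total_delay = sum(c.t_actual - c.t_min for c in cavs)
    total_adjust = sum(c.adjustments for c in cavs)
    return alpha1 * total_delay + alpha2 * total_adjust, total_delay, total_adjust


# Two CAVs: the first is delayed by 3 s and needs two adjustments.
cavs = [CavRecord(20.0, 23.0, 2), CavRecord(18.0, 18.0, 0)]
value, D, A = objective(cavs)
```

A lower value indicates a better passing order under this reading, which matches the later tie-breaking rule that prefers the lower objective value.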
Recognizing that the efficiency of traffic flow at intersections is predominantly influenced by the passing order of the CAVs, we employ the schedule tree theory and frame the entire issue as a tree search problem, wherein each leaf node signifies a distinct passing order [24,25]. We take the simple intersection scenario shown in Figure 3 as an example. The passing order ABCD indicates the priorities of the four CAVs. If two CAVs have the same spatiotemporal conflict point, such as CAV b and CAV c, then the one ranking lower (CAV c) in the passing order adjusts its trajectory by slowing down to reach the conflict point later than expected to avoid collision. However, two CAVs without conflict, such as CAV a and CAV b, can pass through the intersection simultaneously. To calculate all acceleration adjustments required to avoid potential collisions, it is necessary to ensure that the lower-priority CAV takes account of any adjustments made by the higher-priority CAV.
Figure 4 shows a schematic of building the schedule tree for the scenario shown in Figure 3. All possible passing orders are generated as leaf nodes in the bottom layer of the schedule tree.
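The structure of the schedule tree can be illustrated with a short sketch, assuming for simplicity that every permutation of the CAVs is an admissible leaf (no pruning by lane precedence). The function names `leaf_nodes` and `children` are hypothetical.

```python
from itertools import permutations


def leaf_nodes(cavs):
    """Every complete passing order (leaf of the schedule tree) as a string."""
    return ["".join(p) for p in permutations(cavs)]


def children(partial, cavs):
    """Child nodes of a partial passing order: append one uncovered CAV."""
    return [partial + c for c in cavs if c not in partial]


orders = leaf_nodes("ABCD")          # four CAVs give 4! leaves
next_nodes = children("AB", "ABCD")  # expand the partial order "AB"
```

Because the number of leaves grows factorially with the number of CAVs, exhaustive enumeration quickly becomes impractical, which motivates the guided search described in the Methodology section.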
If a passing order is given, then the total traffic delay and the required collision-avoidance acceleration adjustments for all CAVs in the passing order can be derived directly from Algorithm 1.
In the Requirement function, CAV i performs the conflict analysis judgment in turn with the CAVs that have higher priority in the passing order, based on Figure 2. If there is a conflicting trajectory between the two CAVs and the expected time interval to reach the conflict point is within the given threshold [35], CAV i makes the collision-avoidance acceleration adjustments, and the Requirement function returns the Boolean value true. The CAVs needing acceleration adjustments update their trajectories after calculating the required adjustments to ensure safe trajectory adjustments for subsequent CAVs in the passing order. The time complexity of Algorithm 1 is O(n).
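The conflict check described above can be sketched in a simplified, timing-only form. This is not Algorithm 1 itself: it assumes each CAV is described only by its desired arrival times at conflict points, uses the added delay as a stand-in for acceleration adjustments, and takes a hypothetical 2 s threshold. The function name `plan_delays` and the data layout are illustrative assumptions.

```python
def plan_delays(order, arrivals, threshold=2.0):
    """Delay (s) each CAV needs so that no two CAVs reach a shared
    conflict point less than `threshold` seconds apart.

    order:    passing order, highest priority first
    arrivals: {cav: {conflict_point: desired arrival time in s}}
    ASSUMPTION: a CAV is adjusted by shifting all of its arrival times.
    """
    planned = {}  # adjusted arrival times of the CAVs processed so far
    delays = {}
    for cav in order:
        times = dict(arrivals[cav])
        delay = 0.0
        conflict = True
        while conflict:  # re-check against everyone after each shift
            conflict = False
            for other in planned.values():
                for point, t in times.items():
                    gap = abs(t - other[point]) if point in other else None
                    if gap is not None and gap < threshold - 1e-9:
                        # Yield: arrive `threshold` seconds after the other CAV.
                        shift = other[point] + threshold - t
                        times = {p: v + shift for p, v in times.items()}
                        delay += shift
                        conflict = True
                        break
                if conflict:
                    break
        planned[cav] = times
        delays[cav] = delay
    return delays


arrivals = {
    "A": {"p1": 10.0},
    "B": {"p1": 11.0, "p2": 12.0},
    "C": {"p2": 13.5},
}
delays_abc = plan_delays("ABC", arrivals)
delays_cba = plan_delays("CBA", arrivals)
```

In this toy case the order ABC costs 2.5 s of total delay and the order CBA costs 3.5 s, which illustrates why the passing order matters and why a lower-priority CAV must account for adjustments already made by higher-priority ones.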

Methodology
This section proposes a task-area partition framework for cooperative driving task decomposition. Moreover, it presents a root parallelization MCTS method with the majority voting rule, implementing the distributed cooperation while exploring and evaluating more passing orders to accomplish the driving tasks.
To clearly articulate the differences between the proposed distributed MCTS-based cooperative driving strategy (D-MCTS) and the existing classical centralized MCTS-based cooperative driving strategy (C-MCTS), Figure 5 demonstrates the methodological framework of the two MCTS-based cooperative driving strategies.

Algorithm 1
Input: A passing order P
Output: The total acceleration adjustments A of the covered CAVs and their required acceleration adjustments a_i, respectively
Initialize a_i as 0
for each i ∈ [1, length(P)] do
    t_i = actual_time(i) − min_time(i) [35]
    adjustment_required = Requirement(i)
    (The Requirement function determines whether CAV i needs to make the acceleration adjustment)
    while adjustment_required do
        a_i = acc_calculate(i)
        for each j ∈ [i, length(P)] do
            if lane_i == lane_j then
                a_j += a_i
            end if
        end for
        adjustment_required = Requirement(i)
    end while
end for

Task-Area Partition Framework.
The proposed task-area partition framework decomposes the mission of cooperative driving into three main tasks: (i) vehicle information sharing, (ii) passing order optimization, and (iii) trajectory control. In Figure 1, the intersection functional area (communication range) is partitioned accordingly into four areas: observation area (OBA), optimization area (OPA), execution area (EXA), and intersection physical area (IPA), the ranges of which are d_oba, d_opa, d_exa, and R_2, respectively. In each area, CAVs are assigned the following different tasks:
(1) First, in OBA, approaching CAVs share their real-time operational data (position coordinates, speed, acceleration, current lane, target lane, etc.) based on the V2X information interaction technology.
(2) Then, in OPA, root parallelization is applied to implement the distributed cooperation (i.e., each CAV calculates a nearly global-optimal passing order based on the MCTS algorithm with heuristic rules, and then, all CAVs apply the majority voting rule to determine the final uniform passing order) to specify the following driving behaviors of all CAVs.
(3) Next, in EXA, each CAV carries out the corresponding trajectory planning and adjustments in real time to meet the desired driving trajectory determined in task 2 and keeps intervehicle safety gaps to arrive at IPA on time.
(4) Finally, in IPA, the driving behaviors of CAVs are locked, and no further trajectory adjustments are made. CAVs pass through the intersection (then become departing vehicles) and leave the intersection area safely.
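The partition into task areas can be sketched as a lookup on a CAV's distance to the intersection centre. The sketch assumes the areas form concentric rings (IPA innermost, then EXA, OPA, and OBA) whose widths add up to the communication range; the numeric defaults are hypothetical placeholders, not the values of Table 1, and `task_area` is an illustrative name.

```python
def task_area(distance, r_ipa=20.0, d_exa=60.0, d_opa=60.0, d_oba=60.0):
    """Map a CAV's distance to the intersection centre (m) to its task area.

    ASSUMPTION: concentric rings ordered IPA, EXA, OPA, OBA from the centre
    outwards; the default widths are placeholders summing to 200 m.
    """
    if distance < 0:
        raise ValueError("distance must be non-negative")
    if distance <= r_ipa:
        return "IPA"   # behaviour locked, no further adjustments
    if distance <= r_ipa + d_exa:
        return "EXA"   # trajectory control
    if distance <= r_ipa + d_exa + d_opa:
        return "OPA"   # passing order optimization
    if distance <= r_ipa + d_exa + d_opa + d_oba:
        return "OBA"   # information sharing
    return "outside CR"


areas = [task_area(d) for d in (190.0, 100.0, 50.0, 10.0)]
```

An approaching CAV therefore moves through the tasks in the order sharing, optimization, control, and locked passage as its distance decreases.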
This framework provides a new solution for designing multivehicle cooperative driving strategies by assigning sensing, decision, and control tasks to different task areas.

MCTS-Based Cooperative Driving Strategy.
Herein, we apply MCTS combined with heuristic rules to select leaf nodes that have the potential to be the corresponding optimal passing order [35]. In MCTS, each node in the search tree is assigned a score based on the value of equation (3) for its corresponding passing order to evaluate its potential, and the MCTS algorithm uses these scores to determine which branch of the tree should be explored.
MCTS establishes a search tree iteratively. Taking the scenario in Figure 3 as an example, Figure 6 shows one iteration of the MCTS-based strategy, which includes four steps: selection, expansion, simulation, and backpropagation [42].

Selection.
We start at the root node and pick the highest-scoring child node recursively until reaching either the most urgent expandable node or a terminal state. The score for traversing the tree in MCTS is defined by the tree policy of equation (4) [43], in which G_k is the score of child node k, C is a weighting parameter, t is the number of times the currently searched node was visited, and t_k is the number of times that child node k was visited. An expandable node refers to one that is not a leaf node but has unvisited child nodes. Equation (4) is an attempt to balance exploration and exploitation.
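A tree policy of this kind is commonly written in the UCB1 form G_k + C·sqrt(ln t / t_k). The sketch below assumes that form; the exact constants in the paper's equation (4) may differ, and `uct_value` and `select_child` are hypothetical names. The default C = 0.25 is the value chosen later in the simulation section.

```python
import math


def uct_value(score, parent_visits, child_visits, c=0.25):
    """UCB1-style tree policy: exploitation term plus exploration bonus."""
    if child_visits == 0:
        return math.inf  # unvisited children are tried first
    return score + c * math.sqrt(math.log(parent_visits) / child_visits)


def select_child(children, parent_visits, c=0.25):
    """children: list of (name, score G_k, visit count t_k) tuples."""
    best = max(children, key=lambda ch: uct_value(ch[1], parent_visits, ch[2], c))
    return best[0]


children = [("A", 0.60, 10), ("B", 0.55, 2)]
explore = select_child(children, parent_visits=12, c=0.25)
exploit = select_child(children, parent_visits=12, c=0.0)
```

With C = 0 the policy always follows the best current score, while a positive C gives rarely visited children a bonus, which is the exploration and exploitation balance the text refers to.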

Expansion.
We randomly select an unvisited child node of the selected most urgent expandable node as the new node to add to the tree unless the selected node is at a terminal state.

Simulation.
The simulation policy is used to directly obtain a leaf node (complete passing order) based on the current new searched node (partial passing order) to evaluate its potential. Because a complete passing order generated by random sampling cannot help us to effectively evaluate the true potential of the current new searched node during the simulation, we add the following two heuristic rules to the simulation process for deciding which node (CAV) should be expanded:
(i) For CAVs in the same lane, add the current leading CAV first.
(ii) For CAVs passing through the same conflict point, add the one with the earlier desired arrival time first.
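The two heuristic rules can be sketched as a rollout that completes a partial passing order. For brevity the sketch assumes a single desired arrival time per CAV instead of one per conflict point; `heuristic_rollout` and its data layout are hypothetical.

```python
import random


def heuristic_rollout(partial, lanes, arrival, rng=random):
    """Complete a partial passing order using the two heuristic rules.

    lanes:   {lane: [cav, ...]} ordered front to back
    arrival: {cav: desired arrival time in s}  (ASSUMPTION: one time per CAV)
    """
    order = list(partial)
    covered = set(order)
    while True:
        # Rule (i): only the leading uncovered CAV of each lane is a candidate.
        candidates = []
        for queue in lanes.values():
            for cav in queue:
                if cav not in covered:
                    candidates.append(cav)
                    break
        if not candidates:
            return order
        # Rule (ii): earliest desired arrival first; ties are broken at random.
        earliest = min(arrival[c] for c in candidates)
        best = [c for c in candidates if arrival[c] == earliest]
        choice = rng.choice(best)
        order.append(choice)
        covered.add(choice)


lanes = {"north": ["A", "C"], "east": ["B", "D"]}
arrival = {"A": 10.0, "B": 11.0, "C": 10.5, "D": 14.0}
full_order = heuristic_rollout([], lanes, arrival)
from_b = heuristic_rollout(["B"], lanes, arrival)
```

Compared with uniform random sampling, such a rollout tends to return a reasonable complete order for every partial order, so the score of a node reflects its potential rather than sampling noise.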
We update the score of the current new searched node after simulation via the following four steps:
(a) Calculate the weighted summation of total delay and acceleration adjustments, j_k, of the partial passing order corresponding to the current new searched node.
(b) Calculate the weighted summation of total delay and acceleration adjustments, ĵ_k, of the complete passing order corresponding to the best leaf node of the current new searched node via simulation.
(c) Normalize j_k and ĵ_k into [0, 1] using j_k,max and j_k,min, the maximum and minimum weighted summation of total delay and acceleration adjustments among the sibling nodes of node k, respectively.
(d) Calculate the score G_k of the current new node from the two normalized values, where ω is a weighting parameter.
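Steps (c) and (d) can be sketched as follows. Both formulas are assumptions made for illustration: min-max scaling for the normalization, and a score that rewards low cost, with ω weighting the complete-order term. The paper's own equations may combine the terms differently; `min_max` and `node_score` are hypothetical names, and ω = 0.8 is the value chosen later in the simulation section.

```python
def min_max(value, lowest, highest):
    """Scale value into [0, 1]; a degenerate range maps to 0."""
    if highest == lowest:
        return 0.0
    return (value - lowest) / (highest - lowest)


def node_score(j_partial, j_complete, sibling_partial, sibling_complete, omega=0.8):
    """ASSUMED score in [0, 1]: the lower the weighted cost, the higher the score.

    sibling_partial / sibling_complete: costs of all sibling nodes,
    including the node being scored.
    """
    jp = min_max(j_partial, min(sibling_partial), max(sibling_partial))
    jc = min_max(j_complete, min(sibling_complete), max(sibling_complete))
    return omega * (1.0 - jc) + (1.0 - omega) * (1.0 - jp)


partial_costs = [2.0, 4.0, 6.0]
complete_costs = [10.0, 20.0, 30.0]
best = node_score(2.0, 10.0, partial_costs, complete_costs)
worst = node_score(6.0, 30.0, partial_costs, complete_costs)
mixed = node_score(4.0, 10.0, partial_costs, complete_costs)
```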

Backpropagation.
The result of the simulation is backpropagated through the selected current new searched nodes to the root node for updating the scores of all parent nodes. During the establishment of the search tree, the current optimal passing order is updated dynamically and continuously. Once the computation budget is reached, the MCTS terminates and returns the current optimal passing order. The planned total traffic delay and required acceleration adjustments of the CAVs are determined using Algorithm 1, and the simulation process of MCTS is given by Algorithm 2.

Distributed Cooperative Driving.
Distributed cooperative driving at an unsignalized intersection can be achieved by running simulations simultaneously via multiple agents in parallel (MCTS parallelization), which allows the whole multiagent system to run more MCTS simulations, i.e., explore as many leaf nodes (passing orders) as possible to evaluate their potential within a limited computation budget [43]. There are three main types of parallelization methods for MCTS: leaf, root, and tree [44].

Leaf parallelization is applied to improve simulation results and can be implemented in a distributed strategy. This method requires one agent to establish the search tree, while the other agents participate in the parallel simulations, which brings the problem of choosing a CAV to build the search tree. Root parallelization and tree parallelization provide a multiagent method in which each CAV can contribute to the overall strategy as an independent agent. In root parallelization, multiple independent trees are built by separate CAVs with no information communicated before unifying the results [45]. Tree parallelization brings the problems of maintaining a tree among all CAVs and selecting one CAV as the tree holder.
We decided to use root parallelization, following the example of Kurzer et al., who used root parallelization MCTS successfully in cooperative multiagent system trajectory planning for automated vehicles [46]. Each CAV acts as an independent agent in root parallelization to calculate its own MCTS solution.
Soejima et al. explored root parallelization in the computerized Go field; they compared the strategy based on the majority voting rule versus the average voting rule and found that the former was superior [47]. Thus, we use the majority voting rule to unify the solutions calculated by all CAVs. Once a CAV determines a passing order, it votes for that passing order and shares the voting result with all other CAVs. In the voting rule, x is the operational data of all CAVs, h_j(x) represents the current optimal passing order calculated by CAV j based on the proposed MCTS method, I(·) denotes an indicator function, p is a candidate passing order calculated by the CAVs, and V(x) is the final passing order, i.e., the candidate that receives the most votes. Each CAV then executes this uniform passing order. However, if two or more passing orders receive the same number of votes, we compare the objective values (3), and the passing order with the lower value is selected as the current optimal passing order.
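The voting step, including the tie-break on the objective value, can be sketched as below. In symbols this corresponds to choosing the candidate p that maximizes the sum over j of I(h_j(x) = p); the function name `majority_vote` and the cost table are hypothetical.

```python
from collections import Counter


def majority_vote(votes, objective):
    """Pick the passing order with the most votes.

    votes:     passing order proposed by each CAV (hashable, e.g. a string)
    objective: callable returning the objective value of a passing order
    """
    counts = Counter(votes)
    top = max(counts.values())
    tied = [order for order, n in counts.items() if n == top]
    # Tie-break: the passing order with the lower objective value wins.
    return min(tied, key=objective)


costs = {"ABCD": 3.2, "ABDC": 2.9, "BACD": 4.0}
clear_winner = majority_vote(["ABCD", "ABDC", "ABCD", "BACD"], costs.get)
tie_winner = majority_vote(["ABCD", "ABDC"], costs.get)
```

Because every CAV receives the same set of votes and applies the same deterministic rule, all agents arrive at the same final passing order without a central controller.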

Algorithm 2
Input: Operational data of all CAVs
Output: A possible passing order
(1) Choose the uncovered leading CAV of each lane as the candidate CAVs and calculate their arrival times to all conflict points on their desired trajectories.
(2) Add the CAV whose arrival times to all conflict points (compared to other candidate CAVs) are least into the passing order. If no such CAV exists, randomly select one.
(3) Repeat steps 1 and 2 until a complete passing order is generated.
(4) The objective value (3), ĵ_k, of the passing order can be derived by Algorithm 1.

Trajectory Control.
After determining the passing order, the required acceleration adjustments and the desired arrival times to all conflict points are also determined. We must optimize the acceleration control of the CAVs to enable them to reach the conflict points at the desired times for passing through the unsignalized intersection safely and efficiently. For the longitudinal dynamics of CAVs in the same lane, we use a microscopic car-following model known as the intelligent driver model (IDM) [48]. The car-following model considers both the tendency to accelerate in free flow and the tendency to decelerate to avoid colliding with the preceding vehicle. In the IDM, the acceleration u_i of vehicle i is calculated from the following quantities: v_i is the speed of vehicle i, v_0^(i) is the desired speed, s_i is the actual gap (distance to the preceding vehicle), Δv_i is the speed difference from the preceding vehicle, u_max is the maximum acceleration coefficient, δ is the acceleration exponent, and s* is the function for calculating the desired minimum gap. In s*, s_0^(i) and s_1^(i) are two distinct jam distances for vehicle i, T^(i) is the safe time headway, and b_i is the desired deceleration coefficient.
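For reference, the IDM in its standard published form is u = u_max·[1 − (v/v_0)^δ − (s*/s)^2] with s* = s_0 + s_1·sqrt(v/v_0) + T·v + v·Δv / (2·sqrt(u_max·b)). The sketch below implements that standard form; the default parameter values are hypothetical and are not the values of Table 1, and Δv is taken as the vehicle's own speed minus the speed of the preceding vehicle.

```python
import math


def idm_acceleration(v, gap, dv, v0=15.0, u_max=2.0, b=3.0,
                     delta=4.0, s0=2.0, s1=0.0, headway=1.0):
    """Standard IDM acceleration in m/s^2.

    v:   own speed (m/s)
    gap: distance to the preceding vehicle (m), must be positive
    dv:  own speed minus the preceding vehicle's speed (m/s)
    ASSUMPTION: default parameter values are placeholders.
    """
    # Desired minimum gap s*.
    s_star = (s0 + s1 * math.sqrt(v / v0) + headway * v
              + v * dv / (2.0 * math.sqrt(u_max * b)))
    # Free-flow term minus interaction term.
    return u_max * (1.0 - (v / v0) ** delta - (s_star / gap) ** 2)


free_start = idm_acceleration(v=0.0, gap=1e9, dv=0.0)     # standing start, open road
cruising = idm_acceleration(v=15.0, gap=1e9, dv=0.0)      # at desired speed
closing_in = idm_acceleration(v=10.0, gap=10.0, dv=5.0)   # approaching a slower leader
```

The three calls show the two tendencies named in the text: acceleration close to u_max on an open road, roughly zero acceleration at the desired speed, and strong braking when closing in on the preceding vehicle.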
When CAVs pass through the OPA into the EXA, the CAV required to give way performs the acceleration trajectory adjustments based on the solution of the optimization problem proposed in [49]; it performs the control in real time and keeps intervehicle safety gaps to arrive at the IPA on time.

Simulation Process
To verify the effectiveness of the proposed distributed MCTS-based cooperative driving strategy (D-MCTS), we consider the typical four-way, six-lane unsignalized intersection shown in Figure 1. We establish a SUMO (Simulation of Urban MObility) simulation environment and conduct simulation tests to compare our strategy with some existing classical intersection management strategies. The CR of the intersection is set to 200 m, which is within the effective communication range of dedicated short-range communication (DSRC) technology [50]. The main parameters used in the simulation and the controller are given in Table 1 [51].
In the simulation, the traffic flow for CAVs is organized as follows: For the leftmost lane in each direction, 50% of the CAVs are programmed to turn left, while the remaining 50% proceed straight ahead. Similarly, half of the CAVs are designated to turn right in the rightmost lane, with the other half continuing straight. CAVs occupying the middle lane are exclusively allowed to travel straight. The arrival of approaching CAVs at the intersection is modeled as a Poisson process. These vehicles are assumed to enter each lane of the intersection entrances evenly, each at an initial speed of 10 m/s. To evaluate the efficacy of the proposed strategy under varying traffic conditions, the overall vehicle arrival rate is varied from 0 to 2 veh/s, equivalent to 0 to 600 veh/(lane·h).
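The unit conversion and the Poisson arrival model can be sketched as follows, assuming 12 entrance lanes (four approaches with three lanes each, which reproduces the stated equivalence of 2 veh/s and 600 veh/(lane·h)). The function names are hypothetical, and the sketch is independent of how the flows are actually configured in SUMO.

```python
import random


def per_lane_hourly(total_rate_veh_s, n_lanes=12):
    """Convert an overall arrival rate (veh/s) to veh/(lane*h)."""
    return total_rate_veh_s * 3600.0 / n_lanes


def poisson_arrivals(rate_veh_s, horizon_s, rng):
    """Arrival times (s) of a Poisson process via exponential inter-arrival gaps."""
    times, t = [], 0.0
    if rate_veh_s <= 0:
        return times
    while True:
        t += rng.expovariate(rate_veh_s)
        if t > horizon_s:
            return times
        times.append(t)


rate_per_lane = per_lane_hourly(2.0)
rng = random.Random(0)  # fixed seed for a reproducible example
arrivals = poisson_arrivals(2.0, horizon_s=60.0, rng=rng)
lanes = [rng.randrange(12) for _ in arrivals]  # spread vehicles evenly over lanes
```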
When CAVs enter the control area, they share their operational information and use the conflict-free D-MCTS algorithm to coordinate their movements to pass through the intersection safely and in an orderly manner. The control algorithms are all executed in Python 3.8 and interact with SUMO through the TraCI interface. In the present study, we reschedule the passing order of all CAVs within the control area at 2-second intervals.
To determine the optimal parameter settings for the D-MCTS cooperative driving strategy, this paper compares the total traffic delay D of the given n CAVs with that of the FCFS strategy. We define η as the decline rate of total traffic delay relative to the FCFS strategy, where D_FCFS and D_D-MCTS are the total traffic delays of the FCFS-based and D-MCTS-based strategies, respectively. First, to better understand the performance of the D-MCTS-based cooperative driving strategy under various traffic demands, we vary the vehicle arrival rate at the unsignalized intersection shown in Figure 1 to generate a variety of simulation scenarios and fix the computation time at 0.05 s. The weighting coefficients α_1 and α_2 of equation (3) are selected for sensitivity analysis under a moderate vehicle arrival rate of 400 veh/(lane·h). The experiments are conducted iteratively until the setting that maximizes the decline rate η is identified, specifically α_1 = 0.7 and α_2 = 0.3. Then, we vary ω and C from 0 to 1. Figure 7 shows the decline rates η of the D-MCTS-based strategy with 20 CAVs. This scenario is further investigated with different numbers of CAVs, and the findings are all consistent. The results show that even with poor parameter settings, the D-MCTS strategy significantly improves results. The parameters ω and C are not particularly critical because we employ the heuristic rules in the simulation step to lessen the impact of random sampling; however, they can still affect the balance of exploitation and exploration. In certain cases, a larger value of C leads to worse results because the agent wastes too much processing time examining unpromising nodes. However, some exploration is necessary because the decline rates with C = 0.25 are better than those with C = 0. Therefore, we set ω = 0.8 and C = 0.25 in the rest of the experiments to maintain a good trade-off between exploitation and exploration.
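The decline rate can be computed as in the sketch below, assuming the usual relative-reduction form η = (D_FCFS − D_D-MCTS) / D_FCFS, which is inferred from the definitions of the two delays. The function name `decline_rate` and the sample delays are hypothetical.

```python
def decline_rate(d_fcfs, d_dmcts):
    """Relative reduction of total delay versus FCFS, as a fraction.

    ASSUMPTION: eta = (D_FCFS - D_DMCTS) / D_FCFS.
    """
    if d_fcfs <= 0:
        raise ValueError("FCFS delay must be positive")
    return (d_fcfs - d_dmcts) / d_fcfs


eta = decline_rate(d_fcfs=100.0, d_dmcts=60.0)  # delay cut from 100 s to 60 s
```

A value of 0 means no improvement over FCFS, and values closer to 1 mean that most of the FCFS delay has been removed.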
To further determine the maximum computation time, we consider the relationship between the decline rate of total traffic delay and the number of searched nodes. For this purpose, we change the arrival rates of the CAVs to generate a variety of driving scenarios with different numbers of CAVs and record the related decline rate and the number of searched nodes. Figure 8 shows that the decline rate increases dramatically when the number of searched nodes increases from 1 to 400, after which it saturates. Therefore, the proposed D-MCTS strategy can give a sufficiently good passing order by searching 400 possible nodes for the considered scenarios. Generally, the decline rate increases gradually as the number of nodes searched by agents increases, and the more CAVs in the control area, the higher the decline rate for the same number of searched nodes. Note that the decline rates for scenarios with a small number of CAVs (30 CAVs) are low because the FCFS rule performs effectively in these straightforward scenarios. Conversely, in situations with a very large number of CAVs (150 CAVs), there is not enough road space to adjust the vehicle passing order, and the decline rate is also relatively small. For most intersection scenarios, 400 nodes can be searched within 0.05 s on our experimental device with an Intel i7 CPU and 16 GB RAM. For the following experiments, we set the maximum search time to 0.1 s to avoid errors caused by measurement and communication delays; this is still small enough for practical use.
To further delineate the difference between the FCFS strategy, the classical centralized MCTS-based strategy (C-MCTS), and the newly proposed distributed MCTS-based strategy (D-MCTS), we study the established unsignalized intersection scenario with 20 vehicles. We calculate the objective values (3) of each strategy's optimal solution (passing order); see Table 2. The results indicate that the solution derived via the D-MCTS strategy closely aligns with the global-optimal solution obtained through the enumeration-based strategy and outperforms the solution from the C-MCTS strategy. Notably, the computational time required for the two MCTS-based strategies is substantially lower than that of enumeration. While the FCFS strategy exhibits the shortest computation time, its solution significantly diverges from the optimal. Remarkably, the solution achieved by the D-MCTS strategy is ranked 190th among nearly 10 billion possible solutions, in stark contrast to the FCFS strategy's solution, which is ranked 3948842573rd.

Results and Analysis
In this section, we evaluate the performance of our newly proposed D-MCTS strategy compared to existing classical intersection management strategies under various vehicle arrival rates. These include the C-MCTS strategy, the FCFS strategy, the longest-queue-first (LQF) strategy, the actuated intersection control (AIC) strategy, and the traditional signal control strategy. The inflow traffic and time horizons of all strategies are identical to ensure a fair comparison. Figure 9 illustrates the results, showcasing the average delay comparison across different arrival rates. To mitigate the effect of randomness in the outcomes and robustly compare the effectiveness of the strategies, simulations were conducted 50 times for each arrival rate scenario. In addition, Figure 9 includes the standard deviation of the average delay from the 50 simulation runs.
As depicted in Figure 9, there is a noticeable variation in intersection delays among the control strategies under the identical arrival-rate scenario. It is evident that as the number of CAVs at the intersection escalates, the D-MCTS strategy consistently exhibits the lowest delay among the six strategies, resulting in higher travel speed and throughput. Specifically, under a high traffic demand scenario with an arrival rate of 2 veh/s, the D-MCTS strategy (average delay: 17.1 s) outperforms the C-MCTS strategy by 1.7 s, the LQF strategy by 4.6 s, the AIC strategy by 5.6 s, the signal control strategy by 13.2 s, and the FCFS strategy by 28.2 s. This translates to improvements in traffic delay of 9%, 21.2%, 24.7%, 43.6%, and 62.3%, respectively. Moreover, when the arrival rate exceeds 1 veh/s, the FCFS strategy is most affected by changing traffic conditions due to its reliance on the arrival time of CAVs for priority assignment. Furthermore, the standard deviations indicate that the FCFS strategy is less effective than the two MCTS-based strategies in managing high arrival rates. Both MCTS strategies efficiently handle increased traffic density by balancing exploration and exploitation.
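The reported percentages follow from the stated delay gaps if each improvement is taken relative to the delay of the strategy being compared against, i.e., gap / (17.1 + gap). The short check below reproduces them; the variable names are illustrative.

```python
d_mcts = 17.1  # average delay of D-MCTS at 2 veh/s, in seconds
gaps = {"C-MCTS": 1.7, "LQF": 4.6, "AIC": 5.6, "signal": 13.2, "FCFS": 28.2}


def improvement_percent(gap, baseline=d_mcts):
    """Delay reduction relative to the other strategy's delay (baseline + gap)."""
    return 100.0 * gap / (baseline + gap)


improvements = {name: round(improvement_percent(g), 1) for name, g in gaps.items()}
```

The implied average delays of the other strategies are therefore 18.8 s (C-MCTS), 21.7 s (LQF), 22.7 s (AIC), 30.3 s (signal control), and 45.3 s (FCFS).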
From a system optimization standpoint, the LQF strategy is often employed in intersection management, particularly within adaptive signal control systems. This strategy prioritizes longer queues at specific time points, which appears logical. However, this approach might not be the most effective in long-term scenarios. It also becomes apparent that the traditional signal control strategy does not excel in the comanagement of CAVs at intersections. A primary factor contributing to the heightened delays under the signal control strategy is its inefficient utilization of intersection space and time resources, a phenomenon observable even at relatively low traffic volumes. In addition, Figure 9 highlights a notable feature of the signal control strategy: its relatively small standard deviation. This indicates that signal control performance is largely unaffected by the randomness in traffic demand. Consequently, in certain situations, the signal control strategy may surpass the FCFS strategy in effectiveness.
Figure 10(a) illustrates the correlation between the average speed and arrival rate across the simulation area for the six control strategies. Notably, when the arrival rate is below 0.5 veh/s, the FCFS, LQF, C-MCTS, and D-MCTS strategies can maintain the average speed slightly above the initial speed. However, as the arrival rate increases, a decline in average speed is observed. In this context, the D-MCTS strategy demonstrates a significant improvement compared to the other three strategies, with only a 35% reduction in average speed at an arrival rate of 2 veh/s. Furthermore, the AIC strategy exhibits superior performance over the traditional signal control strategy. This is attributed to the AIC strategy's ability to dynamically adjust traffic signal phases based on the density of vehicles at each entrance.
Figure 10(b) displays the average waiting times experienced under different control strategies at various arrival rates. Notably, CAVs managed by the D-MCTS strategy experience minimal waiting time when the arrival rate is below 1 veh/s. Furthermore, even as the arrival rate increases, the waiting time for CAVs under the D-MCTS strategy remains relatively low. In contrast, the FCFS strategy exhibits a substantial increase in waiting time as the arrival rate surpasses 0.5 veh/s. For the LQF strategy, it is observed that under high traffic demand, its passing pattern tends to resemble that of signal-controlled traffic, resulting in an average waiting time similar to that observed under the AIC strategy. Both the AIC and traditional signal control strategies demonstrate superior performance compared to the FCFS strategy when the arrival rate exceeds 1.25 veh/s.
Figure 10(c) demonstrates that, in scenarios with increased vehicular presence at the intersection, the D-MCTS strategy requires fewer acceleration adjustments than the FCFS and C-MCTS strategies. This characteristic significantly enhances the operational stability of CAVs under the D-MCTS strategy. Moreover, the reduced need for acceleration adjustments improves passenger comfort within the intersection. In conditions where the arrival rate reaches 2 veh/s, resulting in higher intersection congestion, the LQF strategy exhibits performance comparable to the D-MCTS strategy and markedly superior to the FCFS and C-MCTS strategies. In addition, it is observed that both signal-control strategies consistently require fewer acceleration adjustments, indicating that CAVs undergo less frequent acceleration and deceleration under signal-controlled modes compared to those managed by unsignalized intersection control strategies.
In Figure 10(d), we examine the environmental implications of CAVs navigating the intersection under various control strategies, focusing on the average CO2 emissions. These emissions are quantified using the default emission function in the SUMO tool. It is observed that as traffic density intensifies, the LQF strategy increasingly demonstrates its effectiveness in reducing carbon emissions, closely followed by the D-MCTS strategy. Conversely, the CO2 emissions escalate rapidly under the FCFS strategy and remain elevated under both signal control strategies. This trend suggests that frequent starting and stopping, regular acceleration and deceleration, and prolonged waiting times substantially increase carbon dioxide emissions, thereby exacerbating environmental pollution.
In addition, we analyze the impact of different control strategies on traffic throughput across varying traffic demand scenarios. A 100-minute traffic simulation was conducted for each arrival rate, with the comparative results detailed in Table 3. It is evident from these results that the proposed D-MCTS strategy significantly enhances traffic throughput across all tested scenarios. As previously discussed, this improvement can be attributed to our distributed MCTS-based cooperative driving framework, which can explore and evaluate a broader range of CAV passing orders within the given planning time constraints compared to the centralized one.
As the number of CAVs at the unsignalized intersection increases, the performance of the FCFS strategy in terms of cooperation diminishes. However, the proposed D-MCTS strategy can always find a nearly global-optimal passing order regardless of the number of CAVs. Consequently, although the total delay inevitably escalates with an increasing number of CAVs, the D-MCTS strategy demonstrates a more pronounced capacity for improving the above key performance indicators.
To assess the effectiveness of the D-MCTS strategy in terms of driving stability, simulations were conducted under varying arrival rates. These simulations focused on monitoring the speed fluctuations of CAVs compared with the C-MCTS strategy at different cross sections of the intersection area. Induction loop detectors were strategically placed at distances of 50 m, 100 m, 150 m, and 200 m from the intersection center on each entrance lane of the east-west main road within the SUMO simulator. These detectors recorded the average speed and standard deviation of all CAVs passing through these cross sections. As presented in Figure 11, the findings indicate a superior performance of the D-MCTS strategy over the C-MCTS strategy across various traffic demand scenarios. Notably, the D-MCTS strategy ensures smoother operation speeds at different cross sections, leading to reduced speed volatility, enhanced operational stability, and quicker passage through the unsignalized intersection. This results in an overall improvement in passenger comfort.
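The per-cross-section statistics described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name and the input layout (a mapping from detector distance in metres to recorded speeds in m/s) are hypothetical, and the use of the population standard deviation is an assumption, since the text does not state which estimator was used.

```python
from statistics import mean, pstdev
from typing import Dict, List, Tuple


def cross_section_stats(
    speeds_by_distance: Dict[int, List[float]],
) -> Dict[int, Tuple[float, float]]:
    """Return (average speed, standard deviation) per detector distance.

    Keys are detector distances from the intersection center in metres;
    values are the speeds (m/s) of the vehicles that crossed that detector.
    Detectors with no recorded vehicles are skipped.
    """
    return {
        distance: (mean(speeds), pstdev(speeds))  # population SD (assumed)
        for distance, speeds in speeds_by_distance.items()
        if speeds
    }
```

A smaller standard deviation at a given cross section corresponds to the lower speed volatility discussed above.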

Conclusions
This study proposes a distributed MCTS-based cooperative driving strategy for CAVs at unsignalized intersections, called D-MCTS. It integrates root-parallelization MCTS with the majority voting rule to implement distributed cooperation. The aim is to explore and evaluate as many feasible vehicle passing orders as possible within the limited planning time, so as to find a nearly global-optimal passing order that enables CAVs to minimize traffic delays while making the slightest acceleration adjustments. In addition, the research develops a task-area partition framework to decompose the mission of cooperative driving into three main tasks: vehicle information sharing, passing order optimization, and trajectory control.
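As a rough illustration of how a majority voting rule can combine the outputs of independent root-parallel searches, consider the following Python sketch. It assumes that each search returns one complete passing order together with an estimated total delay; the function name, the proposal format, and the tie-breaking by lower estimated delay are hypothetical choices for this sketch and may differ from the exact rule used in D-MCTS.

```python
from collections import Counter
from typing import Dict, Sequence, Tuple

Order = Tuple[str, ...]  # a passing order, e.g. ("A", "B", "C")


def majority_vote(proposals: Sequence[Tuple[Order, float]]) -> Order:
    """Pick the passing order proposed by the most independent searches.

    Each proposal is (passing order, estimated total delay). Ties in the
    vote count are broken by the lower estimated delay (an assumption).
    """
    if not proposals:
        raise ValueError("no proposals to vote on")
    votes = Counter(order for order, _ in proposals)
    # Keep the best (lowest) delay estimate reported for each order.
    best_delay: Dict[Order, float] = {}
    for order, delay in proposals:
        best_delay[order] = min(delay, best_delay.get(order, float("inf")))
    # Most votes first, then lowest estimated delay.
    return min(votes, key=lambda order: (-votes[order], best_delay[order]))
```

For example, if two of three searches propose ("A", "B", "C") and one proposes ("B", "A", "C"), the first order is selected even if the minority proposal reports a slightly lower delay estimate.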
In a comparative analysis conducted within the SUMO simulation environment, the efficacy of the proposed D-MCTS strategy was validated against five other driving strategies. During heavy traffic flow, the D-MCTS strategy reduced the average delay for CAVs by 9.0% to 62.3%. Furthermore, there was only about a 35% decrease in average speed, and the average waiting time remained minimal. Notably, the average number of acceleration adjustments was lower than that of the FCFS, LQF, and C-MCTS strategies, indicating a significant improvement in CAV operational stability. Regarding carbon emissions, the D-MCTS strategy was outperformed only by the LQF strategy. In addition, the D-MCTS strategy effectively enhanced traffic throughput across all scenarios. The simulation results demonstrate that the new D-MCTS strategy noticeably improved efficiency, safety, comfort, and emissions.
However, the present study only considered a single scenario of a two-way, three-lane unsignalized intersection in a pure CAV environment, for which the traffic environment was limited. Future work should consider more complex traffic scenarios, such as multiple unsignalized intersection networks at different CAV penetration rates, which would lead to more findings on how the proposed distributed cooperative driving strategy affects mixed traffic streams.

Figure 3: An intersection scenario with four vehicles.

Figure 4: The schedule tree derived from the intersection scenario shown in Figure 3.

Figure 5: Methodological framework of the two MCTS-based cooperative driving strategies. (a) Centralized MCTS-based cooperative driving strategy. (b) Distributed MCTS-based cooperative driving strategy.

Figure 6: One iteration of the MCTS-based cooperative driving strategy.

Figure 8: The result of the decline rate η with respect to the number of searched nodes for the intersection shown in Figure 1.

Figure 9: Average delay of different driving strategies under different vehicle arrival rates.

Figure 10: Comparison results of significant traffic indicators of different driving strategies under different vehicle arrival rates. (a) Average speed. (b) Average waiting time. (c) Average number of adjustments. (d) Average CO2 emission.
experiment, we choose the ideal parameter combination shown in Figure

Table 2: Solution values of different cooperative driving strategies.