A Discretionary Lane-Changing Decision-Making Mechanism Incorporating Drivers’ Heterogeneity: A Signalling Game-Based Approach

This paper proposes a discretionary lane-changing decision-making model based on a signalling game in the context of mixed traffic flow of autonomous and regular vehicles. The effects of the heterogeneity among different drivers and the endogeneity of the same driver in lane-changing behaviours, e.g., aggressive or conservative, are incorporated through the specification of different payoff functions under different scenarios. The model is calibrated and validated using the NGSIM dataset with a bilevel calibration framework that combines two methods, a genetic algorithm and perfect Bayesian equilibrium. Comparative simulation results show that the signalling game-based model outperforms the traditional space-based lane-changing model in the sense that it yields a relatively stable reciprocal of time to collision and a higher lane-changing success rate under different traffic densities. Finally, a sensitivity analysis is performed to test the robustness of the proposed model, which indicates that the signalling game-based model is stable under varying ratios of driver types.


Introduction
Lane-changing behaviour has a vital effect on traffic flow. Changing lanes to obtain a better driving condition (higher speed, larger space, etc.) or to get into the correct lane is a complicated decision-making process that is affected by many observable and unobservable factors of the surrounding environment and the drivers. A bad lane-changing decision can lead to serious traffic problems [1], such as traffic breakdowns, bottleneck discharge rate reduction, stop-and-go oscillations, and safety hazards [2], while cooperative lane-changing may potentially improve the traffic situation.
Compared to traditional cars, connected and autonomous vehicles equipped with vehicle-to-vehicle and vehicle-to-infrastructure communication technology are promising in improving traffic efficiency and safety [3]. However, one can foresee that, for a certain period of time, regular vehicles and autonomous vehicles will coexist on the road [4]. This means a certain level of cooperation is needed between human drivers and autonomous vehicles. Autonomous vehicles need to respond correctly and accurately to the environment and to the specific drivers around them.
Existing studies on lane-changing of autonomous vehicles are mostly focused on the development of lane-changing trajectories, the design of lane-changing controllers, and the prediction of lane-changing intention, while the lane-changing decision-making of autonomous vehicles has not been addressed sufficiently, especially with respect to the interactions and heterogeneity between different drivers, which is an important issue in the lane-changing process. To fill this research gap, this paper proposes a lane-changing decision-making mechanism for autonomous vehicles in the context of mixed traffic flow with conventional vehicles. Drivers' heterogeneous lane-changing behaviour is integrated based on a signalling game approach, in which different payoff functions are established according to the type of driver, aggressive or conservative. The model is calibrated and validated using NGSIM data. Furthermore, a sensitivity analysis is performed through microsimulations to examine the model robustness in comparison with the traditional space-based model. The remainder of this paper is organized as follows: Section 2 reviews existing lane-changing approaches. Section 3 introduces the methodology that is proposed in this paper. Data processing and model calibration and validation are presented in Sections 4 and 5, respectively. Sections 6 and 7 present the details of the simulation process and the main findings. The paper is summarized in Section 8 with a discussion of possible future research directions.

Literature Review
In the literature of lane-changing research, three types of models dominate the modelling of lane-changing decisions: rule-based models, utility-based models, and game theory-based models [2]. The pioneering rule-based model proposed by Gipps [5] contains a number of factors influencing lane-changing behaviour, such as collision avoidance, position of barriers, type of vehicles, special lanes, and the possibility of a speed advantage. When there are multiple lanes to change to, a set of priority rules is used to decide the target lane. Although the factors involved are reasonable and rather exhaustive, the interactions and communication between vehicles, e.g., considering other vehicles as moving obstacles, were ignored. Another limitation is that this model fails to take heterogeneous drivers into account and assumes that the characteristics of the same driver are constant over time (whereas, e.g., a conservative driver can sometimes act aggressively for certain reasons). Based on Gipps's work, Yang and Koutsopoulos [6] made some improvements by splitting the lane-changing process into four steps: considering a lane change, determining the target lane, searching for an acceptable gap, and executing the change. However, no calibration and validation of this model had been done. Hidas [7] classified lane-changing behaviour into three types, free, cooperative, and forced lane-changing, and proposed a new model which is capable of describing the interactions between vehicles and overcomes one of the flaws in Gipps's model.
Differing from rule-based models, utility-based models assume that the different levels of lane-changing behaviour depend on driver characteristics and status which vary across individual drivers [8]. Ahmed [9] expanded the model by incorporating mandatory lane-changing decisions. Furthermore, to address the possible dependency between discretionary lane-changing and mandatory lane-changing decisions, Toledo et al. [10] proposed an integrated model and introduced a random term to capture unobserved variables, which may explain the heterogeneity between different drivers and change in styles of the same driver over time.
The third approach in modelling lane-changing behaviour is game theory, which provides a way to model the interactions between drivers. The first work focusing on the merging behaviour of regular vehicles was conducted by Kita [11]. After that, Meng et al. [12] used vehicles' velocity and acceleration to construct payoff functions without considering drivers' heterogeneity. Yu et al. [13] presented a Stackelberg game model to describe the lane-changing decision-making process of autonomous vehicles. The payoff function is defined as a linear combination of vehicle spacing and a safety factor. In addition, the coefficients were considered as a function of the driver's aggressiveness. The model has great advantages in comparison with the traditional rule-based model as it involves aggressiveness revealed by the vehicle's lateral movement. Nevertheless, other psychological factors such as regret and conformity were not considered. Ali et al. [2] proposed a game model under mandatory lane-changing scenarios in which the strategy set was screened by empirical data. Based on the NGSIM US101 data [14] and driving simulator data, the model was validated through predefined microscopic indicators, such as the confusion matrix and time and location errors.
In addition to the models presented above, researchers have also applied artificial intelligence to model the lane-changing decision process, such as BP neural networks [15] and fuzzy inference systems [16]. There is no doubt that these models can generally achieve a better prediction, but their performance largely depends on the quality and quantity of mixed traffic flow data, which are still difficult to obtain. Moreover, artificial intelligence models are often not explainable, in the sense that they do not help much in understanding the mechanism of lane-changing behaviour.
Relative to the situation of human drivers only, future traffic will be mixed with autonomous vehicles, implying a variety of interactions between human drivers and autonomous vehicles. Several issues that differ from the traditional traffic environment may arise. First, autonomous vehicles receive massive information about their surroundings (i.e., real-time traffic information and vehicle-to-vehicle communication) through various sensors or cameras, which indicates that an optimal lane-changing decision can be made more easily than by human drivers if the information can be correctly and sufficiently recognized and used. This means that when modelling the lane-changing behaviour of autonomous vehicles, additional assumptions may be needed. Moreover, the lane-changing decision of autonomous vehicles plays a vital part in the coordination of the traffic system. Human drivers use their experience to decide whether to change lanes, while autonomous vehicles need to learn to interact with human drivers instead of following fixed rules or being constantly courteous. Autonomous vehicles need a higher degree of accuracy and effectiveness in their responses than human drivers to cope with different styles of driving and lane-changing behaviour.
Recent studies have attempted to differentiate drivers from the perspective of discretionary lane-changing behaviour and specified the payoff functions based on safety and efficiency. Depending on a group of interviewers' subjective opinion or using clustering algorithms, different driving styles such as aggressive or not aggressive can be identified [17][18][19]. The effects of the heterogeneity among different drivers in their lane-changing behaviour on other vehicles are critical, especially in a connected environment and with autonomous vehicles. However, this heterogeneity may be insufficient to reflect the variation in the decisions of the same driver. Research has shown that drivers typically adjust their driving styles in different traffic situations [1] such that the same driver may have different lane-changing styles in different temporal and spatial situations [20,21].
This signifies that, when building a reasonable lane-changing decision-making mechanism for autonomous vehicles in a mixed traffic flow, one should systematically incorporate the interactions and heterogeneity between different drivers. First, discretionary lane-changing is not a one-way decision-making process; it requires interactions with others. In a lane-changing scenario, the decisions of the two participants affect each other. However, such an interaction has been ignored by all existing approaches except game theory. Second, in addition to the heterogeneity between different drivers, the changes in the characteristics of the same driver have not received attention. Decisions should be made by considering the real-time reaction of drivers rather than a fixed driving style classification, since drivers can act contrary to their usual style in certain situations. Therefore, in this paper, we attempt to incorporate the heterogeneous behaviour among different drivers and of the same driver into the lane-changing decision-making process. To the best of our knowledge, this is the first attempt to take into account the effects of both the heterogeneity between different drivers and the endogeneity of the same driver in lane-changing decisions. In order to mimic the dynamic feature of such decisions, a signalling game approach is proposed. In the following sections, we present the methods and our simulation results.

Materials and Methods
Here, we regard a lane-changing decision-making process as a game, considering that game theory is potentially useful for modelling lane-changing behaviour. In the following sections, we first present the concept of game theory and the signalling game that is adopted in this study. Then we present the specification of games in lane-changing behaviour in the setting of mixed traffic flow of autonomous and human-driven cars.

Game Theory and Signalling Games.
Game theory is a powerful and well-developed mathematical tool. It is usually used to model the interaction process between two or more players in economic fields. Across a number of disciplines, game theory has been widely used to study logical decision-making in humans. It has six main concepts: game, player, strategy, payoff, information set, and equilibrium. A game is defined as any circumstance that is affected by players' decisions, and a player is seen as a decision-maker in this game. A strategy is a decision that a player may take in that circumstance. The payoff is used to measure the cost or benefit resulting from a pair of decisions. The information set consists of all available information. Equilibrium is the optimal state of a game, yielding every player's decision and their payoffs.
Solutions under equilibrium in game theory can be classified into four types by the nature of timing and information. In the case of simultaneous move games with complete information, the appropriate solution concept is Nash equilibrium (NE), while for games of sequential timing with complete information, the best solution is subgame perfect equilibrium (SPE). When there is incomplete information, Bayesian Nash equilibrium (BNE) and perfect Bayesian equilibrium (PBE) are the solution concepts for simultaneous move and sequential timing games, respectively. A PBE is a set of strategies and beliefs such that the strategies are sequentially rational given the players' beliefs and players update beliefs via Bayes' rule wherever possible [22].
Among the various games, the signalling game is a simple type of dynamic Bayesian game in game theory [23]. It includes two players, called sender and receiver, and a party called Nature who randomly decides the type of the sender. The receiver does not know the type of the sender for sure but knows the probability of each sender type. For example, assume there are two potential types, strong and weak. Even if Nature decides that the sender is a weak type, the receiver can only infer the type of the sender by observing its actions, starting from prior information, e.g., that the probability of the sender being weak is 0.4 and that of being strong is 0.6.
Since lane-changing behaviour is sequential (one takes actions according to the other's actions) and the information may be incomplete (the type of vehicles on the target lane is unknown to the vehicle who wants to change lanes), we apply signalling games based on PBE in a lane-changing process.

Signalling Game of Lane-Changing.
In the process of lane-changing decision with mixed traffic flow, autonomous (CAR-E) and regular vehicles (CAR-TF) are the two players. When a CAR-E has a demand for changing lanes, it indicates its intention to the other vehicle, CAR-TF. CAR-TF will respond to the intention by sending a signal as soon as the intention is received. Note that sending a signal could be generally regarded as a warning behaviour of drivers, such as a significant acceleration or sounding the horn. However, an aggressive driver may have a higher probability of sending a signal than other types of drivers. Observing the reaction of CAR-TF, CAR-E can then analyse the behaviour of the CAR-TF and calculate the payoff values in order to make an optimal decision.
Because the type of the driver in CAR-TF, e.g., aggressive or conservative, which largely affects the lane-changing decision of CAR-E, is unknown to CAR-E, we assume that CAR-E believes CAR-TF to be an aggressive driver with probability $p$. CAR-TF knows that this is the belief of CAR-E, and CAR-E knows that CAR-TF knows that this is the belief of CAR-E, which means $p$ is common knowledge between both players [24].
Here, we define the set of CAR-TF types as $\Theta = \{\theta_1, \theta_2\}$, where $\theta_1$ and $\theta_2$ represent the aggressive type and conservative type, respectively. If the prior probability of one type is $p$, then the prior probability of the other type is $1 - p$ ($0 \le p \le 1$). Let the signal set of CAR-TF be $A_1 = \{S, NS\}$, where $S$ and $NS$ denote signalling and not signalling, respectively. The action set of CAR-E is defined as $A_2 = \{C, W\}$, where $C$ denotes lane-changing behaviour and $W$ represents the waiting decision. For each signal that CAR-E receives and the possible action that each CAR-TF takes, there is a payoff for CAR-TF ($P_{ij}$) and CAR-E ($Q_{ij}$). All sets of possible strategies of this game are presented in Table 1.
In the game sequence, "Nature" selects the type of CAR-TF, and CAR-E knows the prior probability of type $\theta_i$, $p(\theta_i)$. Note that "Nature" is virtual and does not participate in the game. When the CAR-TF receives the intention of CAR-E, it chooses an action $a_1 \in A_1$ (usually related to the driver's type). Then, CAR-E observes the action of CAR-TF, $a_1$, and infers the posterior probability $p(\theta \mid a_1)$ using Bayes' rule. Then CAR-E takes an action, $a_2 \in A_2$. Considering the multiple possibilities at each stage of the sequence, a game tree can be built to describe the decision mechanism, as shown in Figure 1.
In Figure 1, $Ct$ is the cost of CAR-TF for sending a signal, such as decreased attention and acceleration fluctuation. $F^{A}_{TF}$ and $F^{A}_{E}$ are the losses of CAR-TF and CAR-E, respectively, in the situation that CAR-E changes lanes when the driver of CAR-TF is aggressive. $G^{C}_{TF}$ and $G^{C}_{E}$ are the loss of CAR-TF and the benefit of CAR-E, respectively, in the situation that CAR-E changes lanes when the driver of CAR-TF is conservative. $R^{A}_{TF}$ and $R^{A}_{E}$ are regrets of aggressive and conservative CAR-TFs in the situation that CAR-E changes lanes when the driver of CAR-TF did not send a signal. Figure 2 shows an example of the lane-changing process where CAR-E is assumed to change lanes. E represents CAR-E, and TF represents CAR-TF. TL and L1 are the leading vehicles in the target lane and current lane, respectively, called CAR-TL and CAR-L1.
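The belief update in this sequence can be illustrated with a minimal Python sketch. The function name `posterior_aggressive` and the signalling probabilities passed to it are illustrative assumptions, not quantities taken from the model; the fallback to the prior for an observation of zero probability is one arbitrary admissible choice.

```python
def posterior_aggressive(p, sig_prob_aggr, sig_prob_cons, signal_observed):
    """Posterior probability that CAR-TF is aggressive after one observation.

    p               : prior probability that CAR-TF is aggressive
    sig_prob_aggr   : P(S | aggressive), assumed for illustration
    sig_prob_cons   : P(S | conservative), assumed for illustration
    signal_observed : True if CAR-E observed S, False if it observed NS
    """
    # Likelihood of the observed action under each driver type
    like_aggr = sig_prob_aggr if signal_observed else 1.0 - sig_prob_aggr
    like_cons = sig_prob_cons if signal_observed else 1.0 - sig_prob_cons
    # Total probability of the observed action
    evidence = p * like_aggr + (1.0 - p) * like_cons
    if evidence == 0.0:
        # The observation has zero probability: Bayes' rule is undefined,
        # so any belief is admissible; the prior is returned here by assumption.
        return p
    return p * like_aggr / evidence
```

With fully revealing behaviour (aggressive drivers always signal, conservative drivers never do) the posterior after a signal is 1, whereas with identical behaviour for both types the posterior equals the prior.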
Considering that speed and driving space are the two main factors influencing lane-changing behaviour, we include the space and speed into the payoff function. In addition, an indicator of regret is also taken into account to mimic individuals' decision-making.
Let $t_0$ be the starting time of lane-changing and $\Delta t$ be the time needed to finish this process. The distances between vehicles at $t_0$ are calculated from the vehicles' longitudinal positions and lengths, where $D_1$, $D_1'$, and $D_2$ denote the distance between CAR-E and CAR-L1, CAR-E and CAR-TL, and CAR-E and CAR-TF at $t_0$, respectively; $L$ denotes the length of vehicles; and $y^{0}_{L1}$, $y^{0}_{E}$, $y^{0}_{TL}$, and $y^{0}_{TF}$ denote the longitudinal positions of the centres of CAR-L1, CAR-E, CAR-TL, and CAR-TF at $t_0$, respectively. The distance between CAR-E and CAR-TF at $t_0 + \Delta t$, $D_2'$, is then obtained from $D_2$ and the motion of the two vehicles over $\Delta t$, where $v^{0}_{TF}$ and $v^{0}_{E}$ are the instantaneous speeds of CAR-TF and CAR-E at $t_0$ and $a^{0}_{TF}$ and $a^{0}_{E}$ are the acceleration rates of CAR-TF and CAR-E at $t_0$.
Here, the regrets of aggressive and conservative CAR-TFs are defined as functions of the speed difference between CAR-TF and CAR-E, where $v'_{TF}$ and $v'_{E}$ are the speeds of CAR-TF and CAR-E at $t_0 + \Delta t$, respectively, and $c_1$ and $c_2$ are the parameters to be estimated.
The cost of CAR-TF to send a signal is defined based on the difference in acceleration at different times, which indicates the distraction caused by sending a signal, where $a'_{TF}$ is the acceleration of CAR-TF at $t_0 + \Delta t$ and $\varepsilon$ is a parameter to be estimated. For aggressive CAR-TFs, the loss $F^{A}_{TF}$ depends on the difference in distance between the two vehicles at times $t_0$ and $t_0 + \Delta t$, where $f^{1}_{TF}$ is the parameter to be estimated. Let $v'_{TL}$ and $v'_{E}$ be the speeds of CAR-TL and CAR-E at $t_0 + \Delta t$, respectively. $G^{C}_{TF}$ is quantified by the loss of space and speed, where $g^{1}_{TF}$ and $g^{2}_{TF}$ are the parameters to be estimated. Similarly, the payoff functions of CAR-E against aggressive and conservative CAR-TFs are defined, where $\eta_1$ denotes the weight coefficient of the speed benefit in the payoff of CAR-E, which can be set in the controller manually. The higher $\eta_1$ is, the more attention is paid to speed. A lower $\eta_1$ can be set on some risky road segments to create a more cautious driving style for autonomous vehicles. $g^{1}_{E}$, $g^{2}_{E}$, and $f^{1}_{E}$ are the parameters to be estimated.
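A minimal sketch of these distance calculations is given below. It assumes (this is not stated explicitly in the text) that gaps are centre-to-centre distances minus one vehicle length, that CAR-E is behind CAR-L1 and CAR-TL and ahead of CAR-TF, and that both vehicles keep a constant acceleration over the lane-changing time; the function names are illustrative.

```python
def gaps_at_t0(y_l1, y_e, y_tl, y_tf, length):
    """Net gaps D1, D1', D2 at t0 from centre positions (assumed convention)."""
    d1 = y_l1 - y_e - length        # CAR-E to CAR-L1 (leader, current lane)
    d1_prime = y_tl - y_e - length  # CAR-E to CAR-TL (leader, target lane)
    d2 = y_e - y_tf - length        # CAR-E to CAR-TF (follower, target lane)
    return d1, d1_prime, d2


def gap_after(d2, v_e, v_tf, a_e, a_tf, dt):
    """Gap D2' between CAR-E and CAR-TF at t0 + dt.

    Assumes both vehicles keep the accelerations they had at t0.
    """
    return d2 + (v_e - v_tf) * dt + 0.5 * (a_e - a_tf) * dt ** 2
```

A shrinking value of `gap_after` relative to `d2` indicates that CAR-TF is closing in on CAR-E during the manoeuvre.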

Equilibrium Setting and Model Solution.
According to previous studies on game theory, a signalling game can be solved by the following method [23]. Let $\theta$ ($\theta \in \Theta$) be the type of CAR-TF, which is selected by Nature. $a_1$ ($a_1 \in A_1$) and $a_2$ ($a_2 \in A_2$) denote the actions selected by CAR-TF and CAR-E. The payoffs of CAR-TF and CAR-E are represented as $u_1(a_1, a_2, \theta)$ and $u_2(a_1, a_2, \theta)$. $\sigma_1(\cdot \mid \theta)$ is a probability distribution over actions $a_1$ for each type of CAR-TF, and $\sigma_2(\cdot \mid a_1)$ is a probability distribution over actions $a_2$ for each action $a_1$. In addition, when CAR-E updates the probability of CAR-TF's type by Bayes' rule, the strategies of CAR-TF and CAR-E become $\sigma_1^{*}(\cdot \mid \theta)$ and $\sigma_2^{*}(\cdot \mid a_1)$, respectively. Consequently, a perfect Bayesian equilibrium (PBE) for a signalling game is a strategy profile $\sigma^{*}$ and a posterior belief $\mu(\cdot \mid a_1)$ that satisfy conditions (P1), (P2), and (B); where Bayes' rule cannot be applied, $\mu(\cdot \mid a_1)$ is any probability distribution over $\Theta$.
Note that condition (P1) means that CAR-TF takes into account the influence of $a_1$ on CAR-E and condition (P2) means that CAR-E makes the best response to the action of CAR-TF by knowing the posterior beliefs. Moreover, condition (B) is the application of Bayes' rule.
According to different actions of players, the PBE of this game can be divided into three subcategories, namely, separating equilibrium, pooling equilibrium, and semiseparating equilibrium. As discussed below, these equilibria are used to derive the full set of game solutions.
A separating equilibrium means that different types of CAR-TFs select different actions with probability 1, i.e., actions reveal types definitively. For example, aggressive drivers always select to send a signal, while conservative drivers always choose to be silent (i.e., doing nothing). In this situation, once a CAR-TF takes an action, CAR-E can distinguish its type correctly and take the optimal action. There are two different possible situations: (1) all aggressive drivers send a signal, and all conservative drivers do nothing; (2) all aggressive drivers do nothing, and all conservative drivers send a signal.
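For reference, the three conditions can be written out in the standard textbook form of a PBE for signalling games. The block below is that standard statement in the notation used here; it is assumed, not confirmed, to coincide with the formulation of [23].

```latex
% (P1): every type of CAR-TF best-responds to CAR-E's equilibrium strategy
\forall \theta \in \Theta:\quad
\sigma_1^{*}(\cdot \mid \theta) \in
  \arg\max_{\alpha_1} \; u_1(\alpha_1, \sigma_2^{*}, \theta)

% (P2): CAR-E best-responds to each observed action given its posterior belief
\forall a_1 \in A_1:\quad
\sigma_2^{*}(\cdot \mid a_1) \in
  \arg\max_{\alpha_2} \; \sum_{\theta \in \Theta}
  \mu(\theta \mid a_1)\, u_2(a_1, \alpha_2, \theta)

% (B): beliefs follow Bayes' rule whenever the observed action has positive probability
\mu(\theta \mid a_1) =
  \frac{p(\theta)\, \sigma_1^{*}(a_1 \mid \theta)}
       {\sum_{\theta' \in \Theta} p(\theta')\, \sigma_1^{*}(a_1 \mid \theta')}
\quad \text{if } \sum_{\theta' \in \Theta} p(\theta')\, \sigma_1^{*}(a_1 \mid \theta') > 0
```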
In the first case, we assume all aggressive drivers send a signal, and all conservative ones do nothing. Firstly, let us consider payoffs from CAR-E's perspective. If CAR-E observes a signal, it believes the CAR-TF is an aggressive one based on the assumption. According to the payoff matrix in the game tree, it costs $F^{A}_{E}$ (i.e., the payoff is $-F^{A}_{E}$) to change lanes, while the payoff would be 0 in the case of a waiting decision. Since 0 is larger than $-F^{A}_{E}$, CAR-E tends to wait instead of changing lanes. On the contrary, if CAR-E does not observe any action (i.e., CAR-TF does nothing), it will reckon that the driver is conservative. If CAR-E changes lanes, the payoff will be $G^{C}_{E}$, which is higher than the payoff of waiting (0). Thus, CAR-E is more likely to change lanes. Consequently, CAR-E's strategy is to wait after observing a signal and to change lanes after observing no signal. To reach the equilibrium, the payoffs of both sides need to be considered. Therefore, further analysis is conducted in terms of CAR-TF's interest. The resulting situations are summarized in Table 2 according to the corresponding branches of the game tree.
In order to meet the assumption made before, a set of inequalities can be derived from Table 2, which gives the condition of this equilibrium. The same procedure can be performed in the second case.
A pooling equilibrium denotes that different types of CAR-TFs choose the same action, which makes CAR-Es unable to update the prior probability. In other words, no further information is included in the sender's choice. The only thing CAR-E can do is to predict the type according to $p$ (the original probability).
There are two situations: all drivers go for signalling, or all do nothing. Under the first circumstance, CAR-E's expected payoffs of changing lanes and waiting, $\text{payoff}_C$ and $\text{payoff}_W$, are evaluated with the prior $p$. Here, if $\text{payoff}_C > \text{payoff}_W$, then CAR-E will choose to change lanes. On the contrary, if $\text{payoff}_C < \text{payoff}_W$, then CAR-E will choose to wait for another chance.
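The comparison of the two payoffs can be sketched as follows. The sketch assumes that CAR-E's payoffs are those read from the separating case (a loss F^A_E when changing lanes against an aggressive CAR-TF, a gain G^C_E against a conservative one, and 0 for waiting) and that ties are resolved by waiting; the tie rule and the function names are assumptions made for illustration.

```python
def car_e_expected_payoffs(belief_aggr, f_a_e, g_c_e):
    """Expected payoffs of CAR-E for changing lanes (C) and waiting (W).

    belief_aggr : probability CAR-E assigns to CAR-TF being aggressive
                  (the prior p in a pooling equilibrium)
    f_a_e       : loss of CAR-E when changing against an aggressive CAR-TF
    g_c_e       : benefit of CAR-E when changing against a conservative CAR-TF
    """
    payoff_c = -belief_aggr * f_a_e + (1.0 - belief_aggr) * g_c_e
    payoff_w = 0.0
    return payoff_c, payoff_w


def car_e_best_action(belief_aggr, f_a_e, g_c_e):
    """Return 'C' if changing lanes is strictly better, else 'W' (assumed tie rule)."""
    payoff_c, payoff_w = car_e_expected_payoffs(belief_aggr, f_a_e, g_c_e)
    return "C" if payoff_c > payoff_w else "W"
```

The same functions cover the separating case by passing a belief of 1 (after a signal) or 0 (after no signal).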
In case that the first condition is matched, the payoffs of CAR-TF can be summarized in Table 3 based on its interest.
According to the discussion above, a set of inequality conditions can be derived, under which the first strategy holds: all CAR-TFs will send a signal and CAR-E will change lanes. The remaining branches can be derived in the same way.
Semiseparating equilibrium is the most complicated situation. It means that some types of CAR-TFs randomly choose actions, while other types of CAR-TFs choose specific actions. A conservative driver is more likely to yield when a vehicle in the other lane shows lane-changing intention. However, he/she may also pretend to be an aggressive driver in order to protect his/her own interest. Consequently, in this study, semiseparating equilibrium can be divided into two categories. One is that a part of the aggressive CAR-TFs choose to send a signal and the rest do not, while the conservative ones do not take any action. The other is that a part of the conservative CAR-TFs choose to send a signal and the rest do not, while all aggressive drivers choose to send a signal.
In the first case, when CAR-E receives a signal, it will determine that CAR-TF is an aggressive type and choose to wait in order to get the optimal payoff. The other scenario is more complicated: when there is no signal, CAR-E has to determine the best strategy according to the probability. Let $x$ ($x < 1$) be the probability of the signalling action of aggressive drivers, i.e., $P(S \mid \text{aggressive}) = x$. Let $y$ ($y < 1$) be the probability of the lane-changing action of autonomous vehicles when observing a signal, i.e., $P(C \mid S) = y$. Then, the probability of CAR-E observing nothing and the corresponding posterior belief about the type of CAR-TF follow from $p$ and $x$ by Bayes' rule. Therefore, the payoffs of CAR-E under different strategies can be obtained, and $x$ can be determined by the equation $\text{payoff}_C = \text{payoff}_W$. According to CAR-TF's strategies, its payoffs can be summarized in Table 4.
Accordingly, $y$ can be determined by the aggressive drivers' payoff equation, $\text{payoff}_S = \text{payoff}_{NS}$, which yields $y$ as a function of the signalling cost $Ct$. The assumption is that all the conservative drivers do not take any action; therefore, from their perspective, $\text{payoff}_S < \text{payoff}_{NS}$. In conclusion, this equilibrium holds when aggressive CAR-TFs randomize between sending a signal and not, while the conservative ones do nothing. According to the analysis presented above, the whole model solution can be summarized in Table 5.
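The belief update and the indifference condition for this first case can be sketched in Python. The sketch assumes that CAR-E's payoffs after observing no signal are a loss F^A_E against an aggressive CAR-TF, a gain G^C_E against a conservative one, and 0 for waiting; under that assumption, setting the expected payoff of changing lanes to zero gives the closed form used in `indifference_x`. All function names are illustrative.

```python
def prob_no_signal(p, x):
    """P(NS): aggressive drivers stay silent w.p. 1 - x, conservative ones always."""
    return p * (1.0 - x) + (1.0 - p)


def posterior_aggr_given_no_signal(p, x):
    """P(aggressive | NS) by Bayes' rule (requires P(NS) > 0)."""
    return p * (1.0 - x) / prob_no_signal(p, x)


def indifference_x(p, f_a_e, g_c_e):
    """Signalling probability x that makes CAR-E indifferent after NS.

    Solves -mu * F + (1 - mu) * G = 0 with mu = P(aggressive | NS), which
    reduces to x = 1 - (1 - p) * G / (p * F). Returns None when no valid
    mixing probability in [0, 1) exists under the assumed payoffs.
    """
    if p <= 0.0 or f_a_e <= 0.0:
        return None
    x = 1.0 - (1.0 - p) * g_c_e / (p * f_a_e)
    return x if 0.0 <= x < 1.0 else None
```

When `indifference_x` returns `None`, the assumed payoffs admit no mixing by aggressive drivers, and one of the pure-strategy cases applies instead.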

Data
The NGSIM US-101 dataset is used to calibrate the proposed model. This dataset, which was collected on the Hollywood Freeway, Los Angeles, includes detailed vehicle trajectories at every 0.1 second. Figure 3 shows the study area of the NGSIM US-101 dataset.
In order to implement the proposed model, the data were first cleaned according to the following principles: (1) Since the model focuses on cars' lane-changing behaviour, the records of other types of vehicles (i.e., motorcycles and heavy vehicles) were removed. (2) The Hollywood Freeway is a five-lane highway with an auxiliary lane and two ramps (i.e., Lane 6, Lane 7, and Lane 8). Therefore, records related to these three lanes, which could be seen as mandatory lane-changing behaviour, were removed. (3) Considering that vehicles' lane-changing behaviour is quite different when the traffic flow is overly congested or in a free-flow situation, records with a spacing headway longer than 30 meters or shorter than 6 meters were filtered out [26].
In addition, a driver's decision time window was empirically considered as 2 seconds [2]; thus, we extract the dataset based on a 2-second interval.
Moreover, we split the process of lane-changing into three phases, namely, lane-keeping (LK) decision horizon, lane-changing (LC) decision horizon, and LC duration. LC point is determined when the centre of the vehicle passes the lane boundary. Details can be seen in Figure 4.
In order to simulate the lane-changing decision of autonomous vehicles, the records of lane-changing and waiting were selected randomly from the data. Additionally, there are more lane-keeping cases than lane-changing cases in the dataset. Too many lane-keeping cases will have an enormous effect on model calibration and cause a low predictive rate in lane-changing behaviour. Therefore, it is important to determine the ratio between the records of lane-changing and lane-keeping. The longer the LK decision horizon is, the more LK cases there are. A sensitivity analysis is therefore performed to determine the LK decision horizon. As a result, this study takes 2 seconds as the LK decision horizon, i.e., the two kinds of cases are in equal proportion.
The strategies of CAR-E (i.e., waiting or changing) can be determined by the existence of a lane-changing point. However, it is a challenge to extract the strategies of CAR-TF, i.e., signalling or not, from vehicle trajectories. Therefore, the steady-state regime is adopted to identify CAR-TF's action based on the vehicle's acceleration status, in the sense that a steady-state regime is achieved when the acceleration or deceleration rate is lower than 0.05 g ("g" is the gravitational acceleration) [27]. When the acceleration or deceleration is smaller than 0.05 g, CAR-TF's strategy can be seen as "not signalling"; otherwise it is treated as "signalling."
Due to the limitation of trajectory data, it is impossible to find out the exact time and position at which an LC decision is made. However, it can be derived from the LC duration since the LC point is easy to determine. LC duration is normally defined as the time taken to complete the LC manoeuvre. Many studies have been carried out to determine the LC duration (Table 6). In this study, 4 s is adopted, which is used to find the endpoint of the LC decision. Then, the start point of the LC decision can be determined according to the 2-second decision time window.
Table 2: Payoffs for CAR-TF in the first case of separating equilibrium. Columns: Type; Strategy (CAR-TF, CAR-E); Payoff (CAR-TF).
Table 3: Payoffs for CAR-TF in the first condition of pooling equilibrium.
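The steady-state rule for labelling CAR-TF's action can be sketched as below. The value 9.81 m/s^2 for g, the treatment of an acceleration exactly at the threshold as "signalling", and the function name are assumptions; accelerations recorded in other units would need to be converted to m/s^2 first.

```python
G = 9.81                # gravitational acceleration in m/s^2 (assumed value)
THRESHOLD = 0.05 * G    # steady-state limit of 0.05 g from the text


def classify_signal(accel_mps2):
    """Label CAR-TF's action from its acceleration or deceleration in m/s^2.

    Returns 'NS' (not signalling) inside the steady-state regime and
    'S' (signalling) otherwise.
    """
    return "NS" if abs(accel_mps2) < THRESHOLD else "S"
```

Because the rule uses the absolute value, a sharp deceleration is labelled as a signal in the same way as a sharp acceleration.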
Finally, a total of 741 observations were selected, including 391 lane-changing records and 350 lane-keeping records. Table 7 shows the distribution of extracted strategies.

Model Calibration
e calibration of the proposed model is implemented in the way that we estimate the parameters in order to minimize the difference between observed decision in the NGSIM dataset and the decision predicted by the proposed model. is paper adopts the calibration framework established by Liu et al. [35], which was also applied by Ali et al. [2] and Kang and Rakha [36].
As shown in Figure 5, the whole problem is divided into two levels: upper and lower level. e upper level is used to narrow the difference between predicted decisions and observed decision, while the lower level aims to find a full set of equilibrium for this game. Before programming, decisions need to be extracted from the original observed data. Let k be the iteration index. At the first stage, a set of parameters are initialized. By applying these parameters into payoff functions defined before, strategies can be found for each vehicle using the PBE method. en, we calculate the difference between predicted strategies and the actual decisions in upper level. ese steps are iterated and do not stop until the convergence is reached. e function of the upper problem of this programming is defined as the difference between the observed data and predicted data as below: where n is the number of observations; i is the index of records; and y * i and y i are the predicted action and observed action, respectively. e genetic algorithm is used in the upper level problem to minimize the objective function since it is not limited by the form of functions [37].
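The bilevel loop can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the upper-level objective is taken to be the share of records whose predicted action differs from the observed one, the genetic algorithm is a deliberately small elitist variant with uniform crossover and Gaussian mutation, and `predict(params, record)` is a hypothetical callable standing in for the lower-level PBE solution of one record.

```python
import random


def upper_objective(predicted, observed):
    """Share of records where predicted and observed actions differ (assumed form)."""
    mismatches = sum(p != o for p, o in zip(predicted, observed))
    return mismatches / len(observed)


def calibrate(records, observed, predict, bounds,
              pop_size=20, generations=30, seed=0):
    """Tiny genetic algorithm over parameter vectors within the given bounds."""
    rng = random.Random(seed)

    def loss(params):
        # Lower level (stand-in): predict one action per record for these parameters
        predicted = [predict(params, r) for r in records]
        return upper_objective(predicted, observed)

    population = [[rng.uniform(lo, hi) for lo, hi in bounds]
                  for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=loss)
        parents = population[:pop_size // 2]              # elitist selection
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            child = [rng.choice(pair) for pair in zip(a, b)]  # uniform crossover
            i = rng.randrange(len(child))                     # mutate one gene
            lo, hi = bounds[i]
            step = rng.gauss(0.0, 0.1 * (hi - lo))
            child[i] = min(hi, max(lo, child[i] + step))
            children.append(child)
        population = parents + children
    best = min(population, key=loss)
    return best, loss(best)
```

Keeping the lower level behind a single callable means the equilibrium analysis can be replaced or refined without touching the search loop.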
In contrast, the lower-level problem is solved according to the perfect Bayesian equilibrium. To seek the entire solution set of Nash equilibria, tools like Gambit [38] and the Nashpy [39] package in Python [40] have been developed to find the equilibria of a simple game. However, the signalling game is more complicated because of its dynamic nature with incomplete information. Therefore, instead of adopting these packages directly, we explore three different types of equilibrium (i.e., separating equilibrium, pooling equilibrium, and semiseparating equilibrium) to seek the optimal solution in the lower-level problem.
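The bilevel loop described above can be sketched as follows. This is a minimal illustration, not the authors' implementation. The upper-level objective is assumed here to be the share of records whose predicted action differs from the observed one; the lower-level PBE solver is replaced by a hypothetical `predict(params, x)` callable; and the genetic operators (elitist selection, uniform crossover, Gaussian mutation) as well as all names and default values are assumptions.

```python
import random


def mismatch(params, records, predict):
    """Upper-level objective (assumed form): share of mismatched records."""
    return sum(predict(params, x) != y for x, y in records) / len(records)


def calibrate(records, predict, bounds, pop_size=30, generations=40, seed=0):
    """Search for parameters minimizing `mismatch` with a small genetic algorithm.

    `predict` stands in for the lower-level equilibrium solver: given a
    parameter vector and one record, it returns the predicted action.
    """
    rng = random.Random(seed)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    for _ in range(generations):
        # Elitist selection: the better half survives unchanged.
        pop.sort(key=lambda p: mismatch(p, records, predict))
        elite = pop[: pop_size // 2]
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            child = [rng.choice(pair) for pair in zip(a, b)]  # uniform crossover
            j = rng.randrange(len(child))                     # mutate one gene
            lo, hi = bounds[j]
            step = rng.gauss(0.0, 0.1 * (hi - lo))
            child[j] = min(hi, max(lo, child[j] + step))
            children.append(child)
        pop = elite + children
    return min(pop, key=lambda p: mismatch(p, records, predict))
```

In a real calibration, `predict` would evaluate the payoff functions with the candidate parameters and return the equilibrium strategy for each record; a genetic search suits this setting because the objective is a count of mismatches and has no useful gradient.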
Two indicators were calculated to measure model predictability, namely, recall (also called sensitivity) and the $F_1$ score [41]. They are widely used for classification models in machine learning and can be derived from equations (21) and (22):

$$\text{Recall} = \frac{TP}{TP + FN}, \tag{21}$$

$$F_1 = \frac{2\,TP}{2\,TP + FP + FN}, \tag{22}$$

where TP denotes the true positive cases, i.e., the observation and the predicted action are both positive; FP denotes the false positive cases, i.e., the observation is negative, while the predicted action is positive; FN denotes the false negative cases, i.e., the observation is positive, while the predicted action is negative.
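As a small worked illustration of these two indicators, the sketch below counts TP, FP, and FN from paired lists of observed and predicted actions and applies the standard definitions of recall and the F1 score. The function name and the encoding of the positive class as 1 are assumptions made for the example.

```python
def recall_and_f1(observed, predicted, positive=1):
    """Return (recall, F1) for paired observed and predicted actions."""
    pairs = list(zip(observed, predicted))
    tp = sum(o == positive and p == positive for o, p in pairs)  # true positives
    fp = sum(o != positive and p == positive for o, p in pairs)  # false positives
    fn = sum(o == positive and p != positive for o, p in pairs)  # false negatives
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else 0.0
    return recall, f1
```

With observed actions [1, 1, 1, 0, 0] and predicted actions [1, 1, 0, 1, 0], there are two true positives, one false positive, and one false negative, so both recall and F1 equal 2/3.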
Using 70% of the data to calibrate the model and the remainder for validation, Table 8 summarizes the calibrated parameters. We can see from the table that $c_1$ is lower than $c_2$, which means that conservative drivers are more likely to regret than aggressive drivers. The negative $\varepsilon$ indicates that distraction exists when CAR-TFs send a signal. Also, it can be concluded that conservative drivers value distance more than aggressive drivers since $f^{1}_{TF}$ is lower than $g^{1}_{TF}$.

Table 6: LC durations reported in previous studies.
Study; LC duration (s); Range (s)
[28]; 6.28; -
Toledo and Zohar [29]; 4.6; 1.0 to 13.3
Gurupackiam and Jones [30]; 4.19; -
Cao et al. [31]; 2.54; 1.0 to 6.8
Sajjad et al. [32]; 4.3; -
Yang et al. [33]; 3.75/4.22; -
Ali et al. [34]; 5.11; -

Simulation
Based on the calibrated parameters presented above, simulation experiments are performed. For the purpose of comparison, a traditional space-based LC model, which only considers the gaps between vehicles, is also applied. The simulation framework is presented in Figure 6. It can be divided into five parts: lane initialization, vehicle input and output, manoeuvre control, information updating, and performance evaluation. The length of the lane is set to 10 kilometres. It is assumed that vehicles are randomly distributed over the two lanes at the initial time (lane initialization). There are three pairs of ramps on the road, where vehicles enter or exit randomly every second (vehicle input and output). In addition, the information of vehicles, such as position, speed, and acceleration, is updated every 0.1 second in the simulation (information updating). The car-following part is designed according to equation (23) [42], where $sp$ is calculated by $sp = \left(-\varphi + \sqrt{\varphi^{2} + 4cD_s}\right)/(2c)$; $v_i$ and $v_{i+1}$ are defined as the instantaneous speeds of the following vehicle and the leading vehicle, respectively; $D$ is the distance between the two vehicles; $D_s$ denotes the safety distance; $\varphi = 0.75\ \mathrm{s}$ is the reaction time; and $c = 0.0070104\ \mathrm{s^2/m}$ is the reciprocal of twice the maximum average deceleration of the vehicle. A detailed parameter setting is presented in Table 9, and an example of the simulation scenario is presented in Figure 7 (modified from Anushagj [43]).
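The expression for sp is the positive root of the quadratic c·sp^2 + φ·sp − D_s = 0, which can be checked numerically. The sketch below only evaluates that closed form with the two constants quoted in the text; the function name is hypothetical, and it does not reproduce the car-following rule of equation (23) itself.

```python
import math

PHI = 0.75     # reaction time in s, as quoted in the text
C = 0.0070104  # s^2/m, as quoted in the text


def sp_from_safety_distance(d_s: float) -> float:
    """Evaluate sp = (-phi + sqrt(phi^2 + 4*c*D_s)) / (2*c) for D_s in metres."""
    if d_s < 0:
        raise ValueError("safety distance must be non-negative")
    return (-PHI + math.sqrt(PHI ** 2 + 4.0 * C * d_s)) / (2.0 * C)


# Substituting sp back into the quadratic recovers the safety distance.
sp = sp_from_safety_distance(30.0)
residual = PHI * sp + C * sp ** 2 - 30.0
```

The units are consistent: with φ in seconds, c in s^2/m, and D_s in metres, sp comes out in m/s, and `residual` is zero up to floating-point error.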

Results and Discussion
This section evaluates the model performance using efficiency and safety indicators, represented respectively by the success rate of LC at the first attempt (LC rate) and the reciprocal of time to collision (TTC) [44].
7.1. Results of LC Rate. The LC rate is the share of drivers who complete LC successfully at the first attempt when there is a need to change lanes. Figure 8 presents the LC rates of the proposed model and the space-based model under different traffic flow densities. One can see that, with increasing density, the space headway between vehicles decreases. Therefore, the LC rates of both models show a declining trend. However, the LC rates of the signalling game-based model are higher than those of the space-based model over the whole density range, which means that the signalling game-based model is more efficient. A possible scenario is that, when the space headway between autonomous vehicles and regular vehicles is not big enough, a space-based controller chooses to wait. In contrast, the signalling game-based controller analyses and interacts with the other vehicle and takes the best action, i.e., changing lanes. This helps autonomous vehicles understand the traffic information and adopt a more reasonable strategy instead of being overly cautious, i.e., waiting for bigger gaps.
Note that the rates presented are the results of a vehicle's first attempt to change lanes, at the moment when the lane-changing desire emerges. In reality, it is impossible for every driver to change lanes successfully as soon as he or she wants to, because time is still needed to find a proper position to cut in. Therefore, it is understandable that the rates of the two models are only about 20%. In conclusion, the proposed model is superior to the space-based model in terms of efficiency.

7.2. Results of $TTC^{-1}$. TTC refers to the time available to a target vehicle to adjust its own speed and avoid a collision with the preceding vehicle. Its reciprocal can be calculated by using the equation below:

$$TTC^{-1} = \frac{v_F - v_L}{D},$$

where $v_F$ is the instantaneous speed of the following vehicle; $v_L$ is the instantaneous speed of the leading vehicle; and $D$ is the distance between the two vehicles. When the reciprocal of TTC is negative, the speed of the following vehicle is lower than that of the leading vehicle, so there is no danger of collision. Conversely, when the reciprocal of TTC is positive (i.e., the speed of the following vehicle is greater than that of the leading vehicle), the risk of collision increases as the value increases. Figure 9 shows the results under different densities of traffic flow. The reciprocal of TTC of the two lanes stays relatively steady as density changes. Thus, the performance of the signalling game-based model is almost as good as that of the space-based model in terms of safety.
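The sign convention discussed above can be illustrated with a short sketch. It assumes that the reciprocal of TTC is the speed difference between the following and leading vehicles divided by the gap between them, which is consistent with the variable definitions and the sign discussion in the text; the function name and units (m/s and m) are assumptions.

```python
def inverse_ttc(v_follow: float, v_lead: float, gap: float) -> float:
    """Reciprocal of time to collision in 1/s.

    v_follow and v_lead are instantaneous speeds in m/s; gap is the distance
    between the two vehicles in m. A positive result means the follower is
    closing in on the leader; a negative result means the gap is opening.
    """
    if gap <= 0:
        raise ValueError("gap must be positive")
    return (v_follow - v_lead) / gap
```

For a follower at 20 m/s behind a leader at 15 m/s with a 25 m gap, the value is 0.2 per second, i.e., a TTC of 5 s; swapping the two speeds gives -0.2 per second, which indicates no collision risk.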
To examine the reliability of the above findings, we further compare the $TTC^{-1}$ results of the two lanes under different scenarios of the decision-making controllers. The results are shown in Table 10. It can be seen that the 85th percentile and the 15th percentile of $TTC^{-1}$ of the basic space-based model are slightly higher than those of the signalling game-based model, indicating that the proposed model results in a relatively low value of $TTC^{-1}$. In other words, the proposed model not only has higher efficiency (i.e., a higher lane-changing rate) but also provides a relatively safer traffic condition.
To evaluate the stability of the proposed model, the model was run under different scenarios with varied ratios of aggressive drivers ($p$) and varied traffic flow densities. Figure 10 and Table 11 show the results. It can be found that the $TTC^{-1}$ values stay at a stable level under the different scenarios, and the LC rate of the proposed model is in general higher than that of the space-based model. Therefore, we can conclude that the performance of the signalling game-based model is stable under different values of $p$.

Conclusions
The main emphasis of this paper is to establish an integrated lane-changing model for autonomous vehicles under mixed traffic flow conditions by applying the signalling game approach. Several simulations were performed to evaluate the performance of the proposed model. The results show that the proposed model has higher lane-changing rates than the space-based model while maintaining a similar level of $TTC^{-1}$ under different densities. Therefore, we conclude that the proposed model improves the efficiency of lane-changing without decreasing safety. Also, different ratios of the two types of drivers (conservative or aggressive) are set to test the sensitivity of the model. One can see that the signalling game-based model is stable to the varying ratios, which means that it can be used in different areas with different compositions of drivers.
Although the proposed model outperforms the space-based model, there is still room to improve it further. One possibility is to relax the assumption that only two types of drivers exist. This will unavoidably increase the complexity of the model. In addition, data on mixed traffic flow involving autonomous vehicles exist neither in the NGSIM dataset nor in many other available datasets. Still, since autonomous vehicles are trained to imitate human behaviour, using traditional data to calibrate the model for autonomous vehicles should not induce a great difference. This does not rule out recalibrating the proposed model when data on connected vehicles become available.
Nevertheless, the proposed model shows its potential to be applied in complex situations, such as modelling the interactions and conflicts between connected vehicles with more driving styles and between vehicles with insufficient information about their surroundings. When dealing with these cases, the scope of the game needs to be expanded further, e.g., by adding the behaviour of leading vehicles to the model, quantifying the impacts of different driving behaviours on the lane-changing decision process, and refining the payoff functions related to human psychology. It will also be interesting to consider the impact of weather conditions and of road obstacles on the proposed model. We consider these aspects as our future work.

Data Availability
Readers can access the data underlying the findings of the study from the corresponding author (zhangmr@chd.edu.cn).

Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this paper.