A Study of Driver's Route Choice Behavior Based on Evolutionary Game Theory

This paper proposes a route choice analytic method that embeds cumulative prospect theory in evolutionary game theory to analyze how drivers adjust their route choice behavior under the influence of traffic information. A simulated network with two alternative routes and one variable message sign is built to illustrate the method. We assume that the drivers in the transportation system are boundedly rational and that the traffic information they receive is incomplete. An evolutionary game model is constructed to describe the evolutionary process of the drivers' route choice decisions. We conclude that traffic information plays an important role in route choice behavior. The drivers' route decision-making process develops towards different evolutionarily stable states under different traffic situations. The analysis results also suggest that combining cumulative prospect theory and evolutionary game theory is an effective way to study drivers' route choice behavior. The method provides theoretical support for traffic guidance systems and may improve travel efficiency to a certain extent.


Introduction
In recent years, with the rapid development of information technology, traffic information systems have had a great effect on travel decision-making behavior. Drivers may respond to the information by adjusting their travel mode, destination, departure time, and speed, but most commonly by altering routes [1][2][3][4][5]. The aim of this work is to propose an analytic method that takes traffic information into account in order to explore the mechanism of route choice behavior.
Research on route choice has been conducted from many perspectives. Chen and Jovanis [6] and Polydoropoulou et al. [7] claimed that drivers' attitudes towards communication, technology, and transportation system reliability affected their route decision-making process. Jan et al. [8], Li et al. [9], and Srinivasan and Mahmassani [10] found that the ultimate route choice decision was inherently a multiple-objective behavior. They considered many factors other than the conventional measurement variables and demonstrated that these factors had a major impact on the route decision-making process. Bogers et al. [11] and Ben-Elia et al. [12] constructed simulation experiments to explore the influences of information, learning, and habit on choices between two routes. Chorus et al. [13] presented a discrete choice model to study drivers' responses to variable message signs (VMS). The model indicated that preferences and beliefs had significant impacts on drivers' choice behavior. Ben-Elia and Shiftan [14] conducted a controlled laboratory experiment to model route choice behavior when information was provided in real time. The results showed that information and previous travel experiences had a combined effect on drivers' route choice behavior. Kusakabe et al. [15] conducted an SP survey to investigate the effects of traffic incident information provided on VMS on drivers' route choice behavior. The results showed that drivers estimated the travel time of their alternative routes according to the incident information of the road section provided by VMS. Ben-Elia et al. [16] conducted a route choice experiment to investigate the impact of the accuracy of traffic information on route choice. The results suggested that decreasing accuracy shifted choices mainly from the risky route to the reliable route but also to the useless alternative.
The above studies examined route choice behavior from the perspective of expected utility theory (EUT) [17] or random utility theory (RUT) [18][19][20][21]; little work has been done from the perspective of bounded rationality. Drivers evaluate the alternative routes according to individual experience, cognition, and attitudes, which are not considered in the EUT and RUT models. Hence, many alternative theories have been proposed, for example, prospect theory (PT) [22], cumulative prospect theory (CPT) [23], rank-dependent expected utility theory [24], regret theory [25], and behavioral portfolio theory [26]. Among them, CPT describes boundedly rational behavior under risk and uncertainty particularly well, so it has drawn the most attention.
From another point of view, route choice is a dynamic selection process because traffic information arrives in real time and road conditions are continually updated. Little work has been done from the perspective of a dynamic selection process to discuss how drivers make route choice decisions in light of traffic information. Evolutionary game theory studies the dynamic evolution of a system under bounded rationality. The purpose of this paper is to describe how drivers adjust their route choice behavior under the influence of traffic information from the perspective of bounded rationality and dynamic selection. The remainder of the paper is organized as follows. Section 2 describes the basic theories applied in this paper, including cumulative prospect theory and evolutionary game theory. In Section 3, a network with two alternative routes is constructed to model the drivers' route choice behavior, and the route choice model derived from CPT is established. The analysis of the equilibrium network state follows. Limitations of the proposed modeling method and further research directions are discussed in Section 4.

Theory Preliminaries
2.1. Cumulative Prospect Theory. Cumulative prospect theory (CPT) is a model for describing decisions under risk and uncertainty, introduced by Tversky and Kahneman in 1992. CPT divides the choice process into two phases: framing and valuation. In the framing phase, the decision maker constructs a representation of the acts, contingencies, and outcomes that are relevant to the decision. In the valuation phase, the decision maker assesses the value of each prospect and chooses the one with the largest value [23].
The main idea of CPT is that people tend to think of possible outcomes relative to a certain reference point rather than in terms of the final status, a phenomenon called the framing effect. Moreover, they have different risk attitudes towards gains (i.e., outcomes above the reference point) and losses (i.e., outcomes below the reference point) and generally care more about potential losses than about potential gains. Finally, people usually overweight extreme but unlikely events and underweight "average" events.
CPT incorporates these ideas into a modification of expected utility theory by replacing final wealth with payoffs relative to the reference point, replacing the utility function with a value function that depends on the relative payoffs, and replacing cumulative probabilities with weighted cumulative probabilities. The subjective utility $U$ of a risky outcome described by a probability measure $p$ is

$$U = \int_{-\infty}^{0} v(x)\,\frac{d}{dx}\bigl(w(F(x))\bigr)\,dx + \int_{0}^{+\infty} v(x)\,\frac{d}{dx}\bigl(-w(1 - F(x))\bigr)\,dx, \tag{1}$$

where $v(x)$ is the value function (typical form shown in Figure 1), $w(p)$ is the weighting function (Figure 2), and $F(x) = \int_{-\infty}^{x} dp$ is the cumulative probability.

Value Function.
The value function of Tversky and Kahneman [23] is employed in our work. It is expressed as follows:

$$v(x) = \begin{cases} x^{\alpha}, & x \ge 0, \\ -\lambda(-x)^{\beta}, & x < 0, \end{cases} \tag{2}$$

where $x$ is the outcome relative to a certain reference point, $\alpha$ and $\beta$ are the estimation coefficients which determine the convexity or concavity of the value function, and $\lambda$ is the loss aversion coefficient. Both $\alpha$ and $\beta$ fall between 0 and 1; in particular, $\alpha = \beta = 1$ represents pure loss aversion. $\lambda$ should be larger than 1 to describe the degree of loss aversion and to reproduce the S-shape in Figure 1.
It is apparent from Figure 1 that the value function is concave above the reference point ($v''(x) \le 0$, $x \ge 0$) and convex below the reference point ($v''(x) \ge 0$, $x \le 0$). It is steeper for losses than for gains ($v'(x) < v'(-x)$ for $x \ge 0$).
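As an illustration of the value function in (2), the following Python sketch evaluates it for a gain and a loss of equal size. The parameter values 0.88 and 2.25 are the estimates reported by Tversky and Kahneman [23] and are used here only as an assumption; the function name `value` is hypothetical and not part of the paper.

```python
def value(x, alpha=0.88, beta=0.88, lam=2.25):
    """CPT value of an outcome x measured relative to the reference point.

    alpha, beta, and lam are assumed illustrative parameters, not values
    calibrated for the route choice model.
    """
    if x >= 0:
        return x ** alpha           # gains: concave branch
    return -lam * (-x) ** beta      # losses: convex branch scaled by loss aversion


gain = value(10.0)
loss = value(-10.0)
print(f"v(+10) = {gain:.3f}, v(-10) = {loss:.3f}")
```

With equal exponents for gains and losses, the magnitude of the loss value is the loss aversion coefficient times the gain value, which is the asymmetry described above.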

Weighting Function.
Based on the research of Tversky and Kahneman [23], the weighting function is defined by two inversely S-shaped formulations:

$$w^{+}(p) = \frac{p^{\gamma}}{\bigl[p^{\gamma} + (1-p)^{\gamma}\bigr]^{1/\gamma}}, \qquad w^{-}(p) = \frac{p^{\delta}}{\bigl[p^{\delta} + (1-p)^{\delta}\bigr]^{1/\delta}}, \tag{3}$$

where $w^{+}$ and $w^{-}$ represent the weighting functions for gains and losses, respectively. $\gamma$ and $\delta$ indicate the level of distortion in probability judgment and should fall between 0 and 1. Decreasing $\gamma$ and $\delta$ causes the weighting function to become more curved and to cross the 45-degree line farther to the right. Figure 2 presents the shape of the weighting function. $w^{+}$ and $w^{-}$ are strictly increasing functions from the unit interval into itself satisfying $w^{+}(0) = w^{-}(0) = 0$ and $w^{+}(1) = w^{-}(1) = 1$ [27].
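The shape of the weighting function can be checked numerically. The sketch below implements the one-parameter form of Tversky and Kahneman [23]; the curvature values 0.61 (gains) and 0.69 (losses) are their reported estimates and are assumptions here, and the function names are hypothetical.

```python
def weight(p, curvature):
    """Inverse S-shaped probability weighting of a probability p in [0, 1]."""
    p = min(max(p, 0.0), 1.0)   # guard against tiny floating point overshoot
    num = p ** curvature
    return num / (num + (1.0 - p) ** curvature) ** (1.0 / curvature)


def w_plus(p, gamma=0.61):
    """Weighting function for gains (assumed parameter)."""
    return weight(p, gamma)


def w_minus(p, delta=0.69):
    """Weighting function for losses (assumed parameter)."""
    return weight(p, delta)


for p in (0.01, 0.5, 0.99):
    print(p, round(w_plus(p), 3), round(w_minus(p), 3))
```

Small probabilities are weighted above their face value and large probabilities below it, which is the overweighting of unlikely events described in Section 2.1.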

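Cumulative Prospect Value.
Based on CPT, the value of a prospect $f$ with outcomes $x_{-m} \le \dots \le x_{-1} \le x_0 \le x_1 \le \dots \le x_n$ (relative to the reference point) and probabilities $p_{-m}, \dots, p_n$ is represented as follows:

$$V(f) = V(f^{+}) + V(f^{-}), \tag{4}$$

where $V(f^{+})$ is the cumulative value of the prospect gains and $V(f^{-})$ is the cumulative value of the prospect losses. $V(\cdot)$ is a function of the decision weights $\pi$ and the value function $v(\cdot)$ and is defined as follows:

$$V(f^{+}) = \sum_{i=0}^{n} \pi_i^{+}\, v(x_i), \tag{5}$$

$$V(f^{-}) = \sum_{i=-m}^{0} \pi_i^{-}\, v(x_i), \tag{6}$$

$$\pi_n^{+} = w^{+}(p_n), \qquad \pi_i^{+} = w^{+}(p_i + \dots + p_n) - w^{+}(p_{i+1} + \dots + p_n), \quad 0 \le i \le n-1, \tag{7}$$

$$\pi_{-m}^{-} = w^{-}(p_{-m}), \qquad \pi_i^{-} = w^{-}(p_{-m} + \dots + p_i) - w^{-}(p_{-m} + \dots + p_{i-1}), \quad 1-m \le i \le 0. \tag{8}$$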
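A minimal Python sketch of the cumulative prospect value of a discrete prospect is given here. It assumes the value and weighting forms of Tversky and Kahneman [23] with their reported parameter estimates; all function names and the example prospect are hypothetical.

```python
def value(x, alpha=0.88, beta=0.88, lam=2.25):
    """Value function; parameters are assumed illustrative estimates."""
    return x ** alpha if x >= 0 else -lam * (-x) ** beta


def weight(p, curvature):
    """Inverse S-shaped probability weighting function."""
    p = min(max(p, 0.0), 1.0)   # guard against floating point overshoot
    num = p ** curvature
    return num / (num + (1.0 - p) ** curvature) ** (1.0 / curvature)


def cumulative_prospect_value(outcomes, probs, gamma=0.61, delta=0.69):
    """Outcomes are relative to the reference point; probs should sum to 1."""
    pairs = sorted(zip(outcomes, probs))                  # ascending outcomes
    losses = [(x, p) for x, p in pairs if x < 0]          # worst loss first
    gains = [(x, p) for x, p in pairs if x >= 0][::-1]    # best gain first
    total = 0.0
    for part, curvature in ((losses, delta), (gains, gamma)):
        cum = 0.0   # probability of the more extreme outcomes already processed
        for x, p in part:
            # decision weight: difference of weighted cumulative probabilities
            pi = weight(cum + p, curvature) - weight(cum, curvature)
            total += pi * value(x)
            cum += p
    return total


# Hypothetical prospect: equal chance of gaining or losing 10 units.
mixed = cumulative_prospect_value([10.0, -10.0], [0.5, 0.5])
print(round(mixed, 3))
```

Gains are accumulated from the best outcome downward and losses from the worst outcome upward, so each decision weight is the increment of the weighted cumulative probability. A symmetric gamble receives a negative value because of loss aversion.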

Evolutionary Game Theory.
Evolutionary game theory (EGT) combines game theory with the analysis of dynamic evolutionary processes. It defines a framework of contests, strategies, and analytics in which Darwinian competition can be modelled. EGT originated in 1973 with Maynard Smith and Price's formalization of the way in which such contests can be analyzed as "strategies" and of the mathematical criteria that can be used to predict the resulting prevalence of the competing strategies [28]. EGT differs from classical game theory by focusing more on the dynamics of strategy change, which is influenced not solely by the quality of the various competing strategies but also by the frequency with which those strategies are found in the population.

Evolutionary Stable Strategy.
The evolutionarily stable strategy (ESS) was defined and introduced by Maynard Smith and Price in a 1973 Nature paper [28]. An ESS is a strategy which, if adopted by a population in a given environment, cannot be invaded by any alternative strategy that is initially rare. An ESS is a Nash equilibrium solution that is "evolutionarily" stable: once it is fixed in a population, natural selection alone is sufficient to prevent alternative strategies from invading successfully. The ESS concept presumes that individuals have no control over their strategies and need not be aware of the game. To be an ESS, a strategy must be resistant to alternatives. Every ESS corresponds to a Nash equilibrium solution, but not all Nash equilibrium solutions are ESSs.
The mathematical definition of ESS can be expressed as follows. A strategy $x^*$ is an ESS if, for a very small positive $\varepsilon$, every $y \ne x^*$ meets the following condition:

$$u[x^*, \varepsilon y + (1-\varepsilon)x^*] > u[y, \varepsilon y + (1-\varepsilon)x^*]. \tag{9}$$

That is to say, when a small proportion $\varepsilon$ of the population adopts a mutant behavior $y$, taking strategy $x^*$ still yields the higher utility, and the stable state resulting from strategy $x^*$ cannot be invaded by a small mutation. Then the strategy $x^*$ is the ESS. Note that a mutant strategy is one that differs from the strategy currently adopted by the population.
4 Computational Intelligence and Neuroscience

There are two properties for a strategy $x^*$ to be an ESS. For all $y \ne x^*$, either

$$u(x^*, x^*) > u(y, x^*) \quad \text{or} \quad u(x^*, x^*) = u(y, x^*) \ \text{and} \ u(x^*, y) > u(y, y). \tag{10}$$

The first property is called a strict Nash equilibrium solution.
The second property means that although strategy $y$ is neutral with respect to the payoff against strategy $x^*$, the population of players who continue to play strategy $x^*$ has an advantage when playing against $y$. The limitation of ESS is that it is a static equilibrium concept that does not consider the dynamic evolutionary process. The stability of an evolutionary equilibrium should be associated with the specific evolutionary process.
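The two ESS properties can be checked mechanically for a pure strategy in a symmetric game with two strategies. The following Python sketch is restricted to that 2x2 case, where testing against the other pure strategy is sufficient; the function name and the Hawk-Dove payoffs are hypothetical illustrations and are not taken from the route choice model.

```python
def is_pure_ess(payoff, i):
    """Check whether pure strategy i is an ESS of a symmetric 2x2 game.

    payoff[r][c] is the payoff to a player using strategy r against strategy c.
    """
    j = 1 - i                                  # the only alternative strategy
    if payoff[i][i] > payoff[j][i]:            # first property: strict Nash
        return True
    if payoff[i][i] == payoff[j][i]:           # mutant is neutral against i
        return payoff[i][j] > payoff[j][j]     # second property
    return False


# Hypothetical Hawk-Dove game: resource worth 2, cost of fighting 1.
resource, cost = 2.0, 1.0
hawk_dove = [[(resource - cost) / 2, resource],
             [0.0, resource / 2]]

print(is_pure_ess(hawk_dove, 0), is_pure_ess(hawk_dove, 1))
```

In this example the first strategy satisfies the strict Nash property, while the second strategy is not even a best reply to itself, so only the first is an ESS.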

Replicator Equations.
The common methodology for studying the evolutionary process is through selection dynamics, which describe the growth rate of the proportion of people using a certain strategy. The basic expression of the selection dynamics is

$$\dot{x}_i = g_i(x)\,x_i, \tag{11}$$

where $x_i(t)$ is the proportion of the people who choose strategy $i$ at time $t$ and $g_i(x)$ represents the specific selection process; different learning mechanisms correspond to different functional forms. The primary characteristic of the selection dynamics is that a pure strategy taken by no one in the initial state will never be used. Participants can only imitate the existing strategies; that is, the strategies do not reflect mutation. This feature can be expressed mathematically as follows:

$$x_i(0) = 0 \;\Rightarrow\; x_i(t) = 0, \quad \forall t. \tag{12}$$

Among all kinds of game dynamics, the replicator dynamics (RD) of Taylor and Jonker [29] are the most widely studied, and many related conclusions have been obtained. The replicator dynamics are

$$\dot{x}_i = x_i\,\bigl[u(s_i, x) - \bar{u}(x)\bigr]. \tag{13}$$

In RD, each participant represents one kind of group with a uniform population distribution, and the participants insist on taking a pure strategy $s_i$. The growth rate $\dot{x}_i / x_i$ of the proportion taking the pure strategy $s_i$ is a strictly increasing function of the difference between the payoff $u(s_i, x)$ and the average payoff $\bar{u}(x)$.
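The replicator dynamics can be integrated numerically to see both the convergence behavior and the property that an absent strategy never appears. The sketch below uses a simple forward Euler step, which is an assumption made for brevity; the payoff matrix is a hypothetical Hawk-Dove game, and the step size and function names are illustrative.

```python
def replicator_step(x, payoff, dt):
    """One forward Euler step of the single-population replicator dynamics."""
    n = len(x)
    fitness = [sum(payoff[i][j] * x[j] for j in range(n)) for i in range(n)]
    average = sum(x[i] * fitness[i] for i in range(n))
    return [x[i] + dt * x[i] * (fitness[i] - average) for i in range(n)]


def simulate(x0, payoff, dt=0.01, steps=5000):
    """Iterate the Euler step from the initial strategy shares x0."""
    x = list(x0)
    for _ in range(steps):
        x = replicator_step(x, payoff, dt)
    return x


# Hypothetical Hawk-Dove payoffs whose mixed equilibrium is half hawks.
payoff = [[-1.0, 2.0],
          [0.0, 1.0]]

final = simulate([0.1, 0.9], payoff)    # both strategies initially present
absent = simulate([0.0, 1.0], payoff)   # first strategy initially absent
print([round(v, 3) for v in final], absent)
```

Starting from a mixed population the shares approach the interior equilibrium, whereas a strategy with zero initial share stays at zero, in line with the selection dynamics property stated above.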

The Relationship between CPT and EGT.
Cumulative prospect theory and evolutionary game theory deal with bounded rationality from two different perspectives: the former handles individual irrationality from the perspective of psychological perception, while the latter focuses on limited rationality in selection and decision [30]. CPT research shows that people tend to magnify small probabilities and to discount large probabilities and that they are more sensitive to losses than to gains of the same size. Evolutionary game theory explains the mechanism by which players are programmed to follow a certain choice scheme and to behave or react according to the current system state. Identifying the strategies that participants converge to is the main concern of evolutionary game theory as a theory of the laws of decision.

Route Choice Model Formulation.
In this section, we take a two-route network as an example to illustrate the route choice modeling process. The network consists of route $a$ and route $b$. Route $a$ is the shortest route, and route $b$ is the route recommended by the VMS when route $a$ is congested. Route $b$ is longer than route $a$, so switching from route $a$ to route $b$ involves a detouring distance. The VMS is installed near the "O" point to display the real-time traffic information (see Figure 3). We assume that there are two types of drivers in this network. The first type prefers the shortest route; we call these drivers rigid demand drivers. The other type is prone to switching to the recommended route; we call these drivers flexible demand drivers. Under the influence of traffic information, all drivers condition their route choice decisions on their perceived travel time (payoff) of each possible route. The flow chart of the route choice modeling process is shown in Figure 4.
Step 1 (determine the cumulative prospect value of each alternative route). For a specific road network, drivers determine the perceived time of each alternative route based on their previous travel experiences. The travel times of each alternative route are assumed to be independent and identically distributed. According to the central limit theorem, the perceived travel time of an alternative route approximately follows a normal distribution. The distribution of the perceived travel time is written as follows:

$$\tilde{T}_i \sim N\!\left(\bar{t}_i, \frac{\sigma_i^2}{n}\right), \qquad \bar{t}_i = \frac{1}{n}\sum_{k=1}^{n} t_i^{(k)}, \tag{14}$$

where $\tilde{T}_i$ is the perceived travel time of route $i$; $\bar{t}_i$ is the average travel time of route $i$; $t_i^{(k)}$ is the travel time of route $i$ on the $k$th trip; $\sigma_i^2$ is the travel time variance of route $i$; and $n$ is the number of travels.
Because the free flow time reflects the physical properties of the route to a certain extent, the reference point in this research is defined as the average value of the free flow time of all alternative routes. The reference point is represented as follows:

$$t_0 = \frac{1}{m}\sum_{i=1}^{m} t_i^{\text{free}}, \tag{15}$$

where $t_0$ is the reference point; $t_i^{\text{free}}$ is the free flow time of route $i$; and $m$ is the number of alternative routes. In this network, $m = 2$.
Based on CPT, we assign each route $i$ a value $V(i)$, its cumulative prospect value. The value function of the two alternative routes can be obtained from (2). It is worth noting that $x$ in (2) is expressed as $t_0 - \tilde{T}_i$ in our research, so that a travel time shorter than the reference point counts as a gain. Based on the probability of each $x$, the weighting function can be obtained by (3). Thus, the cumulative prospect values of the two routes, $V(a)$ and $V(b)$, are calculated by (4) to (8), respectively.
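The first part of Step 1 can be sketched in Python as follows. The free flow times and experienced travel times are hypothetical numbers chosen only for illustration, the function names are not from the paper, and the variance of the perceived time is taken as the sample variance divided by the number of trips, which is an assumption about how the central limit theorem is applied.

```python
from statistics import mean, variance


def reference_point(free_flow_times):
    """Reference point: average free flow time over all alternative routes."""
    return mean(free_flow_times)


def perceived_distribution(travel_times):
    """Mean and variance of the perceived travel time (assumed normal)."""
    n = len(travel_times)
    # variance() is the sample variance; dividing by n gives the variance
    # of the sample mean, assumed here to describe the perceived time
    return mean(travel_times), variance(travel_times) / n


# Hypothetical data in minutes for the shortest route "a" and the detour "b".
free_flow = {"a": 10.0, "b": 14.0}
experienced = {"a": [12.0, 18.0, 11.0, 25.0],
               "b": [15.0, 16.0, 14.0, 17.0]}

t0 = reference_point(list(free_flow.values()))
# Outcome relative to the reference point: positive values are gains.
outcomes = {route: [t0 - t for t in times] for route, times in experienced.items()}
mean_b, var_b = perceived_distribution(experienced["b"])
print(t0, outcomes["a"], mean_b, round(var_b, 4))
```

The relative outcomes produced this way are the inputs that the value function in (2) and the decision weights in (7) and (8) would then operate on.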
Step 2 (determine the payoff under different decision conditions). During a travel activity, the variable message signs provide travel-related information in real time. Each type of driver has two route choice strategies: $S_1$: choose the shortest route (route $a$); $S_2$: choose the recommended route (route $b$).
According to the difference in individual preference, we assume that participant $P_1$ consists of the rigid demand drivers and participant $P_2$ consists of the flexible demand drivers. $P_1$ and $P_2$ play a game in this transportation system, and each aims to maximize its own utility. During the game, the choice result is not determined in advance but changes with the learning process and with the drivers' strategy adjustments in response to their experiences and the real-time traffic information [31]. The payoff of each participant under the different decision conditions is given by the payoff matrix in Table 1. When the two participants choose different routes, the participant who chooses route $a$ benefits from the good traffic condition, while the participant who chooses route $b$ incurs a loss because of the increased detouring distance. Considering the rigid demand of $P_1$, we can conclude that $V_1 > V_2$ and $c_2 > c_1$. If all the participants choose route $b$, then, because of the difference in individual preference, the utility reduction of $P_1$ is larger than in the case where $P_1$ chooses route $b$ and $P_2$ chooses route $a$; that is, $c_3 > c_2$. Similarly, $c_4 > c_1$. Moreover, the utility reduction of $P_1$ from route $a$ to route $b$ is larger than that of $P_2$; that is, $c_2 > c_4$. From the above analysis, it can be summarized that $c_3 > c_2 > c_4 > c_1$ and $V_1 > V_2$.
Step 3 (construct the route choice model). Assume that the probabilities of $P_1$ and $P_2$ choosing route $b$ are $x$ and $y$ ($x, y \in [0, 1]$), respectively. Accordingly, the probabilities of choosing route $a$ are $1 - x$ and $1 - y$, respectively.
In conclusion, the route choice game model which embeds CPT can be expressed as follows:

Players: $P_1$, $P_2$
Strategy set: $\{S_1, S_2\}$
Payoff matrix: see Table 1. (16)
Step 4 (dynamic evolutionary analysis). Section 3.3 will discuss the dynamic evolutionary process in detail.

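Dynamic Evolutionary Analysis.
Let $x$ and $y$ denote the proportions of $P_1$ and $P_2$ choosing strategy $S_2$, and write $u_k(S_i, S_j)$ for the Table 1 payoff of $P_k$ when $P_1$ plays $S_i$ and $P_2$ plays $S_j$. The utility of strategy $S_2$ for $P_1$, $U_1^{S_2}$, consists of two parts. The first part is $P_1$'s utility when $P_1$ chooses route $b$ while $P_2$ chooses route $a$. The second part is $P_1$'s utility when $P_1$ chooses route $b$ while $P_2$ also chooses route $b$. The utility of strategy $S_1$ for $P_1$, $U_1^{S_1}$, is obtained by the same principle. $U_1^{S_2}$ and $U_1^{S_1}$ are represented as follows:

$$U_1^{S_2} = (1-y)\,u_1(S_2, S_1) + y\,u_1(S_2, S_2), \qquad U_1^{S_1} = (1-y)\,u_1(S_1, S_1) + y\,u_1(S_1, S_2).$$

The average utility of $P_1$ is the average of the utilities of $S_1$ and $S_2$, each multiplied by the proportion of $P_1$ selecting that strategy. For the sake of convenience in the discussion, $V_i = V$ ($i = 1, 2$) is assumed. Then, the average utility of strategies $S_1$ and $S_2$ for $P_1$ is expressed as follows:

$$\bar{U}_1 = (1-x)\,U_1^{S_1} + x\,U_1^{S_2}.$$

In evolutionary game theory, the rate of change of the strategy proportions is the core of the analysis of the boundedly rational game. The rate of change depends on the players' learning ability and learning rate. This process can be represented by the replicator dynamics. The replicator dynamics of strategy $S_2$ for participant $P_1$ are

$$F(x) = \frac{dx}{dt} = x\,\bigl(U_1^{S_2} - \bar{U}_1\bigr) = x(1-x)\bigl(U_1^{S_2} - U_1^{S_1}\bigr).$$

For participant $P_2$, the utilities of strategy $S_2$ and strategy $S_1$ are expressed as follows, respectively:

$$U_2^{S_2} = (1-x)\,u_2(S_1, S_2) + x\,u_2(S_2, S_2), \qquad U_2^{S_1} = (1-x)\,u_2(S_1, S_1) + x\,u_2(S_2, S_1).$$

The average utility of strategies $S_1$ and $S_2$ for $P_2$ is as follows:

$$\bar{U}_2 = (1-y)\,U_2^{S_1} + y\,U_2^{S_2}.$$

The replicator dynamics of strategy $S_2$ for $P_2$ are

$$G(y) = \frac{dy}{dt} = y\,\bigl(U_2^{S_2} - \bar{U}_2\bigr) = y(1-y)\bigl(U_2^{S_2} - U_2^{S_1}\bigr).$$

A fixed point of the replicator dynamics is a population state that satisfies $dx/dt = 0$ and $dy/dt = 0$. A fixed point describes the situation in which there is no further evolution. The fixed points of this route choice system are (0, 0), (0, 1), (1, 0), (1, 1), and an interior point $(x^*, y^*)$. We use the Jacobian matrix to discuss the ESS under different evolution paths.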
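With $F = dx/dt$ and $G = dy/dt$ denoting the replicator dynamics of $P_1$ and $P_2$, where $x$ and $y$ are the proportions of $P_1$ and $P_2$ choosing strategy $S_2$, the Jacobian matrix is

$$J = \begin{pmatrix} \partial F/\partial x & \partial F/\partial y \\ \partial G/\partial x & \partial G/\partial y \end{pmatrix}.$$

The determinant of the Jacobian matrix is

$$\det J = \frac{\partial F}{\partial x}\,\frac{\partial G}{\partial y} - \frac{\partial F}{\partial y}\,\frac{\partial G}{\partial x}.$$

The trace of the Jacobian matrix is

$$\operatorname{tr} J = \frac{\partial F}{\partial x} + \frac{\partial G}{\partial y}.$$

(1) $\theta = 0$. The practical meaning of $\theta = 0$ is that no VMS is installed in the transportation system. There are four equilibrium points under this scenario: (0, 0), (0, 1), (1, 0), and (1, 1). The result of the evolutionary equilibrium analysis is given in Table 2.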
From Table 2 it can be seen that (0, 0) is a strictly dominant pure strategy, so it is the ESS. The ESS means that, in the long run, the system tends to the evolutionarily stable state in which the proportions of $P_1$ and $P_2$ choosing strategy $S_2$ are $x = 0$ and $y = 0$, and the stable state will not be disturbed by a small portion of mutation. In other words, all the drivers will choose the shortest route (route $a$) when they cannot get traffic information from the VMS.

[Table fragment, row for the case $\theta > c_2$ (Table 5)]
Equilibrium point | det J | Sign of det J | tr J | Sign of tr J | Local stability
(0, 0) | $(\theta - c_2)(\theta - c_1)$ | + | $(\theta - c_2) + (\theta - c_1)$ | + | Instability

(2) $0 < \theta < c_1$. The practical meaning of $0 < \theta < c_1$ is that the travel efficiency loss caused by the congestion that arises when all of the drivers choose route $a$ is small enough. The equilibrium points are (0, 0), (0, 1), (1, 0), and (1, 1). From the evolutionary equilibrium analysis (Table 3), (0, 0) is the strictly dominant pure strategy. This suggests that when the travel efficiency losses are small, all the drivers will choose route $a$, and the transportation system will progress toward the evolutionarily stable state in which the proportions of $P_1$ and $P_2$ selecting route $b$ are $x = 0$ and $y = 0$.
(3) $c_1 < \theta < c_2$. The practical meaning of $c_1 < \theta < c_2$ is that the travel efficiency loss shown on the VMS is between $c_1$ and $c_2$. In this scenario, (0, 0), (0, 1), (1, 0), and (1, 1) are the equilibrium points. The evolutionary equilibrium result is shown in Table 4. Table 4 shows that the equilibrium point (0, 1) is locally stable, and this strategy is the ESS of the dynamic route choice system. Under the influence of information, the system will develop towards the evolutionarily stable state in which $P_1$ chooses route $a$ while $P_2$ chooses route $b$.
(4) $\theta > c_2$. The practical meaning of $\theta > c_2$ is that the travel efficiency loss caused by the congestion of route $a$ is great. There are four equilibrium points under this scenario: (0, 0), (0, 1), (1, 0), and (1, 1). The result of the evolutionary equilibrium analysis is given in Table 5.
From Table 5, it can be concluded that the dynamic system has two stable pure strategy profiles, (0, 1) and (1, 0). Both equilibrium points are ESSs. This means that when the drivers in the system receive information indicating great losses, the system state will progress towards the evolutionarily stable state $x = 0$, $y = 1$ or $x = 1$, $y = 0$. That is to say, under the influence of traffic information, one of the participants will switch to route $b$, while the other will persist in choosing route $a$. Which stable state the dynamic system progresses to is related to the initial value of the payoff matrix.
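The determinant and trace test used in the four cases can be illustrated with a short Python sketch. Because the entries of Table 1 are not reproduced here, the payoff differences below are purely hypothetical numbers chosen so that each participant prefers the recommended route only when the other participant stays on the shortest route, which corresponds qualitatively to case (4); the names `d1`, `d2`, `jacobian`, and `classify` are likewise assumptions.

```python
# Hypothetical payoff differences (utility of S2 minus utility of S1).
# d1[j]: for P1 when P2 plays strategy j (0 = S1, 1 = S2).
# d2[i]: for P2 when P1 plays strategy i (0 = S1, 1 = S2).
d1 = (2.0, -6.0)
d2 = (3.0, -4.0)


def jacobian(x, y):
    """Jacobian of dx/dt = x(1-x)A(y), dy/dt = y(1-y)B(x) at the point (x, y)."""
    a = d1[0] + (d1[1] - d1[0]) * y     # P1's advantage of S2 given y
    b = d2[0] + (d2[1] - d2[0]) * x     # P2's advantage of S2 given x
    return [[(1 - 2 * x) * a, x * (1 - x) * (d1[1] - d1[0])],
            [y * (1 - y) * (d2[1] - d2[0]), (1 - 2 * y) * b]]


def classify(x, y):
    """Local stability from the signs of det J and tr J."""
    j = jacobian(x, y)
    det = j[0][0] * j[1][1] - j[0][1] * j[1][0]
    tr = j[0][0] + j[1][1]
    if det > 0 and tr < 0:
        return "ESS"
    if det > 0 and tr > 0:
        return "unstable"
    if det < 0:
        return "saddle"
    return "undetermined"


corners = [(0, 0), (0, 1), (1, 0), (1, 1)]
results = {point: classify(*point) for point in corners}
interior = classify(3.0 / 7.0, 0.25)    # interior fixed point for these numbers
print(results, interior)
```

For these assumed numbers the two corners in which the participants split across the routes are stable, the corners in which both choose the same route are unstable, and the interior fixed point is a saddle.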
The above uncertainty can be resolved by the stability theorem of differential equations, which distinguishes the different stable states. Expressed mathematically, for $F(x) = dx/dt$, if $F(x) < 0$ for all $x > x^*$ in a neighborhood of the fixed point $x^*$, that is, if $F'(x^*) < 0$, then the system is stable at the point $x^*$.
Substituting $x^* = 0$ and $x^* = 1$ into $F'(x)$, the stability of the system at each equilibrium point is judged from the sign of $F'(x^*)$. The proportion of $P_1$ choosing route $b$ will eventually progress to different stable states depending on the initial value of the route selection proportion of $P_2$.
In the case where $F'(1) < 0$, the stable point of the system is $x^* = 1$. This means that the proportion of $P_1$ choosing route $b$ will stabilize at 100% over time.
The replicator dynamic phase diagram of group $P_1$ is shown in Figure 5. Substituting $y^* = 0$ and $y^* = 1$ into $G'(y)$, where $G(y) = dy/dt$, the changing process of $y$ is analyzed in Figure 6. When $x = (c_1 - \theta)/(c_1 - \theta - c_2 - c_4)$, $G(y) \equiv 0$. This indicates that, whatever the initial proportion of $P_2$ choosing route $b$ is, the dynamic transportation system is stable.
In the case where $G'(1) < 0$, the stable point of the system is $y^* = 1$. This means that the proportion of participant $P_2$ choosing route $b$ will increase to 100% as time goes by.
The stability of groups $P_1$ and $P_2$ is illustrated in Figure 7.
When the initial state of $x$ and $y$ is in region A, the ESS is $x^* = 0$, $y^* = 1$, and the dynamic transportation system will evolve towards the stable state in which $P_1$ chooses route $a$ while $P_2$ chooses route $b$. When the initial state is in region C, the ESS is $x^* = 1$, $y^* = 0$, and the system will finally stabilize at the state in which $P_1$ chooses route $b$ while $P_2$ chooses route $a$. When the initial state is in region B or D, the direction of evolution is uncertain. The system may evolve to region A and converge to (0, 1) or evolve to region C and converge to (1, 0).
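The dependence of the final state on the initial state can be illustrated by integrating the two-population replicator dynamics from different starting points. The sketch below assumes hypothetical payoff differences (the Table 1 entries are not reproduced here) and a forward Euler scheme with an arbitrary step size; `d1`, `d2`, and `evolve` are illustrative names.

```python
# Hypothetical payoff differences (utility of S2 minus utility of S1).
# d1[j]: for P1 when P2 plays strategy j; d2[i]: for P2 when P1 plays strategy i.
d1 = (2.0, -6.0)
d2 = (3.0, -4.0)


def evolve(x, y, dt=0.01, steps=20000):
    """Integrate dx/dt = x(1-x)A(y) and dy/dt = y(1-y)B(x) by forward Euler."""
    for _ in range(steps):
        a = (1 - y) * d1[0] + y * d1[1]    # P1's advantage of S2 over S1
        b = (1 - x) * d2[0] + x * d2[1]    # P2's advantage of S2 over S1
        x, y = x + dt * x * (1 - x) * a, y + dt * y * (1 - y) * b
    return x, y


# Few P1 drivers but many P2 drivers initially on the recommended route.
end_a = evolve(0.1, 0.9)
# Many P1 drivers but few P2 drivers initially on the recommended route.
end_c = evolve(0.9, 0.1)
print(end_a, end_c)
```

Under these assumed numbers the first starting point converges to the state in which only $P_2$ uses the recommended route and the second converges to the state in which only $P_1$ does, which mirrors the two stable states discussed above.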

Discussion and Conclusion
This paper has embedded cumulative prospect theory into evolutionary game theory in order to integrate the individual perception and decision schemes with the group learning and evolutions. This paper discussed the drivers' route choice behaviors and the corresponding stable state of the dynamic traffic system according to the different information shown on VMS.
When there is no VMS in the transportation system ($\theta = 0$), all the drivers choose the shortest route (route $a$). When the travel efficiency losses displayed on the VMS are small enough ($0 < \theta < c_1$), the impact of the VMS on route choice is negligible, and the evolution analysis shows that all the drivers still choose the shortest route. When the travel efficiency losses shown on the VMS are moderate ($c_1 < \theta < c_2$), the transportation system progresses to the evolutionarily stable state in which the drivers with rigid demand choose route $a$ while the drivers with flexible demand choose route $b$. When the travel efficiency loss is large enough ($\theta > c_2$), the analytic result suggests that the drivers are sensitive to the efficiency losses, and the transportation system progresses towards the evolutionarily stable state in which one type of driver chooses route $a$ and the other type chooses route $b$. Our findings indicate that the stable state of the dynamic route choice system is sensitive both to the traffic information and to the initial state of the transportation system.
However, there are some limitations in this research. First, the modeling method presented here still needs to be validated empirically. Second, our findings are valid only under the assumption that the drivers' characteristics are identically distributed within each participant group.
We suggest that survey data should be collected in order to calibrate the parameters of the proposed model and to investigate the capability of the model to explain field observations. In the future, the effect of drivers with different characteristic distributions should be investigated.
The results of this study may be useful for understanding drivers' route choice behavior and for alleviating urban traffic congestion. Potential applications of the proposed method include modeling and describing the evolution of group choice from the perspective of individual risk attitudes and decision-making schemes. The method is suitable for capturing the adaptation process of group choice.