Modeling Stochastic Route Choice Behaviors with Equivalent Impedance

A Logit-based route choice model is proposed to address the overlapping and scaling problems in the traditional multinomial Logit (MNL) model. The nonoverlapping links are defined as a subnetwork, and its equivalent impedance is explicitly calculated in order to simplify network analysis. The overlapping links are repeatedly merged into subnetworks with Logit-based equivalent travel costs. The choice set at each intersection comprises only the virtual equivalent routes without overlapping. In order to capture heterogeneity in perception errors across networks of different sizes, different scale parameters are assigned to subnetworks, and they are linked to the topological relationships to avoid estimation burden. The proposed model provides an alternative method to model stochastic route choice behaviors without the overlapping and scaling problems, and it still maintains the simple, closed-form expression of the MNL model. A link-based loading algorithm based on Dial's algorithm is proposed to obviate route enumeration, and it is suitable for large-scale networks. Finally, a comparison between the proposed model and other route choice models is given by numerical examples.


Introduction
Stochastic route choice models capture drivers' perception errors: drivers are assumed to be rational choosers who minimize their perceived costs rather than the actual least costs. Such a model provides the probability that each path is chosen, and thus the future traffic demand can be calculated for transportation planning and management. The multinomial Logit (MNL) model is a basic type of Logit model and is used to model relationships between a polytomous response variable and a set of regressor variables. Under this assumption, the choice probability in the MNL route choice model depends on the routes' costs, also called impedance. Alternatively, the Logit model can be formulated as a mathematical programming problem in equilibrium traffic assignment, which includes an extremely large number of available routes [1]. Dial [2] proposed an efficient algorithm for fixed-cost networks, which can be used to solve the Logit route choice problem by the method of successive averages [3]. Recently, path-based solution algorithms have been proposed for the Fisk model [4]. Dial's algorithm is very efficient in solving the Logit network loading problem because it considers only a subset of alternative routes, namely, "reasonable" routes. However, it was found that some routes with smaller costs are excluded while routes with larger costs are included among the reasonable paths, and Dial's algorithm sometimes fails in real-world applications due to this deficiency of the "reasonable" route definition [5]. All-route Logit models have been proposed to address this problem; they consider all possible routes, including routes with finite and infinite loops, so that the number of routes is infinite [5][6][7]. Including loops in all-route models makes the independence of irrelevant alternatives (IIA) problem even worse. Since too many links are defined as "unreasonable" in Dial's algorithm, Li et al. [8] proposed a topological-scan-based algorithm to find more "reasonable" routes by removing an "unreasonable" link only when a cycle exists. Other efforts to solve the "reasonable" route issue include an algorithm excluding all cyclic flows [9], but it is difficult to apply to large-scale networks because the algorithm requires route enumeration.
There are two main deficiencies of the Logit route choice model due to its assumption of independently and identically distributed (IID) extreme value error terms. First, the MNL model cannot represent the degree of correlation among paths because of its IIA (independence of irrelevant alternatives) property, which leads to inflated probabilities for correlated routes. Second, it cannot represent the heterogeneity in perception errors, which can produce unreasonable results; this is the scaling problem. The improvements for the first issue, the correlation of paths, fall into three categories.
(1) MNL Model with Utility Correction. A correction term is added to the utility function to account for the degree of correlation among paths. The motivation is that correlated routes receive unnecessarily high choice probabilities, so an additional cost reflecting the degree of correlation can reduce their attractiveness. The C-Logit model introduces an attribute called the commonality factor (CF) to represent the degree of correlation among routes [10]. The value of the CF term is proportional to the correlated parts. The Path Size Logit (PSL) model was proposed with a similar idea [11,12]. Its correction term, called the Path Size (PS), is derived from the property of the Gumbel distribution to aggregate alternatives. A Path Size Correction (PSC) term was also proposed, similar to the PSL model [13]. However, the PSL and the Path Size Correction Logit (PSCL) models might be sensitive to the composition of the route choice set [13][14][15].
(2) Nested Structure. The motivation is to group routes into a nest if they share the same links. The link-based Cross Nested Logit (CNL) model is the most widely used CNL route choice model [16][17][18]. It treats each link as a nest, and the routes sharing the same link belong to the same nest. Some studies provide approximate formulas for the CNL model [16,18]. In addition, some researchers suggest that the parameters can be obtained by solving a system of equations describing the correlations and constraints [19,20]. The Paired Combinatorial Logit (PCL) model also has a nested structure, and it processes routes in pairs [21][22][23]. The degree of correlation between paths is captured by a similarity index, and specifications are provided by Gliebe [24] and by Prashker and Bekhor [25].
(3) Other Distributions. The mixed Logit [26][27][28] and the probit [29][30][31] models are the most commonly used. The mixed Logit model incorporates other distributions, mainly the normal distribution, into the Logit model to represent the degree of correlation among paths. The probit model assumes normally distributed error terms, but it does not have a closed-form expression when there are more than three alternatives. Estimation and prediction with both models require simulation-based methods. Studies [18,28] show that simulation-based methods require a large number of draws to achieve stable predictions. Moreover, there is currently no efficient path-based stochastic user equilibrium (SUE) traffic assignment algorithm for solving these route choice models [32].
Regarding the scaling problem, Pravinvongvuth and Chen [32] proposed an origin-destination-specific scaling (dispersion) factor to represent the different scales of diverse networks. Chen et al. [23] examined the scaling effect when applying route choice models in stochastic equilibrium models. Miwa et al. [33] examined how to set the scale (dispersion) parameter and applied a multiclass stochastic user equilibrium (SUE) assignment model to consider differences in drivers' perception errors. The relative impedance was proposed with the same motivation, and improved models were derived from the properties of the extreme value distribution [34]. In the CNL model, each link has its own scale parameter to represent the perception error. In applications it is usually fixed according to the link-path topology [16][17][18].
Despite these weaknesses, the Logit route choice model is widely used because it has a closed-form probability expression and an equivalent mathematical programming formulation and can be solved by efficient algorithms. For example, the Logit route choice model has been applied to dynamic stochastic route choice [35,36].
In the following sections, a Logit-based route choice model is proposed to reduce the effects of the overlapping problem by using a new Logit dispersion parameter setting and a new definition of the Logit equivalent link, followed by an efficient network loading algorithm based on Dial's algorithm that obviates route enumeration. Numerical examples then compare the proposed model with some previous models. The final section concludes.

Methodology
2.1. Logit Parameters. The probability distribution of the Logit route choice model depends heavily on the Logit dispersion parameter; moreover, the probability distribution depends only on the differences between the costs of the alternatives and is independent of the mean travel costs of the routes. Consider a road network $G(N, A)$ comprising a set of nodes $N = \{i\}$ and a set of links $A = \{a = (i \to j) \mid i, j \in N\}$. Let $t_a$ denote the travel time on link $a$; then the route travel time $c_k^{rs}$ on route $k$ connecting origin $r$ and destination $s$ can be calculated as follows:

$$c_k^{rs} = \sum_{a \in A} t_a \delta_{a,k}, \tag{1}$$

where $\delta_{a,k} = 1$ if link $a$ is a part of route $k$ and $\delta_{a,k} = 0$ otherwise. The probability that drivers choose route $k$ is

$$P_k^{rs} = \frac{\exp(-\theta c_k^{rs})}{\sum_{l \in K_{rs}} \exp(-\theta c_l^{rs})}, \tag{2}$$

where $K_{rs}$ is the set of all routes connecting origin $r$ and destination $s$ and $\theta$ is the Logit dispersion parameter. Consider the three networks in Figure 1 with travel times of 100 and 105. The probabilities of route choice obtained by different methods are given in the figure.
It is natural to conclude that the probability of choosing the upper route is close to that of the lower route in Figure 1(a) and that more drivers should choose the upper route in Figure 1(b). However, the Logit route choice model gives the same probabilities for all three networks: the probability of choosing the upper route is 92% and that of the lower route is 8% when the dispersion parameter of the Logit model is set to $\theta = 0.5$ in (2). This result is unreasonable for the network in Figure 1(a) but generally reasonable for the networks in Figures 1(b) and 1(c). This deficiency can be addressed by normalizing the travel cost by the minimum OD travel time; that is, the dispersion parameter $\theta$ in (2) is replaced by

$$\theta = \frac{\bar{\theta}}{c_{\min}^{rs}}, \tag{3}$$

where $\bar{\theta}$ is constant for the whole network and $c_{\min}^{rs}$ is the minimum travel time between origin $r$ and destination $s$. Under the new parameter setting, (2) becomes

$$P_k^{rs} = \frac{\exp(-\bar{\theta} c_k^{rs} / c_{\min}^{rs})}{\sum_{l \in K_{rs}} \exp(-\bar{\theta} c_l^{rs} / c_{\min}^{rs})}. \tag{4}$$

The value of the new parameter $\bar{\theta}$ is suggested to be 3∼4 for general networks in previous studies. By setting $\bar{\theta} = 2.5$, the probabilities remain unchanged in Figure 1(b), while the probabilities in the network in Figure 1(a) move to a more plausible pattern of (53%, 47%), as shown in Figure 1. However, the new parameter does not completely solve the overlapping issue of the Logit route choice model; an unexpected probability distribution is observed in the network in Figure 1(c) under the new parameter setting. Clearly, drivers should not make different choices in the networks in Figures 1(b) and 1(c), but (4) predicts that more drivers choose the upper route in the network in Figure 1(b) than in the network in Figure 1(c).
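The effect of the two parameter settings can be checked numerically. The following is a minimal sketch, not the authors' code: the function names are hypothetical, `theta_bar` stands for the network-wide constant introduced in (3), and the route costs (5, 10) for Figure 1(b) are an assumption inferred from the surrounding discussion.

```python
import math

def logit_probabilities(costs, theta):
    """MNL route choice probabilities for a fixed dispersion parameter, as in (2)."""
    # Shift by the minimum cost for numerical stability; Logit probabilities
    # depend only on cost differences, so the result is unchanged.
    c_min = min(costs)
    weights = [math.exp(-theta * (c - c_min)) for c in costs]
    total = sum(weights)
    return [w / total for w in weights]

def normalized_logit_probabilities(costs, theta_bar):
    """OD-normalized variant: theta is replaced by theta_bar / minimum cost, as in (3)-(4)."""
    return logit_probabilities(costs, theta_bar / min(costs))

fig_1a = [100.0, 105.0]  # two independent routes
fig_1b = [5.0, 10.0]     # assumed costs: same difference, much shorter routes

print(logit_probabilities(fig_1a, 0.5))             # fixed theta
print(logit_probabilities(fig_1b, 0.5))             # fixed theta, identical result
print(normalized_logit_probabilities(fig_1a, 2.5))  # normalized
print(normalized_logit_probabilities(fig_1b, 2.5))  # normalized
```

With a fixed parameter both networks give roughly (92%, 8%); with the normalized parameter the longer network moves to roughly (53%, 47%) while the shorter one is unchanged, which matches the values quoted in the text.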
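To overcome the overlapping problem, the Logit parameter in (3) is further modified to

$$\theta_{ij} = \frac{\bar{\theta}}{c_{\min}^{ij}}, \tag{5}$$

where $c_{\min}^{ij}$ represents the minimum travel time of the subnetwork between node $i$ and node $j$, which replaces $c_{\min}^{rs}$ in (3). Now the probability of choosing a subroute $k$ connecting $i$ and $j$ is

$$P_k^{ij} = \frac{\exp(-\bar{\theta} c_k^{ij} / c_{\min}^{ij})}{\sum_{l \in K_{ij}} \exp(-\bar{\theta} c_l^{ij} / c_{\min}^{ij})}, \tag{6}$$

where $K_{ij}$ is the set of subroutes connecting $i$ and $j$. The modification indicates that drivers can make route choices when they arrive at an intersection rather than only at the origin, which is closer to the nature of drivers' route choice behavior. For example, a driver faces two routes with travel times 100 and 105 in the network in Figure 1(c) if he makes the decision according to route travel times at the origin, but his real choice is between routes with travel times 5 and 10 when he arrives at the node where the routes diverge.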
In particular, one can select a node pair $i - j$ such that the subnetwork connecting $i$ and $j$ does not contain overlapping routes, so that the overlapping part of the routes is not considered in the route choice. Reconsidering the network in Figure 1(c), the node pair enclosing the two diverging links can be chosen, since the subnetwork between these nodes has no overlapping link, and the new probability distribution is more reasonable: 92% of drivers choose the upper route while 8% choose the lower route. The probabilities of route choices for different Logit parameter settings are given in Figure 1, which shows that the proposed parameter setting produces a more reasonable probability distribution than previous models.
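The difference between deciding at the origin and deciding at the intersection can be illustrated with a short calculation. This is only a sketch under stated assumptions: Figure 1(c) is taken to consist of a shared link of travel time 95 followed by two parallel links of travel times 5 and 10 (inferred from the route times 100 and 105 and the branch times 5 and 10 mentioned above), and the helper name is hypothetical.

```python
import math

def scaled_logit(costs, theta_bar):
    """Logit probabilities with the dispersion parameter theta_bar / min(costs)."""
    c_min = min(costs)
    theta = theta_bar / c_min
    weights = [math.exp(-theta * (c - c_min)) for c in costs]
    total = sum(weights)
    return [w / total for w in weights]

shared_link = 95.0      # assumed travel time of the overlapping link
branches = [5.0, 10.0]  # travel times of the two parallel links

# Decision at the origin: full route costs 100 and 105 are compared.
od_level = scaled_logit([shared_link + b for b in branches], 2.5)

# Decision at the intersection: only the branch costs 5 and 10 are compared.
node_level = scaled_logit(branches, 2.5)

print(od_level)
print(node_level)
```

The origin-level calculation gives about (53%, 47%), whereas the intersection-level calculation gives about (92%, 8%), the same as for the network without the shared link.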
The proposed model uses only one parameter for the whole network, which can be easily estimated. For example, if a survey shows that 95% of drivers choose the upper route and 5% choose the lower route in Figure 1(c), then one has

$$\frac{\exp(-\bar{\theta} \cdot 5/5)}{\exp(-\bar{\theta} \cdot 5/5) + \exp(-\bar{\theta} \cdot 10/5)} = 0.95, \tag{7}$$

which gives $\bar{\theta} = 2.94$. In fact, this parameter value indicates that when drivers face two routes with travel costs $c$ and $2c$, the probability of choosing the longer route is 5%.
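For a binary choice the estimation can be done in closed form by inverting the Logit expression. The sketch below assumes the two subroute costs are 5 and 10, as in the example above; the function name is hypothetical.

```python
import math

def calibrate_parameter(share_shorter, cost_shorter, cost_longer):
    """Solve the binary Logit share equation for the network-wide parameter.

    With theta = parameter / cost_shorter, the share of the shorter route is
    1 / (1 + exp(-parameter * (cost_longer - cost_shorter) / cost_shorter)).
    """
    odds = share_shorter / (1.0 - share_shorter)
    return math.log(odds) * cost_shorter / (cost_longer - cost_shorter)

parameter = calibrate_parameter(0.95, 5.0, 10.0)
print(round(parameter, 2))
```

Because the longer route costs exactly twice the shorter one here, the result reduces to ln(0.95/0.05) = ln 19, which rounds to the 2.94 reported in the text.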

Logit Equivalent Travel Time
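It must be noted that $\hat{c}(i \to j) = \hat{c}(i \to k) + \hat{c}(k \to j)$ generally does not hold for an arbitrary subnetwork, since there may exist another route connecting $i$ and $j$ that is composed of other links, such as $(i \to h)$ and $(h \to j)$. (iii) For $n$ parallel links with travel times $t_1, \dots, t_n$, the equivalent travel cost $\hat{c}(i \to j)$ satisfies the following equation by definition:

$$\exp(-\theta \hat{c}(i \to j)) = \sum_{m=1}^{n} \exp(-\theta t_m), \tag{9}$$

which gives

$$\hat{c}(i \to j) = -\frac{1}{\theta} \ln \sum_{m=1}^{n} \exp(-\theta t_m). \tag{10}$$

(iv) The equivalent travel cost $\hat{c}(i \to j)$ might be negative.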
As a simple example, consider a network with three parallel links, each with a travel time of 1, and a Logit parameter of $\theta = 1$. The equivalent travel cost is then

$$\hat{c} = -\ln(3 e^{-1}) = 1 - \ln 3 \approx -0.10, \tag{11}$$

which is negative.
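The equivalent cost of parallel links is a log-sum of the link travel times, which is straightforward to evaluate. The sketch below uses a hypothetical function name and takes the dispersion parameter as a plain argument.

```python
import math

def equivalent_cost_parallel(travel_times, theta):
    """Logit equivalent travel cost of parallel links (log-sum form)."""
    # Factor out the smallest travel time so the exponentials stay well scaled.
    t_min = min(travel_times)
    log_sum = math.log(sum(math.exp(-theta * (t - t_min)) for t in travel_times))
    return t_min - log_sum / theta

three_links = equivalent_cost_parallel([1.0, 1.0, 1.0], 1.0)
single_link = equivalent_cost_parallel([7.0], 1.0)
print(three_links, single_link)
```

For three parallel links with travel time 1 the result is 1 - ln 3, a negative value, while a single link returns its own travel time, in line with properties (i) and (iv).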
Considering the network in Figure 3(a), the proposed route choice model uses (6) to calculate the probabilities inside a subnetwork of ( → ), and assume  = 2.5, so the equivalent travel time between node pairs ( − ) is Comparing with 32% by multinomial Logit model, it is clear that proposed model gives a more reasonable choice than the original one.
Converting links in series or in parallel into equivalent links is easy according to properties (ii) and (iii) of equivalent links. For networks with the topological structure shown in Figure 4(a), one can transform the original network into the one in Figure 4(b) or 4(c), so that the resulting network contains only links in series or in parallel, which can be easily merged into equivalent links.
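The repeated merging of series and parallel links can be sketched as a small recursion. This is an illustrative sketch rather than the paper's algorithm: the nested-tuple network representation and the function name are assumptions, and each parallel group is scaled by the network-wide constant divided by the minimum actual travel time of that group, following the subnetwork parameter setting described earlier.

```python
import math

def merge(network, theta_bar):
    """Return (minimum travel time, Logit equivalent cost) of a series-parallel network.

    network is ("link", time), ("series", [parts]) or ("parallel", [parts]).
    """
    kind, payload = network
    if kind == "link":
        return payload, payload          # property (i): a link is its own equivalent
    parts = [merge(part, theta_bar) for part in payload]
    if kind == "series":                 # property (ii): costs add up in series
        return sum(t for t, _ in parts), sum(c for _, c in parts)
    if kind == "parallel":               # property (iii): log-sum over parallel parts
        t_min = min(t for t, _ in parts)
        theta = theta_bar / t_min
        c_ref = min(c for _, c in parts)
        log_sum = math.log(sum(math.exp(-theta * (c - c_ref)) for _, c in parts))
        return t_min, c_ref - log_sum / theta
    raise ValueError(f"unknown element type: {kind}")

# A shared link of time 95 followed by two parallel links of times 5 and 10.
example = ("series", [("link", 95.0),
                      ("parallel", [("link", 5.0), ("link", 10.0)])])
t_min, c_eq = merge(example, 2.5)
print(t_min, c_eq)
```

The equivalent cost comes out slightly below the shortest route time of 100, because the second branch adds some attractiveness to the subnetwork.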
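In fact, the original Logit route choice model is a special case of the proposed model. In the original Logit route choice model, all routes are assumed to be independent of each other, so every route can be considered a direct link between the OD pair $r - s$, and the equivalent travel cost between the OD pair $r - s$ is

$$\hat{c}(r \to s) = -\frac{1}{\theta} \ln \sum_{l \in K_{rs}} \exp(-\theta c_l^{rs}).$$

Now drivers can simply make their choice by

$$P_k^{rs} = \frac{\exp(-\theta c_k^{rs})}{\exp(-\theta \hat{c}(r \to s))},$$

which is the same as (2).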

Loading Algorithm.
A process similar to Dial's algorithm is presented to find equivalent links in a large-scale network with a complicated structure, which obviates route enumeration. The subnetworks without overlapping links are repeatedly converted into equivalent virtual links so that there are only parallel routes without overlapping links between given node pairs, and the probability of choosing subroute $k$ connecting $i$ and $j$ is given by

$$P_k^{ij} = \frac{\exp(-\bar{\theta} \hat{c}_k^{ij} / c_{\min}^{ij})}{\sum_{l \in K_{ij}} \exp(-\bar{\theta} \hat{c}_l^{ij} / c_{\min}^{ij})}, \tag{16}$$

where $\hat{c}_k^{ij} = \sum_a \hat{c}_a \cdot \delta_{a,k}$, and $\delta_{a,k} = 1$ if link $a$ is a part of route $k$ and $\delta_{a,k} = 0$ otherwise. Equation (16) is similar to (6) except that the link travel times are replaced by equivalent travel costs. Equation (16) gives the probability of choosing route $k$ inside the nest, so the probability that route $k$ is chosen in the network is

$$\Pr(k) = P_k^{ij} \cdot \Pr(i \to j), \tag{17}$$

where $\Pr(i \to j)$ denotes the probability that the equivalent link $(i \to j)$ is chosen.
For simplicity, it is assumed that the network contains no multiple arcs; that is, $(i \to j)$ represents a unique link in the network. In fact, multiple arcs can be easily converted into an equivalent link by (10). The definition of "reasonable" routes in Dial's algorithm is adopted in this paper, which means that a link $(i \to j)$ with $r(i) \ge r(j)$ is not used by drivers, where $r(i)$ denotes the minimum travel cost from origin $r$ to node $i$. The algorithm is link-based, and only the overlapping part on the shortest routes is considered when drivers make decisions. Considering the network in Figure 5(a), only a single link, rather than the two-link path that contains it, is considered the overlapping part, because the tail node of that link is the last common node on the shortest routes from the origin, and most of the flow arriving at the head node of the link comes from that link. Equivalently, the original network is converted into the one in Figure 5(b), in which a duplicated link has all the properties of the original link and the flow on the duplicated link is considered a part of the flow on the original link.
The first step is to find the overlapped part of routes connecting two nodes using the shortest route tree from origin.For easy understanding, considering the subnetwork in Figure 6, node  has 4 upstream links and one needs to find the probabilities of links that drivers choose to arrive at node .Suppose that the  1 is the last common node on the shortest route from  to  1 and the shortest route from  to  2 , and then a new equivalent link ( 1 → ) is formed and its equivalent travel time is Similarly,  ( 2 → ) and  ( 3 → ) can be calculated.The probabilities that drivers choose link ( 4 → ) and ( 2 → ) to arrive at  are given by Correspondingly, the probabilities that drivers choose link ( 3 → ), ( 2 → ), and ( 1 → ) to arrive at  are given by ⋅ Pr ( 1 → ) , Recall that Dial's algorithm calculated link flows according to the descending order of (⋅) which implies that the total flows that arrive at node  are known, and the flows of incoming links of node  can be calculated as follows: where  ( → ) is flow on link ( → ) which is given.
It is not so difficult to find the last common node  as shown in Figure 6.However it becomes complicated when network contains a large number of links and nodes, and the network structure is complex.Following the previous analysis, an algorithm is presented here to find those common nodes.The following notations are used: NS() is the number of links on the shortest route from , () is the set of upstream nodes of node , () = { | ( → ) ∈ }, () is the set "reasonable" upstream nodes of node , () = { | ( → ) ∈ },  is the set of the common nodes on the shortest route tree, () is the set of nodes whose common node on the shortest route tree is , V() is the node to record destination node of node .
The procedure is as follows.
Step 2. Find  ∈ () such that NS() = max ∈ NS() and remove  from .Let  be the preceding node on the shortest route.
Step 5. If only one element remains in the set, stop; otherwise, go to Step 2.
The size of () should be less than 4 for normal road network so that computational requirement for the process is very small.Considering the network in Figure 6, one can easily get The process for network loading is stated as follows.
Step 1. Initialization: perform a shortest route calculation for origin $r$, and generate $r(i)$ and NS$(i)$ for each node $i \in N$.
Step 2. Find the probability of route choices.For each node  in ascending order of (⋅) starting from , find the common node set  and () as described before.For each  ∈ () and () ≥ (), Pr( → ) is set to 0, which indicate the link is "unreasonable".Now for each  ∈  in the descending order of NS(), calculate the equivalent link travel cost as follows: If  ∈  is the node with the smallest of NS(), define Pr( → ) = 1.Then calculate the probability to choose link ( → ),  ∈ () for each  in the ascending order of NS() by the following equation: ⋅ Pr ( → ) . ( Step 3. Calculate link flows.For each node  in the descending of (⋅), calculate the link flows for its incoming link ( → ) as follows: where   = 0 if node  is not a destination node.
In effect, the proposed algorithm converts every subnetwork without overlapping links into a virtual equivalent link, so that finally there is only one equivalent link connecting the OD pair in the network. The proof of equivalence between the proposed model and the algorithm is straightforward. The proposed model uses the "reasonable" links proposed by Dial [2] for ease of understanding; one can adopt a broader "reasonable" route definition to include more of the links used in the real world [8]. The proposed method proceeds from an origin $r$, so the total computational burden depends on the number of origins rather than on the number of OD pairs. The computational requirement is similar to that of the classical Dial's algorithm.
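For orientation, the classical procedure on which the proposed loading algorithm builds can be sketched as follows. This is a hedged sketch of the standard single-pass Dial (STOCH) loading with a fixed dispersion parameter, not the equivalent-link variant proposed here; the function name and the dictionary-based network representation are assumptions. Like the text, it assumes no multiple arcs and treats a link $(i \to j)$ as "reasonable" only if $r(i) < r(j)$.

```python
import heapq
import math
from collections import defaultdict

def dial_loading(links, origin, demand, theta):
    """Classical Dial loading from one origin.

    links:  {(i, j): travel_time}, demand: {destination: trips}.
    Returns {(i, j): flow}.
    """
    out_links, in_links, nodes = defaultdict(list), defaultdict(list), set()
    for (i, j), t in links.items():
        out_links[i].append((j, t))
        in_links[j].append(i)
        nodes.update((i, j))

    # Minimum travel cost r(i) from the origin (Dijkstra).
    r = {n: math.inf for n in nodes}
    r[origin] = 0.0
    heap = [(0.0, origin)]
    while heap:
        d, i = heapq.heappop(heap)
        if d > r[i]:
            continue
        for j, t in out_links[i]:
            if d + t < r[j]:
                r[j] = d + t
                heapq.heappush(heap, (r[j], j))

    # Link likelihoods: zero for "unreasonable" links with r(i) >= r(j).
    likelihood = {(i, j): math.exp(theta * (r[j] - r[i] - t)) if r[i] < r[j] else 0.0
                  for (i, j), t in links.items()}

    # Forward pass in ascending order of r: link and node weights.
    order = sorted((n for n in nodes if r[n] < math.inf), key=lambda n: r[n])
    node_weight = defaultdict(float)
    node_weight[origin] = 1.0
    weight = {}
    for i in order:
        for j, _ in out_links[i]:
            weight[(i, j)] = likelihood[(i, j)] * node_weight[i]
            node_weight[j] += weight[(i, j)]

    # Backward pass in descending order of r: split arriving flow over incoming links.
    flow = {link: 0.0 for link in links}
    through = defaultdict(float)  # flow leaving each node on downstream links
    for j in reversed(order):
        arriving = demand.get(j, 0.0) + through[j]
        if j == origin or node_weight[j] == 0.0:
            continue
        for i in in_links[j]:
            flow[(i, j)] = arriving * weight.get((i, j), 0.0) / node_weight[j]
            through[i] += flow[(i, j)]
    return flow

links = {("r", "a"): 1.0, ("a", "s"): 1.0, ("r", "b"): 1.0, ("b", "s"): 2.0}
flows = dial_loading(links, "r", {"s": 100.0}, theta=1.0)
print(flows)
```

On this small example with two non-overlapping routes of costs 2 and 3, the link flows reproduce the MNL route shares, which is the property the proposed algorithm inherits when subnetworks are replaced by equivalent links.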

Numerical Examples
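The network in Figure 7(a) is widely used to discuss the overlapping problem of the Logit model; it consists of 3 nodes and 4 links and has 1 OD pair and 3 routes. The upper route is a single link with travel time 1, while the two lower routes share a common link with travel time $1 - \rho$ and then use one of two parallel links with travel time $\rho$. By applying the proposed method, one can calculate the Logit equivalent travel time of the two parallel links by (10) with $\theta = \bar{\theta}/\rho$, which gives $\hat{c} = (1 - \ln 2/\bar{\theta})\rho$; the equivalent travel time of the two lower routes is then $(1 - \rho) + \hat{c} = (1 - \rho) + (1 - \ln 2/\bar{\theta})\rho = 1 - \rho \ln 2/\bar{\theta}$. The probability of using the upper route is

$$\Pr(\text{upper}) = \frac{\exp(-\bar{\theta})}{\exp(-\bar{\theta}) + \exp(-\bar{\theta} + \rho \ln 2)} = \frac{1}{1 + 2^{\rho}}.$$

It is quite interesting that this probability is independent of the Logit parameter $\bar{\theta}$. The proposed model gives a probability for the upper route of 50% if $\rho = 0$ and 33% if $\rho = 1$, which is the same as the Probit model. The original Logit model always gives an identical probability of 0.33 for all three routes. Figure 7(b) gives the probability distributions of different models, in which the data for the Probit model were given by Daganzo and Sheffi [37].

The second example reveals that it is possible for the longest route to be the second or even the first most likely to be used, so that simple use of the $k$-shortest route method may fail to produce a reasonable route choice pattern [37]. The network is shown in Figure 8(a). By applying the proposed model, the probabilities that drivers choose the shortest and the longest routes are given in Figure 8(b) for different Logit parameters. Drivers are unable to distinguish the upper equivalent link from the lower real link, so the probability of choosing the longest route is about 50% when the Logit parameter $\bar{\theta}$ is very small; the probability converges to zero when $\bar{\theta}$ becomes infinite. In the normal range of $\bar{\theta}$, the new model produces a reasonable result. For example, the probability of choosing the longest route is 45.8% if $\bar{\theta} = 2.5$ and 34.1% if $\bar{\theta} = 10$. In contrast, the probability of choosing the longest route is less than 1/65 when the traditional Logit model is applied, since there exist 65 routes. Drivers will not use the lower link when the $k$-path method is applied and the longest route is not considered. When $\bar{\theta}$ is small, the proposed model demonstrates an interesting result that more drivers choose the longest route than the shortest one, which might be the case in the real world, as suggested by Daganzo and Sheffi [37].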
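The independence of the upper-route probability from the Logit parameter in the first example can be verified step by step. The sketch below rests on assumptions that are not spelled out in the surviving text: the upper route has travel time 1, the two lower routes share a link of travel time 1 - rho and then split into two parallel links of travel time rho, and each subnetwork is scaled by the network-wide constant divided by its minimum travel time. The names `rho`, `parameter` and the function name are hypothetical.

```python
import math

def upper_route_probability(rho, parameter):
    """Probability of the upper route in the three-route overlapping network (rho > 0)."""
    # Step 1: merge the two parallel links; their minimum travel time is rho.
    theta_sub = parameter / rho
    c_parallel = rho - math.log(2.0) / theta_sub
    # Step 2: add the shared link in series.
    c_lower = (1.0 - rho) + c_parallel
    # Step 3: binary choice at the origin; the minimum OD travel time is 1.
    theta_od = parameter / 1.0
    w_upper = math.exp(-theta_od * 1.0)
    w_lower = math.exp(-theta_od * c_lower)
    return w_upper / (w_upper + w_lower)

for rho in (0.25, 0.5, 1.0):
    for parameter in (1.0, 2.5, 10.0):
        p = upper_route_probability(rho, parameter)
        # The closed form 1 / (1 + 2**rho) does not involve the Logit parameter.
        assert abs(p - 1.0 / (1.0 + 2.0 ** rho)) < 1e-12

print(upper_route_probability(1.0, 2.5))
```

Under these assumptions the probability equals 1/(1 + 2^rho) for every parameter value, tending to 1/2 as rho approaches 0 and equal to 1/3 at rho = 1, consistent with the 50% and 33% quoted in the text.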
The network in the last example consists of 8 nodes and 11 links and has only one OD pair with a traffic demand of 100, as shown in Figure 9(a); it has been used in previous studies [16,38]. All five alternative routes between the OD pair have an identical travel time of 21, so the multinomial Logit model yields equal flows of 20 on all five routes. With a Logit parameter of 2.5, Figure 9(b) gives the Logit equivalent costs and the link choice probabilities of the proposed model, and a comparison of assignment results is shown in Figure 9(c). There seems to be no big difference between the multinomial Logit model and the proposed model, as was also shown for the link-based cross nested Logit model [16]. The reason is that all five routes overlap heavily with one another, so the effects of overlapping are partially eliminated; therefore, the multinomial Logit route choice model may be suitable for networks in which routes mutually overlap. The effects of overlapping become a serious issue in networks that contain routes independent of the other parts of the network, such as the networks in the first two examples. Another interesting finding is that the probability distribution is almost unchanged in this example when the Logit parameter changes; this is because all five routes have identical travel times and probabilities in Logit models are calculated only from cost differences. When the travel costs on three of the links change from 3 to 4, this phenomenon disappears: the larger the Logit parameter is, the more flow is assigned to the upper and lower routes.

Conclusions
The Logit equivalent link proposed in this paper makes it possible to model route choice behavior at every intersection rather than only at the origin. A subnetwork without overlapping links is considered a nest and converted into a virtual equivalent link, so that the overlapping problem of route choice can be overcome. Another advantage of the proposed model is that it uses a single network-wide parameter to define the route choice behavior of drivers, which is easy to understand and to estimate through surveys. The proposed model can also be easily combined with existing efficient algorithms such as Dial's algorithm with very small additional computational effort, which makes it suitable for large-scale network applications. Although the numerical example shows that there is no big difference between the proposed model and the multinomial Logit model in a heavily mutually overlapping network, the proposed model produces more reasonable results than the multinomial Logit model when there are routes independent of the other parts of the network.
Future studies include combining the proposed model with other algorithms to address the "reasonability" problem of Dial's algorithm. Developing methods for parameter estimation and model calibration and testing the proposed model on a real road network are other directions of interest. The proposed model can also be extended to a dynamic route choice model, which can be applied in dynamic environments such as traffic simulation.

Figure 1: Networks to demonstrate Logit parameter setting. Travel costs (probabilities computed by Dial/OD-dependent/new model).

Figure 3: A simple network to demonstrate equivalence.

Figure 4: Network transformation of mutually overlapping network.

Figure 5: Example of network transformation by the shortest route tree.
Figure 7: Network to discuss route overlapping. (a) Network widely used to discuss the overlapping problem of the Logit model, with 3 nodes, 4 links, 1 OD pair, and 3 routes; with the proposed method, the two parallel links are merged into a Logit equivalent link, and the equivalent travel time of the two overlapping routes is the travel time of their common link plus the equivalent travel time of the parallel links. (b) Comparison of probability distributions of different models.
Figure 8: Analysis of the dispersion parameter. (a) Network in which the longest route may be the most likely used. (b) Probabilities of the shortest route and the longest route.
Figure 9: One OD network with 8 nodes and 11 links. (c) Comparison of multinomial Logit model and proposed model.
To make the proposed model easier to understand, two new concepts, namely, the Logit equivalent link and the Logit equivalent travel cost, are defined. Consider a subnetwork that connects nodes $i$ and $j$. If a virtual link $(i \to j)$ is added to the original network such that the probability that drivers choose the original subnetwork is the same as the probability of choosing the virtual link $(i \to j)$ under the Logit route choice assumption, then link $(i \to j)$ is called the Logit equivalent link of the original subnetwork, and its link travel cost $\hat{c}(i \to j)$ is defined as the Logit equivalent travel cost of the original subnetwork. In other words, the original subnetwork can be replaced by link $(i \to j)$ without changing the drivers' route choices on the other parts of the network. The concept is illustrated in Figure 2, in which half of the drivers should choose the original network and the other half should choose the newly added equivalent link. The equivalent link can be considered a nest that consists of all links in the subnetwork; the route choices made inside the nest are independent of the other parts of the network. A multilevel nested-style Logit route choice model is established when one repeatedly converts nonoverlapping subnetworks into equivalent links. It must be noted that the Logit equivalent travel costs should not be confused with travel times: the Logit equivalent travel costs are only used to calculate the probabilities of route choices, although they have almost all the properties that link travel times have. By definition, the equivalent travel cost of a subnetwork has the following properties. (i) The Logit equivalent travel cost of a single link is the link travel time itself; that is, $\hat{c}(i \to j) = t(i \to j)$ for a single-link subnetwork $(i \to j)$. (ii) The Logit equivalent travel cost for a route is additive; that is, if two links $(i \to k)$ and $(k \to j)$ are in series, either original or equivalent, they can be combined into one equivalent link with Logit equivalent travel cost given by

$$\hat{c}(i \to j) = \hat{c}(i \to k) + \hat{c}(k \to j). \tag{8}$$