A Two-Phase Gradient Projection Algorithm for Solving the Combined Modal Split and Traffic Assignment Problem with Nested Logit Function

(is study provides a gradient projection (GP) algorithm to solve the combined modal split and traffic assignment (CMSTA) problem. (e nested logit (NL) model is used to consider the mode correlation under the user equilibrium (UE) route choice condition. Specifically, a two-phase GP algorithm is developed to handle the hierarchical structure of the NLmodel in the CMSTA problem. (e Seoul transportation network in Korea is adopted to demonstrate an applicability in a large-scale multimodal transportation network. (e results show that the proposed GP solution algorithm outperforms the method of the successive averages (MSA) algorithm and the classical Evan’s algorithm.

Various formulations, ranging from mathematical program (MP), nonlinear complementarity problem (NCP), variational inequality (VI), and fixed point (FP), have been provided to represent different combined models listed above. Please refer to [6,12,16,17] for a review of the different formulation approaches for modeling CDA, CMSTA, and combined travel demand models. ese combined models have been adopted to represent different emerging technological applications, such as mixed gasoline and electric vehicles with destination, route, and parking choices [16], electric vehicle charging stations using the CDA model [17], and ridesharing as a new mode choice option using the CMSTA model [18]. Most of the above studies adopted either the complete linearization method of the Frank-Wolfe algorithm [19] or the partial linearization method of Evan's algorithm [2] for solving these emerging technological applications. Computational results conducted by LeBlanc and Farhangian [20] revealed Evan's partial linearization method performed better than Frank-Wolfe's complete linearization method, while Ryu et al. [21] recently demonstrated the superiority of the gradient projection (GP) algorithm over Evan's algorithm. Note that very few studies have focused on developing solution algorithms for solving large-scale problems with multiple modes in a multimodal transportation network [24][25][26].
For solving traffic assignment problems, numerous algorithms were developed based on link-based, path-based, and bush-based algorithms. e link-based algorithms operate in the space of link flows and do not require path storage, but it requires more computational efforts comparing to other groups of algorithms. On the other hand, path-based algorithms require explicit path storage in order to directly compute path flow equilibration and they have faster convergence comparing to link-based algorithms. e reason is that the link-based algorithm exhibits a zigzagging trajectory as it approaches the optimal solution. On the other hand, the path-based algorithm does not exhibit the zigzagging effect for the same initial solution. In the transportation literature [26,27], the computational performance of link-and pathbased algorithms for the traffic assignment problem is compared in medium-to large-scale randomly generated grid networks and real-size networks. Based on the conclusion, a path-based algorithm is consistently faster in approaching the neighborhood of the optimal solution. Bush-based methods are developed to solve an acyclic network problem. ese methods have improved the existing state-of-the-art of algorithms and are fast and memory efficient. However, using the algorithms for solving large-scale networks may remain impractical [28]. e solution algorithms for the traffic assignment model can be summarized below. Interested readers may refer to [27,29] for the detailed descriptions of the algorithm procedure.
e disaggregated simplicial decomposition algorithm proposed by Larsson and Patriksson [41] is similar to the restricted simplicial decomposition algorithm. However, the extreme points are generated by path flows instead of link flows. e gradient projection algorithm adjusts the O-D demand between nonshortest paths and the shortest path. e projected gradient algorithm is based on the idea of adjusting flow between the set of path with a cost greater than the average path cost and the set of path with a cost less than the average path cost. Improved social pressure is also using the path flow equilibrium between costlier paths and cheaper paths, but the flow updating strategy is more complicated. To solve the restricted subproblem, a greedy algorithm is introduced. Recently, Du et al. [40] introduced Barzilai-Borwein step size to adjust path flows.
is includes origin-based algorithm [42,43]; modified origin-based algorithm [44]; algorithm B [45]; linear user cost equilibrium [46]; and paired alternative segments [47]. e bush-based algorithm sequentially solves a list of node-based subproblems defined on the bush rooted. e algorithm B adjusts flows between the longest paths and the shortest ones in the bush rooted path tree. A paired alternative segment (PAS) is defined as two completely disjoint path segments, and then the flows are adjusted in disjoint path segments.
In this paper, we are interested in developing solution algorithms for solving the MP formulation of the CMSTA problem that can represent large-scale multimodal transportation networks with both private and public modes and correlation among multiple public modes using the nested logit (NL) model under the network user equilibrium (UE) framework. Contrast to the simple structure of the multinomial logit (MNL) model, the NL model has a hierarchical structure that decomposes the choice probability into two levels represented by the marginal and conditional probabilities. A two-phase GP algorithm is developed to solve the CMSTA problem with a hierarchical nested modal split structure and UE traffic assignment, along with various improvement strategies to speed up the convergence of the path-based GP algorithm.
In the literature, a path-based GP algorithm has been adopted for solving various traffic assignment problems such as UE assignment problem, the nonadditive cost assignment problem, side constrained problem, stochastic user equilibrium assignment problem, transit assignment problem, system optimal assignment problem, and freight traffic assignment problem [21]. Although the algorithm was developed for solving various traffic assignment problems, the algorithm was rarely applied for solving CMSTA problem.
Our goal is to demonstrate that the improved GP algorithm can solve large-scale multiclass or multimodal traffic assignment problems in CMSTA with NL function. In addition, the proposed algorithm can apply to other logit models having a hierarchical structure such as the generalized extreme value model by [48] including the cross-nested logit (CNL), paired combinatorial logit (PCL), and generalized nested logit (GNL). To show proof of concept, a large-scale multimodal transport network in Seoul, Korea, will be used to demonstrate the applicability of the two-phase GP algorithm. is paper is organized as follows. After the introduction, a relevant background of the NL model for the modal split problem and UE conditions for the traffic assignment problem is introduced in Section 2. Section 3 provides an equivalent MP formulation for the NL-UE CMSTA problem along with the equivalence property. Section 4 describes the two-phase GP algorithm for solving the NL-UE CMSTA problem, improvement strategies, and its overall solution procedure. Section 5 provides numerical results to demonstrate the applicability of the two-phase GP algorithm for solving large-scale multimodal transportation networks. Some concluding remarks are provided in Section 6.

Mode Choice and Route Choice
In this section, we provide some background of the nested logit (NL) model for the modal split problem and the UE model for the traffic assignment problem. A list of variable is provided first for convenience, followed by the NL model and multiclass UE model.

List of Variable.
Consider a transportation network G � (N, A), where N is the set of nodes and A is the set of links on a given network. We define the list of variables in Table 1.

Nested Logit Model for Mode Choice.
e NL model is widely used to represent mode choice in the literature (see, e.g., [49][50][51]). It considers the mode similarity through a tree structure as shown in Figure 1 (2) Note that if the parameter (τ rs u ) is equal to 1.0, the NL model reduces to the MNL model.

Multiclass User Equilibrium for Route Choice.
For the road-based transportation mode such as passenger car and bus, we have the Karush-Kuhn-Tucker (KKT) conditions at equilibrium as follows: Equations (3) and (4) define a user equilibrium flow pattern in the route choice [52].

Mathematical Programming (MP) Formulation
Without loss of generality, two types of transport modes (i.e., private vehicle and public transport) are considered in the upper nest while the lower nest can accommodate more than two modes (e.g., bus, metro, and bus-metro). Note that it can be extended to include more than two types of transport mode in the upper nest. Assuming that the number of public transport modes operated on a link in the road network is given (i.e., fixed number of buses on a link), the in-vehicle time of public transport mode (i.e., bus) is affected by the number of private vehicles on a link. In addition, the road capacity on a link is reduced as the number of buses increases. ese assumptions enable to present the excess demand function analytically. Consider the following MP formulation: s.t.

Journal of Advanced Transportation
e objective function consists of three terms. Z 1 is the well-known Beckmann's [53] transformation. Z 2 −Z 5 correspond to the NL model. Equation (5) shows the flow conservation constraints. Note that gc rs m consists of travel time, waiting time, and other personal utilities (e.g., fare, transfer time, and inconvenience).
To set up the generalized travel costs (i.e., the lower nest modes), we adopt a flow-dependent travel time for the auto mode and bus mode and a flow-independent travel time for the metro mode as follows: Travel times on each link are assumed to follow the Bureau of Public Roads (BPR) function.
where t 0 a is free-flow travel time of link a and η and β are given parameters.

Proposition 1.
e MP formulation in equation (4) through equation (5) gives the mode choice solution of the NL model. e Lagrangian of the equivalent minimization problem with respect to the equality constraints can be formulated as follows: where ϕ rs the dual variable. Given that the Lagrangian has to be minimized with respect to non-negative route flows and modal splits, the following conditions have to hold: Note that equation (20) presents the UE condition. From equation (9), we have where q rs A is auto mode demand q rs A � (q rs − m∈M rs u q rs m ).
Rearranging equation (11), From equation (12), we have the marginal probability for the auto mode as follows: From q rs m (q rs T ) (((1− τ rs T )/τ rs u )) � exp((θ rs /τ rs u )(λ − gc rs m )) in equation (11), we have the conditional probability as follows: ese probabilities are consistent with the NL model illustrated in the previous section for a case with two upper nests. is completes the proof.

Proposition 2. e MP formulation in equation (4) through equation (5) yields a unique solution.
For the proof of Proposition 2, we need to prove that objective function in equation (5) is strictly convex with respect to mode-route flow variables and the convexity of the feasible region. For the objective function, taking the Hessian matrix in equation (4) with respect to the mode-route flow variables gives which implies the positive-definite matrix when the link travel time (t a (x a )) is a strictly monotonic increasing function.
which also implies the positive-definite matrix. erefore, the MP formulation model has a unique solution. is completes the proof.
From equation (9), the excess demand function of the lower nest modes in the NL modal split function can be defined as follows (see Appendix for the detailed derivation): Hence, when these two costs are equilibrated (i.e., c rs k (f rs k ) � w rs m (q rs m )), the flows between the private and public transport modes achieve equilibrium condition. Figure 2 provides a graphical illustration of the equilibrated mode choice probabilities in the two-level tree structure of the NL model. ree modes are considered including auto, bus, and metro with their cost functions. e bus and metro share the same upper nest while the auto is an independent mode. Since the auto and bus use the road space, these two modes have a flow-dependent travel time. Meanwhile, the metro is assumed to have a flow-independent travel time according a dedicated right of way as presented in Figure 2(a). In Figure 2(b), the upper nest modal splits are determined, and then the estimated upper nest mode demands are used to determine the modal splits for the lower nest modes in Figure 2(c). For more detail, Figure 2(d) shows the change in travel cost and excess demand cost for the auto and two public transport modes. We can observe that the cost of auto increases when the transit mode flows decrease. However, the excess demand cost of the bus is not affected by the decreasing bus mode flow. e reason is that the excess demand cost of the bus consists of three terms (see equation (9)). As the transit mode flows decrease, the first terms and the second terms are decreased. Meanwhile, the third term is increased because the auto mode cost function is increased. is is according to the incorporation of the bus cost (i.e., gc B ). On the other hand, the excess demand cost of the metro is decreased with decreasing the metro flow. is is because gc M is not affected by auto mode cost. In Figure 2

Path-Based Gradient Projection Method
e gradient projection (GP) algorithm is a well-known path-based algorithm for solving various traffic assignment problems. Before considering the gradient projection (GP) algorithm for solving the CMSTA model (or NL-UE model), we provide a brief review on the ordinary GP algorithm for solving the UE traffic assignment problem with a fixed demand. Equation (18) shows the flow update equations [35] for solving a fixed demand traffic assignment problem. In each iteration, the algorithm finds a descent search direction by solving a shortest path problem and updates the solution by scaling it with the second derivative information.
where n is the iteration number; α is the step size; max{f, 0} denotes the projection of the argument onto the non-negative orthant of the decision variables (i.e., nonshortest route flows); and f rs k (n + 1) and f rs k (n + 1)are the flows on route k and shortest route k rs O-D pair rs at iteration n + 1.
Equation (19) shows the route travel time difference (c rs k (n) − c rs k rs (n)) and the scaling factor (s rs k (n)) to find the search direction.
where t a (x a ) and t a ′ (x a ) are the travel time and its derivative of the auto mode on link a. For a detailed description of the GP algorithm, readers should refer to [26,35]. e modifications for solving the Journal of Advanced Transportation 7 CMSTA model with NL modal split function and the improvement strategies for solving large-scale multimodal transportation networks are described as follows.

Two-Phase Equilibration.
Unlike the UE traffic assignment problem, the CMSTA problem considered in this paper (i.e., UE for route choice and NL for mode choice) requires not only equilibrating route flows of the auto mode for route choice but also equilibrating modal splits among multiple available modes (i.e., both private and public transport modes) following the NL model for mode choice.
Based on the hierarchical choice structure given in Figure 1, a two-phase equilibration method is developed to perform the equilibration for the two travel choices. In the first phase, GP equilibrates between the private auto mode and public transit mode using the minimum route travel time of the auto mode and the excess demand cost of the transit mode.
In the second phase, the lower nested probability function is applied to determine the mode demands among the transit modes. From the lower nested probability function, the excess demand cost function for the transit mode in the upper nest (see Appendix for the detailed derivation) can be derived as follows:     (20) with the minimum route travel time of the auto mode (c rs k rs ) and the excess demand cost for transit mode in equation (20). Figure 3 provides a graphical illustration of the first phase equilibration in the two-phase GP algorithm, which involves the modal splits of the upper nest (i.e., private car and public transit). If w rs T (q rs T ) < c rs k rs (see Figure 3(a)), route flows from the auto mode should be decreased accordingly to equilibrate the modal costs of both auto and public modes. On the other hand, if w rs T (q rs T ) > c rs k rs (see Figure 3(b)), route flows from the auto mode should be increased accordingly to balance the two auto and public modal costs.
Once the upper nest is equilibrated in the first phase (i.e., the auto demand ( k∈K rs f rs k ) and the public transit demand (q rs T ) are determined), the second phase is performed to split the public transit demand (q rs T ) into multiple transit modes (q rs m ) in the lower nest to achieve equilibrium in the mode choice. For the lower nest, the modal split probability function can be directly calculated as follows:

Column Generation Scheme. e disaggregate simplicial decomposition (DSD) algorithm proposed by Larsson and
Patriksson [41] is one of the earlier path-based traffic assignment algorithms. e algorithm consists of a linearized subproblem and a master problem to determine the next solution point. From the subproblem, extreme points (i.e., route generation) are resulted and then the master problem is solved with extreme points generated in the subproblem, which is equilibration. In the algorithm, the master problem is iterated until the route flows stabilized before generating new paths in the column generation procedure. Korpelevich [54] introduced a double projection method to relax the strongly monotone condition in the mapping in the variational inequality problem. From the predictor step, the strongly monotone condition is relaxed and the solution is adjusted to obtain a more accurate solution in the corrector step. Galligari and Sciandrone [55] recently proposed a fast path equilibration algorithm with an adaptive column generation scheme for each O-D pair. Lu et al. [56] used column generation scheme under the dynamic user equilibrium traffic assignment problem. Wang et al. [57] introduced the tolerance-based strategies for the column generation scheme in the dynamic user equilibrium traffic assignment problem. Ameli et al. [22,58] recently proposed trip-based dynamic traffic assignment and simulation-based dynamic traffic assignment. ese two algorithms adopt the column generation scheme for path generation. In addition, they assessed the computational performance of the inner loop methods in the first paper and proposed algorithm for solving large-scale dynamic traffic assignment in the second paper.
In this study, we apply a similar adaptive column generation scheme in the gradient projection algorithm. To improve the computational efficacy in the few iterations, we adopt the following condition in the inner loop (master problem) as the termination step: where 0 < c ≤ 1.0 is the given parameter. First, the convergence error in the outer loop is computed after the column generation step. en, the inner loop (equilibration) step is performed without the column generation step until it satisfies the termination condition in equation (22). e overall flowchart is shown in Section 5.
Another implementation improvement is the column dropping scheme. If the path set K rs (n) for each O-D pair rs contains relatively few paths, the path-based algorithms will perform satisfactorily without column dropping. However, the number of routes in K rs (n) can be very large for some situations (e.g., problems with mostly simple paths connecting each O-D pair, heavily congested networks, or multimodal transportation networks). Under these situations, it may be necessary to remove the inactive (zero flow or near-zero flow) paths from the route set to reduce the amount of calculation and bookkeeping needed in each iteration. Chen and Jayakrishnan [59] have shown that GP, which automatically achieves column dropping during the projection step, reduces computational time if the number of active (positive flow) paths is considerably less than the number of generated paths. In this study, we adopt the column dropping step suggested by Chen and Jayakrishnan [59]. Figure 4 shows the overall flowchart of the solution procedure. After the initialization step, column (or route) generation is performed. With a new path set, flow updated is active, and then restrict equilibration is performed without new path generation. If convergence error computed in restrict equilibration satisfies the inner loop convergence criteria, they go to column dropping step. e process is iterated until convergence error satisfies the outer loop convergence criteria. Figure 5 shows the detailed flow update procedure following the two-phase gradient projection algorithm.

Solution Procedure.
As described above, modified GP extends the equilibration of private vehicles (i.e., traffic assignment) to also consider equilibration of multiple transport modes (i.e., modal splits) by using a two-phase procedure which is the equilibration between public and private modes as the first phase and equilibration of modal splits among the public transport modes as the second phase.
In the column generation step, we adopt origin-based column generation to improve computation efficiency. See [60] for more detailed results. Based on the literatures (see, e.g., [26,35,61,62]), a unit stepsize (i.e., α � 1.0) is adopted for all iteration n since the second derivative information for an automatic scaling and the one O-D at-a-time flow update strategy is used in the equilibration procedure. is scheme has been found to be helpful in reducing computational efforts. In this paper, we also used unit step size (i.e., α � 1.0) with "one-at-a-time" flow update strategy. However, a line search step (e.g., self-adaptive strategies) can be used to determine a suitable stepsize to help better convergence, especially when highly accurate solutions are sought.

Numerical Experiments
e transportation network in the city of Seoul, Korea, is used to investigate the performance of the proposed twostage GP algorithm in solving the NL-UE model. e convergence characteristics are tested, and then parameter sensitivity analysis is examined.

Description of Seoul Network and Experimental Set Up.
e Seoul transportation network shown in Figure 6 consists of 425 zones, 7,512 nodes, 11,154 links, and 107,434 O-D pairs with a total of 2,874,441 trips. Table 2  Compute convergence error (ε inner ) ε outer < E ε outer < ε outer · γ q A n = 0.5 · q -; q T n = 0.5 · q - w (q T rs (n)) w (q T rs (n + 1)) c k rs (f k rs (n)) c k rs (f k rs (n + 1)) c k rs (f k rs (n+1)) f k rs (n) w (q T rs (n + 1)) q T rs (n + 1) q T rs (n) f k -rs (n + 1) q T rs (n) -Σ kєKrs f k rs (n + 1) (α (n)/s k rs (n)) (c k rs (n) -w (q T rs (n)))  loss of generality, the metro mode is assumed to provide enough operation capacity such that there is no congestion in this mode. One hundred maximum number of inner loop is set, and c value is set 0.1. e tolerance error of the relative gap (RG) shown in equation (23) is set 1E-8: RG � k∈K rs f rs k (n) · c rs k (n) − min c rs k rs (n + 1), w rs T (n + 1) + q rs T (n) w rs T (n) − min c rs k rs (n + 1), w rs T (n + 1) f rs k (n) · c rs k (n) + w rs T (n) · q rs T (n) . (23) e improved GP algorithm is coded in Intel Visual FORTRAN XE and runs on a 3.60 GHz processor and 16.00 GB of RAM. Figure 8 shows the convergence characteristics using the method of successive averages (MSA), Evan's algorithm, ordinary GP algorithm, and improved GP (IGP) algorithm. e method of successive averages (MSA) is the most widely used method in the predetermined line search scheme [52]. It predetermines a diminishing step size sequence such as {1, 1/2, 1/3, . . ., 1/n}. On the other hand, Evan's algorithm [2] is a classical algorithm for solving the combined distribution and assignment (CDA) problem as well as the combined modal split and traffic (c)  exp ((θ rs /τ u rs )(-gc m rs )) τu rs + exp ((θ rs /τ u rs )(-π rs )) τu rs Σ assignment problem (see, e.g., [20]). e results reveal the slow convergence of the MSA and Evan's algorithm. e GP algorithm presents a slower convergence than the IGP algorithm. With the GP algorithm, the solution cannot reach the highly accurate level of a RG of 1E-8, while the solution with IGP is reached to RG of 1E-8 in 3,100 seconds. Note that the tolerance error of 1E-8 is much stricter than the typical one (i.e., 1E-4) required in practice [64].

Convergence Characteristics.
In the following analyses, we conduct the sensitivity analysis with respect to the inner loop parameters c. e parameter adjusts the stability of path flows before generating new paths in the column generation step. Figure 9(a) shows the convergence with different values (0.1, 0.5, 0.8), and Figure 9(b) provides active number of inner iterations. Note that the maximum number of inner iteration is 100. As c value increases, the computational effort is significantly increased. When a higher c value is used, the required number of inner iteration is relatively smaller, but it requires more outer iterations. On the other hand, using a smallc value requires more inner iterations with the stabilized route flows, but it requires less the number of outer iteration. In Figure 9(b), we can observe that a smaller c requires more inner iterations. e algorithm is converged in the smaller outer iteration. After a few outer iterations, the inner iteration reaches the maximum inner loop. On the other hand, using higher c value is converged well in a few inner iterations, but it requires more outer iteration.    14 Journal of Advanced Transportation

Application of the CMSTA Problem.
is section examines the sensitivity of the IGP algorithm with respect to the nested logit parameters (i.e., θ rs and τ rs T ). Figures 10 and 11 present the effects of these two parameters on modal splits and average modal travel costs, respectively. Figure 10 shows that θ rs is sensitive to the modal splits between private and public modes (i.e., a 3% increase in the private car mode when θ rs increased from 0.1 to 0.133), while τ rs T is fairly insensitive to the modal splits. When the transit demand transfers to auto mode, the bus mode is affected more than metro and busmetro mode. is is because the travel time of the bus and auto is dependent. e result also shows that the modal splits are different albeit the ratio (θ rs /τ rs T �2.0) is the same. Figure 11 shows the link flow differences between case 1 (i.e., θ rs � 0.1 and τ rs T � 0.05) and case 2 (i.e., θ rs � 0.133 and τ rs T � 0.067). e green color shows link flows using case 1 are larger than those of case 2, while the red color shows the reverse.
Specifically, the figure shows the mean absolute error (MAE) and the root-mean-square error (RMSE) for assessing the link flow difference and average travel time (ATT) (i.e., total travel time/mode demand). Recall that the modal splits for case 1 are 72.19% and 27.81% for auto and transit mode, respectively, and the modal splits for case 2 are 75.30% and 24.70% for auto and transit mode, respectively. Because the modal splits are different using different parameters, the link flow pattern is also a different pattern as shown in the GIS map. e highest RMSE value is 164.16 in the metro mode, and the highest MAE is 99.51 in the bus mode. Albeit the link flow patterns are significantly different, the ATT values are similar values. is is because of the scaling effect in the large network. Although the ATT difference is less than 1.0 minutes between two cases, the total trip difference is 2,874,441 trips.

Concluding Remarks
In this paper, we presented (1) the CMSTA model with the nested logit (NL) model for the mode choice and the user equilibrium (UE) model for the route choice and (2) a modified-and-improved gradient projection (IGP) algorithm to solve the proposed NL-UE model. e proposed NL-UE model considered the mode similarity through the NL model under the congested network obtained by the UE condition. Specifically, an equivalent MP formulation for the NL-UE model was provided using a modified access demand to incorporate the two-level nested tree structure. On the other hand, the IGP algorithm was developed in such a way to perform a two-level equilibration. In the first phase, the upper nest of the NL model is solved. en, the lower nest of the NL model is applied with updated flows from the upper nest in the second phase. For the solution algorithm, the improved GP algorithm with inner loop equilibration procedure was introduced and shown the effectiveness in solving the combined NL modal split and UE traffic assignment (NL-UE) problem.
is is because the excess demand cost function consists of two logarithm terms. ese terms are sensitive to a small traffic flow change. A more stabilized path flow is required before generating new paths.
A real-size transportation network in Seoul, Korea, was used to demonstrate the applicability of the IGP algorithm for solving the proposed NL-UE model. e algorithm can conduct 1E-8 convergence criteria. e convergence time of the IGP algorithm seemed to be better than that of the ordinary GP algorithm. Further, the NL parameters have a significant impact on the convergence time and the mode share results.
For future research, the improved GP algorithm should be extended to consider mode (or vehicle) interactions with asymmetric cost functions. In addition, the route choice model for transit modes should be considered with transit information systems (e.g., arrival information). It would be interesting to see how the improved GP algorithm performs when more realistic features are incorporated into the CMSTA problem. In addition, a time-dependent model can be considered with travel activity patterns.