Optimizing Bus Frequencies under Uncertain Demand: Case Study of the Transit Network in a Developing City

Various factors can make predicting bus passenger demand uncertain. In this study, a bilevel programming model for optimizing bus frequencies under uncertain bus passenger demand is formulated. The upper-level objective consists of two terms. The first is the transit network cost, consisting of the passengers' expected travel time and the operating costs; the second is the transit network robustness performance, indicated by the variance in passenger travel time. The second term reflects the risk aversion of the decision maker and helps ensure that most of the uncertain demand can be met by bus operations at the optimal frequencies. With the transit link's proportional flow eigenvalues (mean and covariance) obtained from the lower-level model, the upper-level objective is formulated analytically. In the lower-level model, these two eigenvalues are calculated by analyzing how the mean transit trips and their variation propagate through the optimal strategy transit assignment process. The genetic algorithm (GA) used to solve the model is tested on an example network. Finally, the model is applied to determine optimal bus frequencies in the city of Liupanshui, China. The total cost of the transit system in Liupanshui can be reduced by about 6% via this method.


Introduction
Determining optimal transit frequencies is important whenever public transport issues, such as network planning or operation scheduling, are being decided. Frequency determination models usually minimize an objective that combines passengers' costs and operating costs. The basic data on transit network passenger flow are acquired either by passenger counting or from a model of passenger route choice behavior [1][2][3][4][5]. In these previous studies, the collected data do not take uncertain demand into account. However, passenger demand is in fact uncertain because of many factors, such as socioeconomic characteristics, population development, land use, changing travel patterns, and emergency traffic incidents. The complexity of the prediction process adds further uncertainty. Therefore, the results of a transit frequency determination model will be more useful and robust if uncertain passenger demand is considered. Most road network studies [6,7] assume that uncertain travel demand can be described by a probability distribution with known mean and variance. Accordingly, the primary objective of this study is to optimize bus frequencies given the eigenvalues (mean and variance) of uncertain demand. More specifically, the main focus of this paper is the analytical derivation of the objective used to determine bus frequencies.
Studies on transit network design that incorporate uncertain demand tend to focus on transit operation performance indices based on likelihood measures. For instance, with the demand and running time of transit assumed to be stochastic, systemwide travel time reliability, schedule reliability, and direct boarding waiting-time reliability have been defined from the perspectives of the community, the transit administration, and the operator and passengers, respectively [8]. However, few studies have addressed other transit problems related to uncertain demand. For road networks, in contrast, many network design studies have investigated uncertain demand.
For example, a genetic algorithm (GA) combined with a simulation technique has been used to solve discrete network design problems for urban road transportation by considering travel demand variation and link capacity degradation [9]. Sensitivity-based, scenario-based, and min-max models have been proposed to develop robust optimal improvement schemes that desensitize the performance of continuous urban road network systems to uncertain demands or allow the system to perform better under a worst-case demand scenario [10]. For highways built and operated under the BOT scheme, the optimal road toll under demand uncertainty can be determined by considering tradeoffs among operators, road users, and the government [11]. In these network design problems of capacity expansion or congestion pricing, sampling-based simulation is adopted to deal with uncertain demand. However, this method requires long running times and is difficult to apply to large networks. Therefore, several analytical methods have been proposed to improve solution efficiency. For instance, the point approximation method has been adopted to determine the near-optimal road toll under uncertain demand [12], although it has only been shown to provide a good solution in a small test network. An analytical method for road network design proposed in [13] has the advantages of efficient computation and applicability to large networks. Provided that the demand follows a Poisson distribution, the total travel time objective can be derived with that method. Following this idea of deriving the objective, an analytical method is proposed in the present study to determine the optimal bus frequencies. With the method of this paper, the uncertainty of demand is propagated to the link flows according to the transit assignment probabilities, and the total cost of the transit system can then be obtained efficiently from the uncertain flows.
In the remainder of this study, a bilevel model of transit frequency determination is formulated first. In the lower-level model, the transit link's proportional flow eigenvalues are acquired from the optimal strategy transit assignment model. In the upper-level model, the transit network cost is combined with the transit network robustness performance (indicated by the variance in passenger travel time) to form the upper-level objective (the total cost of the transit system), while the overall size of the bus fleet and the bus number requirement of each line make up the constraints. The transit frequency optimization problem is then solved with a GA. Finally, the model is tested on a small transit route network and on the Liupanshui transit route network in China.

Model Development and Analysis
A change in transit frequency affects not only line capacity but also waiting time at stops, and variation in waiting time affects the passengers' route choice strategy. Therefore, in previous frequency determination studies based on bi-level models, the lower-level model fully considers the effect of frequency on passengers' route choice strategy. However, the selected upper-level objectives (such as the passengers' total travel cost [2] or the network cost composed of the passengers' total travel cost and the transit operating cost [5]) do not accurately reflect the transit network's robustness performance because uncertain demand is not considered. In this study, the lower-level model also considers the effect of frequency, but the upper-level objective includes transit robustness performance in addition to transit network cost. We take the variance of the passengers' travel time as the transit robustness performance; it reflects the risk aversion of the decision maker and helps ensure that a wide range of demands can be met by the optimal transit frequency. This upper-level objective therefore helps the operating company determine the optimal bus frequencies under uncertain demand.
In the bi-level model, the bus frequencies are the output variables of the upper-level model and the input variables of the lower-level model, while the transit link's proportional flow eigenvalues are the output variables of the lower-level model and the input variables of the upper-level model.

Lower-Level Model.
In this part, the transit link's proportional flow eigenvalues are determined in two steps. First, the probability of choosing each link for each OD pair is acquired from the optimal strategy transit assignment model. Second, the mean of the transit link's proportional flow (the mean flow on a link from one specific OD pair) and its covariance (the flow covariance between two links from one specific OD pair) are calculated analytically. The analytical eigenvalues are then validated by comparison with simulation in a small transit network case.

Lower-Level Model Solution Process.
The optimal strategy model [14], which applies to uncongested transit networks, is the basic part of the lower-level model. Its solution algorithm is similar to the labeling method for the shortest path problem, which makes the solution process highly efficient. The model must be based on an extended transit network topology rather than simply on transit lines and bus stops. In the network extension process, each transit line segment between two neighboring stops is divided into waiting, in-vehicle, and alight links. On the one hand, this prevents more than one link from coexisting between two neighboring nodes, which is a requirement of the algorithm implementation. On the other hand, it helps in constructing the variables and equations of the optimal strategy model. To depict the running/waiting time and frequency on the extended links, we briefly introduce the extension process next.
A transit line $l$ composed of two stops can be changed to an extended network with four nodes, as shown in Figure 1. In the extended network, the dotted link stands for the waiting link, the solid link stands for the in-vehicle link, and the dashed link stands for the alight link. The variables from left to right in the parentheses represent the bus running time and the frequency on the link. The notation is as follows:
$v_a^r$: the assigned flow on link $a$ in the optimal strategy transit assignment model for reaching node $r$;
$f_{a^-}^r$: the total frequency starting from the upstream node of link $a$ in the optimal strategy for reaching node $r$;
$g_i^r$: the transit travel demand from node $i$ to destination $r$; to simplify the formulae, the demand of node $r$ itself is defined as $g_r^r = -\sum_{i \neq r} g_i^r$;
$\omega_i^r$: the total waiting time of the passengers at node $i$ whose destination is node $r$. It equals the total flow entering node $i$ divided by the total frequency of the transit links in $A_i^+$ that belong to the candidate links for reaching destination $r$.
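The extension step described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the `Link` record, the `stop@line` node naming, and the zero travel time assumed on waiting and alight links are all hypothetical choices made for the example.

```python
import math
from typing import NamedTuple


class Link(NamedTuple):
    tail: str
    head: str
    time: float  # travel time on the link (min)
    freq: float  # service frequency (veh/min); math.inf means no waiting
    kind: str    # "wait", "in_vehicle", or "alight"


def extend_line(line: str, stops: list, run_times: list, freq: float) -> list:
    """Expand one transit line into waiting, in-vehicle, and alight links."""
    assert len(run_times) == len(stops) - 1
    links = []
    for k, stop in enumerate(stops):
        line_node = f"{stop}@{line}"  # assumed name of the line-specific node
        if k < len(stops) - 1:
            # Boarding is possible everywhere except at the last stop.
            links.append(Link(stop, line_node, 0.0, freq, "wait"))
            next_node = f"{stops[k + 1]}@{line}"
            links.append(Link(line_node, next_node, run_times[k], math.inf, "in_vehicle"))
        if k > 0:
            # Alighting is possible everywhere except at the first stop.
            links.append(Link(line_node, stop, 0.0, math.inf, "alight"))
    return links


# A line with two stops becomes an extended network with four nodes.
links = extend_line("L1", ["A", "B"], [10.0], 0.1)
nodes = {n for link in links for n in (link.tail, link.head)}
```

Because every line gets its own pair of line-specific nodes, two lines serving the same pair of stops never produce parallel links between the same two nodes.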

Optimal Strategy Transit Assignment Model
The optimal strategy transit assignment model for reaching a given destination node, with the link flows $v_a$ as decision variables, is shown in formulae (1)-(4). It consists of the following parts.
Objective function (1): minimal expected travel cost of all the passengers.
Constraint (2): transit link flow conservation.
Constraint (3): relation between the flow leaving node $i$ and the flows on the transit links of $A_i^+$.
Constraint (4): nonnegative transit link flow.
The flow assigned to each link from each OD pair can be obtained with the optimal strategy algorithm of Spiess et al. [14].
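For orientation, the linear program of the optimal strategy model, as it is commonly stated for a single destination, is reproduced below. The symbols are assumptions made for this sketch ($c_a$ for link travel time, $f_a$ for link frequency, $\omega_i$ for total waiting time at node $i$, $g_i$ for the demand at node $i$, and $A_i^+$, $A_i^-$ for the links leaving and entering node $i$); the destination index is dropped for readability, and the form used in this study may differ in detail.

```latex
\begin{align}
\min \quad & \sum_{a \in A} c_a v_a + \sum_{i \in I} \omega_i \\
\text{s.t.} \quad & \sum_{a \in A_i^+} v_a - \sum_{a \in A_i^-} v_a = g_i, && i \in I, \\
& v_a \le f_a\, \omega_i, && a \in A_i^+,\ i \in I, \\
& v_a \ge 0, && a \in A.
\end{align}
```

At the optimum the frequency constraint is tight on the links that belong to the strategy, so the flow leaving a node is split among those links in proportion to their frequencies.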
The probability of travelling through link $a$ for OD pair $w$ is denoted by $p_a^w$. Once the bus frequencies are determined, $p_a^w$ is a fixed value in the optimal strategy model, unrelated to the passenger demand. Owing to this property, with $v_a^w$ the flow assigned to link $a$ from OD pair $w$ and $q_w$ the demand of that OD pair, the choosing probability can be expressed as follows:
$$p_a^w = \frac{v_a^w}{q_w}, \quad (5)$$
where $p_a^w$ can be acquired easily: we only need to set $q_w = 1$, and the calculated $v_a^w$ is then the value of $p_a^w$. As in road network studies, the mean $\mu_w$ and the variance $\sigma_w^2$ of the uncertain transit travel demand $q_w$ are obtained by survey in advance. Then, according to the properties of mean and covariance, the mean $\bar{v}_a^w$ of $v_a^w$ and the covariance of $v_a^w$ and $v_b^w$ can be expressed in the following two equations:
$$\bar{v}_a^w = p_a^w \mu_w, \quad (6)$$
$$\operatorname{Cov}(v_a^w, v_b^w) = p_a^w p_b^w \sigma_w^2. \quad (7)$$
When $a = b$, $\operatorname{Cov}(v_a^w, v_b^w)$ is equal to the variance of $v_a^w$, and formula (7) becomes $\operatorname{Var}(v_a^w) = (p_a^w)^2 \sigma_w^2$.
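The unit-demand idea can be illustrated with a compact Python sketch of a label-setting optimal strategy assignment followed by the mean and covariance calculation. It is a simplified reading of the algorithm in [14], not the code used in this study: the function name, the tuple layout of `links`, the convention that `math.inf` marks a link without waiting, and the tiny three-line example are all assumptions.

```python
import math
from collections import defaultdict


def optimal_strategy_flows(links, dest, demand):
    """Assign demand (dict: origin node -> trips to dest) to the links.

    links: list of (tail, head, time, freq); freq = math.inf means no waiting.
    Returns the flow on each link, in the same order as links.
    """
    u = defaultdict(lambda: math.inf)  # expected time from each node to dest
    u[dest] = 0.0
    f = defaultdict(float)             # combined frequency of attractive links
    attractive = []                    # link indices in order of inclusion
    remaining = set(range(len(links)))
    while remaining:
        a = min(remaining, key=lambda k: u[links[k][1]] + links[k][2])
        remaining.remove(a)
        i, j, c, fa = links[a]
        cost = u[j] + c
        if math.isinf(cost):
            break                      # remaining links cannot reach dest
        if u[i] >= cost and not math.isinf(f[i]):
            if math.isinf(fa):
                u[i], f[i] = cost, math.inf
            else:
                # Waiting time is taken as 1 / (combined frequency).
                wait_term = 1.0 if f[i] == 0.0 else f[i] * u[i]
                u[i] = (wait_term + fa * cost) / (f[i] + fa)
                f[i] += fa
            attractive.append(a)
    # Load flows from the farthest links toward the destination.
    volume = defaultdict(float, demand)
    flows = [0.0] * len(links)
    for a in reversed(attractive):
        i, j, c, fa = links[a]
        if math.isinf(f[i]):
            share = 1.0 if math.isinf(fa) else 0.0
        else:
            share = fa / f[i]
        flows[a] = share * volume[i]
        volume[j] += flows[a]
    return flows


# Hypothetical example: three lines from A to B (time in min, freq in veh/min).
links = [("A", "B", 10.0, 0.1), ("A", "B", 15.0, 0.1), ("A", "B", 40.0, 0.1)]
p = optimal_strategy_flows(links, "B", {"A": 1.0})  # unit demand gives p directly
mu, var = 100.0, 50.0                               # demand mean and variance
mean_flow = [pa * mu for pa in p]                   # formula (6)
cov = [[pa * pb * var for pb in p] for pa in p]     # formula (7)
```

In this example the slowest line is not attractive, so its probability is zero and the two faster lines share the demand in proportion to their frequencies.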

Validation of Transit Link's Proportional Flow Eigenvalues.
The mean proportional flow $\bar{v}_a^w$ on link $a$ from OD pair $w$ and the covariance $\operatorname{Cov}(v_a^w, v_b^w)$ between the flows on links $a$ and $b$ from that OD pair are the input variables of the upper-level model, so their validity needs to be tested. This subsection first expresses the transit link flow eigenvalues as a function of the transit link's proportional flow eigenvalues. Their validity is then tested by comparing the analytical transit link flow eigenvalues with the simulated ones.
For simplicity, all OD pairs are assumed to be independent of each other, and the passengers on each link come from many OD pairs. Then, according to the central limit theorem, the transit link flow approximately follows a normal distribution, regardless of which distribution the demands follow. Consequently, with $p_a^w$ the probability that a trip of OD pair $w$ uses link $a$, and $\mu_w$ and $\sigma_w^2$ the mean and variance of the demand of OD pair $w$, the mean and variance of the flow $v_a$ on transit link $a$ can be expressed as follows:
$$E(v_a) = \sum_w p_a^w \mu_w, \quad (8)$$
$$\operatorname{Var}(v_a) = \sum_w (p_a^w)^2 \sigma_w^2. \quad (9)$$
The validity of formulae (6) and (7) can be tested by computing the closeness between the analytical results of (8) and (9) and the corresponding simulated results. A small transit network of six stops, borrowed from Yu et al. [2] and shown in Figure 2, is used for this validation. The bus running times on the transit links, the directions of the transit lines along the stops, and the bus frequencies are all shown in Figure 2. The independent OD demands follow normal distributions, with the means shown in Table 1 and a variance of 50. Transit link flows are first obtained by Monte Carlo sampling: 1000 OD groups are generated at random, the flows on the transit links are assigned for each OD group, and a normal distribution is finally fitted to all the flows of each transit link. The variance of these flows, shown in Table 2, is determined from the fitted normal distribution. These data can be used to verify the correctness of the analytical results. Table 2 also lists the analytical transit link passenger flow, the deviation degree (defined as |simulated variance − analytical variance|/simulated variance), and the number of OD pairs passing through each transit link. The transit link notation in the first row of the table identifies the in-vehicle link between two stops on a given line.
Because the mean flows obtained by the two methods are almost equal, only the variances are compared here. The analytical and simulated variances are close, so the results of the analytical method are validated. In addition, the more OD pairs pass through a given transit link, the smaller the degree of deviation for that link. This indicates that as more OD pairs use a link, the analytical variance approaches the true one, which is consistent with the central limit theorem. In urban transit networks, each transit link serves many OD pairs, so the analytical method is well suited to this application.
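The validation procedure can be mimicked with a short Python sketch that compares the analytical link variance with a Monte Carlo estimate. The link probabilities and demand means below are hypothetical values chosen for illustration (they are not taken from Table 1 or Table 2); only the variance of 50 and the sample size of 1000 follow the text. Negative demand draws are not truncated, which is harmless here because the means are many standard deviations above zero.

```python
import random
import statistics


def analytical_link_stats(p, mu, var):
    """Mean and variance of a link flow fed by independent OD pairs."""
    mean = sum(pw * m for pw, m in zip(p, mu))
    variance = sum(pw ** 2 * v for pw, v in zip(p, var))
    return mean, variance


def simulated_link_stats(p, mu, var, n_samples=1000, seed=0):
    """Monte Carlo estimate of the same two quantities."""
    rng = random.Random(seed)
    flows = []
    for _ in range(n_samples):
        demand = [rng.gauss(m, v ** 0.5) for m, v in zip(mu, var)]
        flows.append(sum(pw * q for pw, q in zip(p, demand)))
    return statistics.fmean(flows), statistics.variance(flows)


# Hypothetical link used by three OD pairs.
p = [0.5, 1.0, 0.3]        # probability that each OD pair uses the link
mu = [100.0, 60.0, 80.0]   # demand means
var = [50.0, 50.0, 50.0]   # demand variances
ana_mean, ana_var = analytical_link_stats(p, mu, var)
sim_mean, sim_var = simulated_link_stats(p, mu, var, n_samples=1000)
deviation = abs(sim_var - ana_var) / sim_var  # deviation degree as defined for Table 2
```

With 1000 samples the sampling error of a variance estimate is typically a few percent, so deviations of that order do not by themselves indicate a flaw in the analytical formula.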

Deduction of Upper-Level Objective.
The SP surveys presented in [15,16] indicate that travel time and its reliability are the two most important factors influencing passengers' travel behavior. Traffic engineers generally use travel time variance as an indicator of travel time reliability, which can reflect both the transit network robustness performance and the risk aversion of the decision maker. Incorporating this term in the upper-level model also reflects the passengers' benefits. Therefore, the upper-level objective in this study uses the mean travel time of all passengers and its corresponding variance to represent the passengers' benefits. To account for the benefit of the transit operating company, the objective also includes the transit operating costs. The upper-level objective is given by (10): its first sub-term represents the expected travel time of the passengers, its second sub-term represents the total travel time variance, and its third sub-term represents the transit operating costs. In (10), $E(\cdot)$ denotes the expectation and $\operatorname{Var}(\cdot)$ denotes the variance; the remaining parameters are the cost per unit increase in frequency on each transit line, a coefficient expressing the importance of uncertain demand, and a coefficient expressing the importance of transit operating costs.
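As a rough illustration of how the three sub-terms combine, the following Python sketch evaluates a weighted sum of expected travel time, travel time variance, and operating cost. It is an assumption-laden simplification rather than formula (10) itself: the names `w_var` and `w_op` for the two coefficients are hypothetical, the weights are assumed to multiply the second and third sub-terms linearly, and the travel time moments cover a single OD pair and in-vehicle time only (waiting time is ignored).

```python
def travel_time_moments(times, p, mu, var):
    """Mean and variance of total in-vehicle time for one OD pair.

    Total time is sum_a times[a] * v_a with v_a = p[a] * q, so by
    formulae (6) and (7) its variance is var * (sum_a times[a] * p[a]) ** 2.
    """
    per_trip = sum(t * pa for t, pa in zip(times, p))
    return mu * per_trip, var * per_trip ** 2


def total_cost(mean_tt, var_tt, freqs, unit_costs, w_var, w_op):
    """Weighted sum of the three sub-terms of the upper-level objective."""
    operating = sum(c * f for c, f in zip(unit_costs, freqs))
    return mean_tt + w_var * var_tt + w_op * operating


# Hypothetical two-link example.
mean_tt, var_tt = travel_time_moments([10.0, 15.0], [0.5, 0.5], mu=100.0, var=50.0)
z = total_cost(mean_tt, var_tt, freqs=[0.1, 0.1], unit_costs=[3000.0, 3000.0],
               w_var=1.5, w_op=1.0)
```

Raising `w_var` makes the variance term dominate, which is one way to picture a more risk-averse decision maker.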
Because the bus frequency is the variable of the upper-level model, the following work expresses objective (10) as a function of the frequency variable only. According to formulae (6) and (7), the first sub-term of the objective can be written, after the analytical deduction, in terms of the variable $f_{a^-}^r$, which represents the total frequency of the links that belong to the optimal strategy for reaching node $r$ and start from the upstream node of link $a$. The mean transit link's proportional flow from the lower-level model is known in this expression, so the bus frequency is the only variable. The second sub-term is likewise expressed in terms of the transit link's proportional flow covariance $\operatorname{Cov}(v_a^w, v_b^w)$, which is obtained from the lower-level model.

Upper-Level Model Formulation.
After the above deduction of transit network robustness performance, the form of upper-level model can be expressed as follows.

Upper-Level Model
Objective function: it reflects the tradeoff between the passengers and the transit operating company. The passengers expect higher frequencies, which reduce waiting time and improve travel time robustness, whereas the company tends to provide lower frequencies because of the operating costs.
Constraint (14): the number of operating buses must not exceed the total number of buses.
Constraint (15): at least one bus is guaranteed to run on each line.
In the above constraints, the line delay stands for the sum of the total delay time at stops and the recovery time at the terminal station; T represents the total number of buses in the transit operating company.

Solution Algorithm
The upper-level model provides the bus frequencies to the lower-level model, and the resulting transit link's proportional flow eigenvalues are fed back to the upper-level model. This leader-follower (Stackelberg) game does not end until the upper and lower models reach equilibrium. To date, the majority of bi-level model solution methods are heuristic. Specifically, when the objective function of the upper-level model is nonlinear, the feasible approaches can be classified into decomposition-based algorithms [17], gradient-based algorithms [18], and intelligent algorithms [19]. A shortcoming of the first and second types is that the algorithm may converge to a local optimum. Algorithms of the third type, in contrast, are mainly used to search for a global optimum. Because of the complexity of this bi-level model, a widely used intelligent algorithm, the GA (shown in Figure 3), is adopted to solve the problem. Its key modules are given in the following subsections.

Encoding.
There are many transit lines in an urban transit network. In this situation, binary coding of the frequencies may reduce the efficiency of the search and make the result inaccurate; thus it is not selected as the coding mode. Instead, a real-coded scheme is selected to represent the frequencies, and a population member is therefore the vector of line frequencies $(f_1, f_2, \ldots, f_L)$, where $L$ represents the number of transit lines.

Initial Population Generation.
To ensure that the frequencies in the initial population satisfy constraints (14) and (15) and are diverse, this study first generates a suitable random number of buses for each line and then converts these numbers to initial frequencies. The total number of buses $T$ is known a priori, and the number of operating buses on line $l$ is denoted by $n_l$; the frequency generation process for the initial population is described as follows.
A probabilistic procedure is performed in the following loop to produce a series of random bus number vectors. First, for every bus line $l$, a random number of buses is drawn uniformly from the feasible integer set $[1, T - L + 1]$, where $L$ is the number of lines; second, the vector is discarded if the total number of buses is larger than $T$. The loop continues until the required population size is reached. After this process, all the bus numbers are transformed into frequencies. This process not only satisfies constraints (14) and (15) but also ensures the diversity of the initial population.
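The loop described above amounts to rejection sampling and can be sketched as follows. The function names and the conversion rule (frequency equals number of buses divided by a cycle time that includes the line delay) are assumptions made for this illustration; the text does not spell out the conversion.

```python
import random


def initial_bus_vectors(n_lines, fleet_size, pop_size, seed=None):
    """Draw bus-count vectors uniformly and reject those exceeding the fleet."""
    if fleet_size < n_lines:
        raise ValueError("fleet too small to give every line one bus")
    rng = random.Random(seed)
    upper = fleet_size - n_lines + 1  # largest count one line can take
    population = []
    while len(population) < pop_size:
        buses = [rng.randint(1, upper) for _ in range(n_lines)]
        if sum(buses) <= fleet_size:  # fleet constraint
            population.append(buses)
    return population


def buses_to_frequencies(buses, cycle_times):
    """Assumed conversion: frequency (veh/min) = buses / cycle time (min)."""
    return [n / t for n, t in zip(buses, cycle_times)]


# Small test case from the text: 4 lines, 32 buses, population of 40.
pop = initial_bus_vectors(n_lines=4, fleet_size=32, pop_size=40, seed=1)
freqs = buses_to_frequencies(pop[0], cycle_times=[30.0, 30.0, 30.0, 30.0])
```

The acceptance rate of this scheme drops quickly as the number of lines grows, so for large networks a direct sampler (for example, drawing a random composition of the fleet) may be worth considering.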

Fitness Computation.
In the GA search, the total cost of the transit system is used as the fitness. If an individual violates a constraint, its fitness is set to a large number. Because the objective is to minimize the fitness, this large number acts as a penalty for the infeasible individual, which is then more easily eliminated in the genetic operations. If the individual meets all the constraints, the optimal strategy transit assignment is carried out for that individual to determine the corresponding fitness value.
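A penalty-based fitness of this kind might look like the sketch below. The feasibility check assumes that the buses needed on a line equal its frequency times its cycle time, and `evaluate_total_cost` is a hypothetical callable standing in for the lower-level assignment plus the upper-level objective; neither is specified in this form by the text.

```python
PENALTY = 1e7  # large fitness assigned to infeasible individuals


def fitness(freqs, cycle_times, fleet_size, evaluate_total_cost, penalty=PENALTY):
    """Return the total cost of a feasible individual, else the penalty."""
    # Assumed rule: buses needed on a line = frequency * cycle time.
    buses_needed = [f * t for f, t in zip(freqs, cycle_times)]
    within_fleet = sum(buses_needed) <= fleet_size
    one_bus_each = all(n >= 1 for n in buses_needed)
    if not (within_fleet and one_bus_each):
        return penalty
    # Only feasible individuals trigger the (expensive) assignment step.
    return evaluate_total_cost(freqs)


# Usage with a dummy cost function standing in for the bi-level evaluation.
ok = fitness([0.2, 0.2], [20.0, 20.0], 10, lambda f: 123.0)
too_many = fitness([0.2, 0.2], [20.0, 20.0], 5, lambda f: 123.0)
too_few = fitness([0.01, 0.2], [20.0, 20.0], 10, lambda f: 123.0)
```

Checking feasibility before calling the assignment avoids spending computation on individuals that will be eliminated anyway.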

Genetic Operation.
The genetic operations adopted in this study are as follows: roulette wheel selection is used for selection, two-point crossover for crossover, and Gaussian mutation for mutation. All of these operators are described in detail in the documentation of the genetic algorithm toolbox developed at Sheffield University. In addition, the search terminates when the best fitness does not change during 10 successive generations.
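The three operators and the stopping rule can be sketched in plain Python as follows. These are generic textbook versions, not the Sheffield toolbox routines; in particular, the conversion of costs to selection weights (worst cost minus own cost), the mutation step size `sigma`, and the lower bound that keeps frequencies positive are assumptions of this example.

```python
import random


def roulette_select(population, costs, rng):
    """Roulette wheel selection for minimization: lower cost, larger slice."""
    worst = max(costs)
    weights = [worst - c + 1e-9 for c in costs]
    return rng.choices(population, weights=weights, k=len(population))


def two_point_crossover(p1, p2, rng):
    """Swap the segment between two cut points (needs at least 3 genes)."""
    i, j = sorted(rng.sample(range(1, len(p1)), 2))
    return p1[:i] + p2[i:j] + p1[j:], p2[:i] + p1[i:j] + p2[j:]


def gaussian_mutation(ind, rate, sigma, rng, lower=1e-3):
    """Perturb each gene with probability rate; keep frequencies positive."""
    return [max(lower, g + rng.gauss(0.0, sigma)) if rng.random() < rate else g
            for g in ind]


def stalled(best_history, patience=10):
    """True if the best fitness has not changed for patience generations."""
    if len(best_history) <= patience:
        return False
    recent = best_history[-(patience + 1):]
    return max(recent) == min(recent)


rng = random.Random(0)
parent1, parent2 = [0.69, 1.15, 0.58, 1.11], [0.5, 0.5, 0.5, 0.5]
child1, child2 = two_point_crossover(parent1, parent2, rng)
mutant = gaussian_mutation(child1, rate=0.3, sigma=0.05, rng=rng)
chosen = roulette_select([parent1, parent2], [40147.0, 52000.0], rng)
```

Offspring produced by crossover and mutation can violate the fleet constraints, which is why the penalty in the fitness computation is still needed after the initial population has been made feasible.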

Test Network
Small Test Transit Network.
The case of Figure 2 is used to illustrate the effectiveness of the algorithm. The running times and OD demand of the aforementioned transit links are adopted here. The cost of each line per unit frequency is assumed to be 3000, which has already been transformed to a time cost (min), and the operating cost coefficient is therefore set to 1. For each line, the line delay is 0, and the total number of buses is 32. According to [16], the magnitude of the travel time variance is found to be higher than that of the expected total travel time. Additionally, the survey in [20] showed that the weight-coefficient ratio has an upper bound of 2.25. As a compromise, this study sets the uncertain demand weight coefficient to 1.5. For the genetic parameters, the population size is set at 40. To rapidly create individuals with fitness close to the optimum in the stochastic search, the crossover rate is set at a high value of 0.7. To satisfy Holland's schema theorem [21] (regarded as a necessary condition for reaching the optimum solution), under which fitter individuals increase exponentially in successive generations, the mutation rate is set at a low value of 0.3. The penalty value is set at $10^7$, which is large enough compared with the size of the objective value.
According to the above parameter values, the final objective value is 40,147 min, and the optimal frequencies are {0.69, 1.15, 0.58, 1.11} veh/min.
When the uncertain demand weight coefficient and the operating cost coefficient are changed, the corresponding objective is as shown in Table 3.
If the uncertain demand weight coefficient is fixed, an increase in the operating cost coefficient leads to a decrease in operating cost and an increase in the travel time eigenvalues. This indicates that the more importance the decision maker attaches to operating cost, the more of the passengers' benefits are given up. The reason is that a low operating cost allows fewer bus frequencies, which increases the travel time of the passengers.
If the operating cost coefficient is fixed, an increase in the uncertain demand weight coefficient leads to the opposite results. This indicates that the transit network robustness performance can be guaranteed at the cost of the operating company's benefit. The reason is that the higher bus frequencies obtained at a higher operating cost can satisfy more of the uncertain demand. This trend also indicates that different degrees of risk aversion (the amount of unsatisfied demand) can be reflected by giving different values to the uncertain demand weight coefficient.

Test Transit Network of Liupanshui City.
The algorithm was applied to the Liupanshui transit network, whose peak-hour transit OD demand was acquired by combining household travel surveys and on-board surveys. Liupanshui is a medium-size city in western China, with an urban area of 293 km² and a population of 460,000, and its transit network is uncongested. The GA procedure takes around 113 min on an Intel Core i7-3520M CPU at 2.9 GHz. Its convergence process is shown in Figure 5. The final resulting frequencies are {0.95, 0.21, 0.28, 0.35, 0.28, 0.67, 0.9, 0.64, 0.4, 0.75, 1.07, 0.7, 0.57, 0.51, 0.18, 0.21} veh/min. Three lines (1, 7, and 11) have high frequencies, for the reasons listed below.
(i) The travel demands between the east and west districts are high. Both lines 1 and 7 serve the east-west demand.
(ii) Line 1 runs through central business district and line 7 passes the train station.
(iii) Line 11 is only 5 kilometers long and operates in the central business district.
Furthermore, compared with the current situation (shown in Figure 5) in the city, implementing the optimal solution from this study could decrease the value of the objective by about 6%.
The effect of the uncertain demand coefficient and the fleet size on the objective is shown in Figure 6, where the mean travel demand is assumed to be fixed.
With the fleet size fixed, the upper-level objective increases with the uncertain demand coefficient. This shows that an increase in travel demand variance leads to an increase in the objective value (how the components of the objective are affected by the uncertain demand coefficient is shown in Figure 7).
With the uncertain demand coefficient fixed, an increase in fleet size generally reduces the objective value. However, when the uncertain demand coefficient is very small, the objective value is almost unchanged. This indicates that the uncertainty of demand affects how many of the buses are used. If the coefficient is small, only a few buses need to be dispatched to improve the objective, and many buses sit idle in the operating company. If the coefficient is large, more buses need to be scheduled to meet the uncertain demand. The adequate fleet size can therefore be determined according to the value of the uncertain demand coefficient. For the previous field case, the uncertain demand coefficient is 0.2. Figure 6 then indicates that a fleet size of 280 would be adequate, both preventing the waste of buses and satisfying the uncertain demand.
The impact of the uncertain demand coefficient on the objective terms is shown in Figure 7, where the fleet size is equal to 260. As the uncertain demand variance increases, the operating cost and the travel time variance increase simultaneously, whereas the mean travel time decreases. Operating cost and travel time variance follow the same trend because increased concern for passengers' benefits induces more operating cost to be spent on high bus frequencies. The mean travel time moves in the opposite direction because higher bus frequencies reduce the mean total travel time of the passengers.

Conclusion
(1) This paper has analyzed the determination of optimal bus frequencies under uncertain demand by applying a bi-level model. The lower-level model assigns a mean number of passengers to the transit network and acquires the transit link's proportional flow variance, based on the optimal strategy; the upper-level model determines the line frequencies, aiming to minimize the total cost of the transit system, subject to the constraints on the overall size of the bus fleet and the line fleet. Uncertain demand, passengers' benefits, and the bus operating cost of the company are all taken into account in this model.
(2) Results from a small test network show that the GA method is successful in solving the bus frequency determination problem, and this method was also applied in the uncongested city of Liupanshui in western China. The total cost of the transit system in Liupanshui can be reduced by about 6% via this method.
(3) Sensitivity analysis shows the relationship among the uncertain demand coefficient, the fleet size, and the upper-level objective. In particular, at a low uncertain demand coefficient, an increase in fleet size causes no change in the upper-level objective. This rule would be beneficial for determining a bus-buying plan.
(4) Future studies can be identified as follows.First, basic parameters such as the cost of each line per unit frequency should be determined with more accuracy.Second, scientific survey methods should be developed to acquire the eigenvalues of transit OD demand and the coefficient of total travel time variance.Third, the objective under a congested transit network should be deduced using the analytical method, which will be useful for optimizing bus frequencies of a metropolitan city with demand uncertainty.

Figure 3: GA for bus frequency determination with uncertain demand.

Figure 5: GA convergence process with crossover and mutation rates equal to 0.7 and 0.3.
Figure 6: The effect of the uncertain demand coefficient and the fleet size on the objective (axes: uncertain demand coefficient, fleet size, upper-level objective ×10^6).

Figure 7: The impact of the uncertain demand coefficient on the objective terms, with the fleet size equal to 260.

Table 1: Mean matrix of transit travel demand.

Figure 2: Small transit network.

Because many parallel transit lines pass through the same road segment, the extended transit network contains a large number of links and nodes. It is difficult to show so many elements in one picture; thus the extension figure is not provided herein.

Table 2: Validation test of transit link's proportional flow variance.

Table 3: Objective analysis according to different parameter values.

Table 4: Transformed time cost per frequency and line delay (min).

The transit network includes 270 stops and 260 buses, and it resembles a long strip connecting the eastern and western ends of the city, as shown in Figure 4. It is assumed that the true OD demand follows a normal distribution with the obtained demand value as the mean. According to [9], the standard deviation of demand can be set as the mean value multiplied by the uncertain demand coefficient (this case sets the coefficient to 0.2, as Xu and Gao [9] did). The transformed time cost per frequency and the line delay, whose values depend on the line bus type and line length, were investigated by the bus operating company of Liupanshui and are shown in Table 4. The population size and other parameters have the same values as in the former case.