Bus Route Design with a Bayesian Network Analysis of Bus Service Revenues

A Bayesian network is used to estimate revenues of bus services in consideration of the effect of bus travel demands, passenger transport distances, and so on. In this research, the area X in Beijing has been selected as the study area because of its relatively high bus travel demand and, in contrast, its unsatisfactory bus services. The results suggest that the proposed Bayesian network approach can rationally predict the probabilities of different revenues of various route services, from the perspectives of both satisfying passenger demand and decreasing bus operation cost. In this way, the existing bus routes in the studied area can be optimized for their most probable high revenues.


Introduction
The bus network design problem has become one of the most common hard-to-solve issues in many cities around the world. Unsatisfactory bus services resulting from irrational bus routes reduce their revenues. Therefore, a good design of a bus route is essential to satisfy the bus travel demand and improve the bus service revenue. Many methods for rationally designing a bus route or network have been continually developed. Ceder and Israeli [1] combined mathematical programming approaches and decision-making techniques to solve the transit network design problem. Chien and Spasovic [2] studied a grid bus transit system to optimize route spacing, station spacing, headway, and fare with the objectives of maximum total operator profit and social welfare. Lee and Vuchic [3] offered an iterative approach to solve the relationship between the variable transit trip demand and the transit network design, under a given fixed total demand. Wirasinghe and Vandebona [4] considered the express route planning problem based on both grid and nongrid road networks to minimize operating costs, passenger access costs, waiting time, and traveling time. Tirachini et al. [5] introduced a social welfare maximization model with the interplay between congestion and crowding externalities, with the aim of optimizing the design of urban bus routes. Ceder et al. [6] used a mathematical modeling approach which includes considerations of uneven topography to design the placement of stops along a single bus route.
The majority of researchers tried to minimize the total travel time, or the generalized cost. The genetic algorithm (GA), Tabu search, simulated annealing methods, and so forth have all played important roles in recent research on bus network design. Zhao and Zeng [7] combined GA and simulated annealing to minimize transfers with reasonable route directness while maximizing service coverage. Fan and Machemehl [8] provided a multiobjective model for considering the design of public transportation networks in the case of variable demand, and the solution methodology was based on the Tabu search method. Pacheco et al. [9] proposed an approach based on a local search strategy to solve the route design and bus assignment problem under the effect of different demands. Szeto and Wu [10] combined GA and a neighborhood search heuristic to simultaneously perform the suburban route design. Nayeem et al. [11] developed two versions of GA-based metaheuristics to discuss the transit routing problem, with the aim of minimizing the travel time and the number of transfers simultaneously.
Inherently, bus service revenues are sensitive to passenger demand, operating costs, and so on. Therefore, there is a strong need to develop a new approach from a comprehensive perspective that takes into account the dependency and uncertainty of the correlated factors of bus routes. A Bayesian network (BN) can be used to discover the overall dependency structure of a large number of variables under uncertainty and incomplete information and to predict future events such as traffic accidents with comparatively high accuracy [12-14]. Accordingly, this research proposes an approach based on BN to predict the probabilities of different revenues of various route services, with the aims of satisfying passenger demand and decreasing bus operation cost. In this way, the existing bus routes in the studied area can be optimized for their most probable high revenues.
This paper is organized as follows. Section 2 introduces the study area, represented as X here. The proposed processes of BN learning and probabilistic inference are given in Section 3. Section 4 presents the application of the methodology to the optimization of bus routes in X. Section 5 summarizes conclusions and comments.

Study Area
The proposed BN has been implemented on real bus routes in X, which is displayed in Figure 1. Data were obtained from the administrative statistics of the government department. The routes in Figure 1 are mainly concentrated in the north area of X. The uneven distribution of routes causes an uneven distribution of bus passenger demands, and it also causes traffic congestion which makes the traveling speed excessively low in the north area of X. It is found that the ratio of bus routes whose average load factor is below 0.3 [15] reaches 25.00%. Most routes are relatively long and their daily passenger transport volumes are comparatively low, so their revenue per day is often unable to cover their costs. About 50% of bus routes are very detoured, which causes the average nonlinear coefficient of those routes to exceed 1.4. Because of its relatively high bus travel demand and unsatisfactory bus services, X has been selected as the study area of this research to explain the proposed BN approach.

Bayesian Network.
The model structure of a BN, which combines principles from graph theory, probability theory, and statistics, is a directed acyclic graph $G$ defined by the respective sets of nodes and directed edges [16,17]. The nodes represent random variables, and the directed edges represent relationships between variables. Associated with each node of a BN is a conditional probability distribution (CPtable) quantifying how much a node depends on its parent node(s) [18,19].
Formally, the directed acyclic graph $G$ encodes conditional independence between the variables. Each variable $X_i$ is independent of its nondescendants in the graph given the state of its parents. Thus, the joint distribution of the $n$ variables $X_1, \ldots, X_n$, described by (1), is the product of the conditional distributions of each variable given its parent node(s):
$$P(x_1, \ldots, x_n) = \prod_{i=1}^{n} P(x_i \mid \pi_i), \tag{1}$$
where $n$ represents the number of all variables, $\pi_i$ is an instantiation of the set of parent node(s) of $X_i$, and $x_i$ denotes the state of $X_i$.
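As a minimal sketch of the factorization in (1), the following Python fragment evaluates the joint probability of one full assignment in a toy three-node network. The structure (ADP → ALF ← FOV), the state names, and every probability value are illustrative assumptions; they are not the network or the CPtable learned in this study.

```python
# Toy BN: ADP -> ALF <- FOV. Structure and numbers are hypothetical.
parents = {"ADP": (), "FOV": (), "ALF": ("ADP", "FOV")}

# CPtables: (state, tuple of parent states) -> conditional probability.
cpt = {
    "ADP": {("low", ()): 0.6, ("high", ()): 0.4},
    "FOV": {("few", ()): 0.7, ("many", ()): 0.3},
    "ALF": {
        ("low", ("low", "few")): 0.8, ("high", ("low", "few")): 0.2,
        ("low", ("low", "many")): 0.95, ("high", ("low", "many")): 0.05,
        ("low", ("high", "few")): 0.1, ("high", ("high", "few")): 0.9,
        ("low", ("high", "many")): 0.5, ("high", ("high", "many")): 0.5,
    },
}


def joint_probability(assignment):
    """Multiply P(node state | parent states) over all nodes."""
    p = 1.0
    for node, node_parents in parents.items():
        parent_states = tuple(assignment[q] for q in node_parents)
        p *= cpt[node][(assignment[node], parent_states)]
    return p


# P(ADP=high) * P(FOV=few) * P(ALF=high | ADP=high, FOV=few)
p = joint_probability({"ADP": "high", "FOV": "few", "ALF": "high"})
```

Storing one small table per node, rather than one table over all variables, is what keeps the number of parameters manageable as the network grows.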
The conditional probabilities that appear in the product form are stored in the CPtable as the parameters. When the topology and the CPtable are completed, Bayes' theorem [20] can be used as a tool for performing probabilistic inference, which can be thought of as a message (e.g., the available state of a certain variable) passing process in the BN. The theorem is shown in (2):
$$P(X_i \mid X_j) = \frac{P(X_j \mid X_i)\, P(X_i)}{P(X_j)}, \tag{2}$$
where $P(X_i \mid X_j)$ is the posterior probability distribution of $X_i$ given $X_j$, $P(X_j \mid X_i)$ is a prediction term for $X_j$ given $X_i$, and $P(X_i)$ and $P(X_j)$ are the prior probabilities of $X_i$ and $X_j$, respectively. A value that is assigned to a variable will be propagated through the network and will update the marginal posterior distributions of other nodes as explained above.
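To make the update rule concrete, here is a small numerical sketch of Bayes' theorem for one discrete variable. The prior and likelihood values are invented for illustration, and the function name `posterior` is hypothetical.

```python
def posterior(prior, likelihood):
    """Bayes' theorem for a discrete variable.

    prior:      {state: P(state)}
    likelihood: {state: P(evidence | state)}
    Returns {state: P(state | evidence)}.
    """
    # P(evidence) by the law of total probability.
    evidence = sum(prior[s] * likelihood[s] for s in prior)
    return {s: prior[s] * likelihood[s] / evidence for s in prior}


# Hypothetical numbers: prior belief about RUD, and how likely a high
# average load factor is under each RUD state.
prior_rud = {"high": 0.3, "low": 0.7}
p_alf_high_given_rud = {"high": 0.8, "low": 0.25}

post = posterior(prior_rud, p_alf_high_given_rud)
```

With these assumed numbers, observing a high load factor raises the belief in high revenue per unit distance from 0.3 to 0.24 / 0.415, or roughly 0.58.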

Determining Variables.
Before assessing the graph topology of a BN and the parameters of the CPtable, the variables represented by the nodes of the BN ought to be decided according to the characteristics of the bus routes in the research area and the availability of the data for the BN analysis. In this study, the revenue per unit distance (RUD) is used as the query variable to measure the operating quality of bus routes. It is calculated by (3):
$$\mathrm{RUD}_i = \frac{R_i}{\mathrm{LEN}_i}, \tag{3}$$
where $R_i$ denotes the bus service revenues of route $i$ (unit: Yuan RMB) and $\mathrm{LEN}_i$ represents the length of route $i$ (unit: km).
Besides the RUD, the nodes of the BN developed in this research also represent fitted-out vehicles (FOV), average daily passengers (ADP), traveling speed (TRS), average load factor (ALF), length (LEN), nonlinear coefficient (NOC), and average station spacing (ASS) for the study area. The FOV are the fitted-out vehicles of route $i$ (unit: vehicles), the ADP represents the average daily passengers of route $i$ (unit: passengers), and the ASS denotes the average station spacing of route $i$ (unit: m). The other variables are given by (4)-(6):
$$\mathrm{TRS}_i = \frac{\mathrm{LEN}_i}{\mathrm{TT}_i}, \tag{4}$$
$$\mathrm{ALF}_i = \frac{Q_i}{C_i \times \mathrm{run}_i}, \tag{5}$$
$$\mathrm{NOC}_i = \frac{\mathrm{LEN}_i}{\mathrm{SL}_i}, \tag{6}$$
where $\mathrm{LEN}_i$ represents the length of route $i$ (unit: km), $\mathrm{TT}_i$ represents the entire traveling time of route $i$ (unit: h), $Q_i$ denotes the average daily passenger transport volumes of route $i$ (unit: passengers), $C_i$ denotes the rated passenger transport volumes of a bus of route $i$ (unit: passengers), $\mathrm{run}_i$ represents the daily schedules of route $i$ (unit: trips), and $\mathrm{SL}_i$ indicates the linear distance of route $i$ (unit: km).
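The derived indicators can be sketched in a few lines of Python. The formulas below are assumptions inferred from the variable descriptions above: RUD as revenue over length, TRS as length over traveling time, ALF as daily passenger volume over rated capacity times daily trips, and NOC as route length over linear distance. The `Route` class, its field names, and the sample figures are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Route:
    revenue: float           # bus service revenue (Yuan RMB)
    length_km: float         # LEN, route length (km)
    travel_time_h: float     # TT, entire traveling time (h)
    daily_passengers: float  # average daily passenger transport volume
    rated_capacity: float    # rated passenger volume of one bus
    daily_trips: float       # run, daily schedules (trips)
    linear_km: float         # SL, linear distance between terminals (km)


def indicators(r):
    """Derived route indicators under the assumed definitions."""
    return {
        "RUD": r.revenue / r.length_km,
        "TRS": r.length_km / r.travel_time_h,
        "ALF": r.daily_passengers / (r.rated_capacity * r.daily_trips),
        "NOC": r.length_km / r.linear_km,
    }


# Made-up route, not taken from the study data.
example = indicators(Route(6000.0, 30.0, 1.5, 4800.0, 80.0, 100.0, 20.0))
```

For this made-up route, the sketch gives a revenue of 200 Yuan per km, a speed of 20 km/h, a load factor of 0.6, and a nonlinear coefficient of 1.5.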
In this research, the values of all the variables represented by the nodes of the proposed BN for X are calculated from January to May 2016 according to the administrative statistics of the government department. The criteria of the discrete states of the variables are listed in Table 1. The discrete thresholds of the RUD, the ADP, and the FOV are based on the respective average values of the historical data. According to Sun et al. [21], 22.00 km/h and 30.00 km/h are selected as the discrete thresholds of the TRS. The discrete thresholds of the ALF are 0.30, 0.60, and 1.20 [15]. According to CBIP [22], the discrete thresholds of the NOC are 1.10, 1.40, and 2.00 and those of the ASS are 500.00 m and 800.00 m. The discrete thresholds of the LEN are 27.17 km and 36.23 km [23].
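A short sketch of the discretization step follows, using the thresholds quoted in the text for TRS, ALF, NOC, ASS, and LEN. The thresholds for RUD, ADP, and FOV depend on historical averages that are not given here, so they are omitted. Whether a value exactly on a threshold belongs to the lower or the upper state is an assumption of this sketch (it is placed in the upper state); Table 1 is the authoritative definition.

```python
from bisect import bisect_right

# Thresholds as quoted in the text (units: km/h, ratio, ratio, m, km).
THRESHOLDS = {
    "TRS": [22.00, 30.00],
    "ALF": [0.30, 0.60, 1.20],
    "NOC": [1.10, 1.40, 2.00],
    "ASS": [500.00, 800.00],
    "LEN": [27.17, 36.23],
}


def discretize(variable, value):
    """Return the 0-based index of the interval that contains value.

    k thresholds define k + 1 states; a value equal to a threshold is
    assigned to the upper state (an assumption of this sketch).
    """
    return bisect_right(THRESHOLDS[variable], value)


state_trs = discretize("TRS", 25.0)   # between 22 and 30 km/h
state_noc = discretize("NOC", 1.5)    # between 1.40 and 2.00
```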

Estimating Graph Topology and Parameters.
After determining all the variables represented by the nodes of a BN, the structure of the BN (i.e., the dependencies between different variables) and the parameters of the CPtable (i.e., the strengths of the dependencies, as encoded by the entries in the CPtable) can be estimated for the specific problem. A common and simple approach to BN structure learning is to rank graph structures via a search-and-score method, which measures how well each model structure fits the data, in order to find the globally optimal BN structure [24]. Nevertheless, the search-and-score method requires prior information about the ordering of the nodes to reduce the search space [25,26]. Unfortunately, this prior information about node ordering is not always given in advance. To learn the BN accurately and efficiently from training data without adequate prior information, the directional dependence analysis (DDA) algorithm [26], which is based on the dependency analysis of variables, is adopted to learn the BN structure. The DDA algorithm consists of three sequential steps: building the undirected BN, orienting edges, and thinning the BN.
Building Undirected BN.
Make a list $L$ of every pair of distinct nodes $(X, Y)$ whose mutual information $I(X, Y)$, calculated by (7), exceeds a threshold value $\varepsilon$. Sort the pairs $(X, Y)$ in $L$ in descending order of $I(X, Y)$. Connect a pair of nodes by an edge. Move the edge $e_1$ with the maximal $I(X, Y)$ from $L$ to a list $S$, and then move the edge $e_2$ that includes one of the nodes of the edge $e_1$ from $L$ to $S$ according to the conditional mutual information $I(X, Y \mid C)$ computed by (8):
$$I(X, Y) = \sum_{x, y} P(x, y) \log \frac{P(x, y)}{P(x)\, P(y)}, \tag{7}$$
$$I(X, Y \mid C) = \sum_{x, y, c} P(x, y, c) \log \frac{P(x, y \mid c)}{P(x \mid c)\, P(y \mid c)}. \tag{8}$$
Add an edge between a pair of nodes $(X, Y)$ remaining in $L$ when they cannot be separated by a set of relevant conditional independence tests based on their conditional mutual information $I(X, Y \mid C)$ with respect to the set of "evidence" variables (i.e., the condition set) $C$.
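The quantities referred to as (7) and (8) can be estimated from discrete samples as sketched below, using the standard definitions of mutual information and conditional mutual information with empirical frequencies. The natural logarithm, the helper `candidate_pairs`, and the tiny data set are assumptions for illustration; the sketch covers only the scoring and sorting of candidate pairs, not the full DDA procedure.

```python
from collections import Counter
from math import log


def mutual_information(xs, ys):
    """Empirical I(X, Y) from two equally long sequences of states."""
    n = len(xs)
    cxy, cx, cy = Counter(zip(xs, ys)), Counter(xs), Counter(ys)
    return sum((c / n) * log(c * n / (cx[x] * cy[y]))
               for (x, y), c in cxy.items())


def conditional_mutual_information(xs, ys, zs):
    """Empirical I(X, Y | Z) for a single conditioning variable Z."""
    n = len(xs)
    cxyz = Counter(zip(xs, ys, zs))
    cxz, cyz, cz = Counter(zip(xs, zs)), Counter(zip(ys, zs)), Counter(zs)
    return sum((c / n) * log(c * cz[z] / (cxz[(x, z)] * cyz[(y, z)]))
               for (x, y, z), c in cxyz.items())


def candidate_pairs(data, eps):
    """Pairs whose mutual information exceeds eps, largest first.

    data: {variable name: list of discrete states}, all the same length.
    """
    names = sorted(data)
    scored = [(mutual_information(data[a], data[b]), a, b)
              for i, a in enumerate(names) for b in names[i + 1:]]
    return sorted((t for t in scored if t[0] > eps), reverse=True)


# Hypothetical samples: A and B are identical, C is independent of both.
data = {"A": [0, 0, 1, 1], "B": [0, 0, 1, 1], "C": [0, 1, 0, 1]}
pairs = candidate_pairs(data, eps=0.01)
```

In this toy data set only the pair (A, B) survives the threshold, because the two variables carry the same information while C is independent of both.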

Orienting Edges.
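After building the undirected BN, the direction of an undirected edge between two nodes $X$ and $Y$ is oriented using the conditional relative ability to predict (CRAP) calculated in (9). According to the causal semantic theory [19], if condition (10) holds, that is, if the CRAP of the direction $X \rightarrow Y$ exceeds the CRAP of the direction $X \leftarrow Y$ by more than a threshold $\delta$ ($\delta > 0$), and no loops exist in the causal direction $X \rightarrow Y$, $X$ is chosen as a parent node of $Y$. Similarly, if condition (11) holds, that is, if the CRAP of the direction $X \leftarrow Y$ exceeds the CRAP of the direction $X \rightarrow Y$ by more than $\delta$, and no loops exist in the causal direction $X \leftarrow Y$, $Y$ is chosen as a parent node of $X$. The method based on the causal semantic theory cannot orient every edge in a network, because some causal relationships are too weak. In this research, the Minimum Description Length (MDL) principle [27,28] is used to deal with the weak causal semantics of an arc $X - Y$.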

Thinning BN.
Find a minimal cut-set $C(X, Y)$ separating $X$ from $Y$ in the BN graph using heuristic approaches [19].
Remove an edge $X \rightarrow Y$ ($X \leftarrow Y$) if $I(X, Y \mid C(X, Y))$ is less than a certain threshold value. Iteratively examine the edges in the BN graph.
By applying the DDA algorithm to the statistical data of the variables represented by the nodes of the BN, the structure of the BN developed for the estimation of bus service revenues of X is obtained, as shown in Figure 2. In this study, the mean absolute percentage error (MAPE) is used to validate the accuracy of the developed BN model by calculating the average percentage difference between predicted values and observed ones given validation data [29]. The MAPE of almost every variable in the BN is less than 0.10, which indicates good accuracy of the BN model.
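The MAPE itself is simple to compute, as the sketch below shows. It returns a fraction, so that 0.10 corresponds to 10%, which matches the scale of the threshold quoted above. How predicted and observed values are paired for each BN variable is not specified in the text, so the sample numbers are purely illustrative.

```python
def mape(observed, predicted):
    """Mean absolute percentage error, returned as a fraction.

    Observed values must be nonzero, since each error is divided by
    the corresponding observed value.
    """
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs((o - p) / o)
               for o, p in zip(observed, predicted)) / len(observed)


# Illustrative values only: errors of 10% and 5% average to 7.5%.
error = mape([100.0, 200.0], [90.0, 210.0])
```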

Probabilistic Inference.
Given the structure of the BN and the parameters of the CPtable, inference queries can be evaluated by making full use of the conditional independence information between different variables. Predictive inference (also called top-down reasoning) and diagnostic inference (also called bottom-up reasoning) are considered as two main types of probabilistic inference. They are based on the evidence nodes connected to the queried node through its parent and children nodes, respectively. In consideration of the predictive property of the probabilistic inference in this study, the Clique tree (CT) [13,30], which is one of the major approaches to inference in multiply connected BN, is applied for the top-down reasoning because of its accurate inference computation and high computational efficiency.
The CT is an undirected tree.Each tree node consists of a set of nodes from the BN which is called Clique.The CT takes the mechanism of information propagation with the steps of distributing evidence and collecting evidence.
According to the differences in their information propagation schemes, the CT method can be further categorized into the Lauritzen-Spiegelhalter method, the HUGIN method, and the Shenoy-Shafer method. The HUGIN algorithm [31] is employed in this study because of its computational efficiency. The process of the HUGIN algorithm, which includes the sequential steps of clustering and propagation, is shown in Figure 3.

Clustering.
First, an initial moral graph $G_M$ is constructed by making an undirected copy of the BN and then augmenting it as follows. Let $V$ systematically range over all nodes in $G_M$. For each node $V$, HUGIN adds to $G_M$ an edge between each pair of parent nodes of $V$ if no such edge already exists in $G_M$. Second, HUGIN triangulates the moral graph $G_M$, creating a triangulated graph $G_T$. The Cliques are then obtained from the triangulated graph, following Kjaerulff [32]. Third, a CT is created from the Cliques by inserting separators. Last, the CPtable of the BN is translated into the potential functions of the CT by initialization.
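The first clustering step, moralization, can be sketched as follows: drop edge directions and connect every pair of parents that share a child. The example graph is hypothetical and is not the structure in Figure 2; triangulation and the construction of the Clique tree are not covered by this sketch.

```python
from itertools import combinations


def moralize(parents):
    """Build the moral graph of a DAG.

    parents: {node: iterable of its parent nodes}
    Returns {node: set of neighbours} for the undirected moral graph.
    """
    adj = {v: set() for v in parents}
    for node_parents in parents.values():
        for p in node_parents:
            adj.setdefault(p, set())
    for child, node_parents in parents.items():
        node_parents = list(node_parents)
        # Undirected copy of each original edge parent -> child.
        for p in node_parents:
            adj[child].add(p)
            adj[p].add(child)
        # "Marry" every pair of parents of the same child.
        for a, b in combinations(node_parents, 2):
            adj[a].add(b)
            adj[b].add(a)
    return adj


# Hypothetical DAG: ADP -> ALF <- FOV, ALF -> RUD.
dag = {"ADP": [], "FOV": [], "ALF": ["ADP", "FOV"], "RUD": ["ALF"]}
moral = moralize(dag)
```

In this toy graph the only added edge is ADP - FOV, because the two nodes are both parents of ALF.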

Propagation.
The evidence information is introduced into the potential functions of the CT. Find a query Clique $C_Q$ which includes the query variable (i.e., the revenue per unit distance) as the root cluster. In the collecting evidence step, information is propagated from the furthest Clique to the query Clique $C_Q$ until all the information reaches $C_Q$. Conversely, in the distributing evidence step, information is propagated from the query Clique $C_Q$ to the furthest Clique until all the information reaches each Clique. In this way, the CT built on the basis of the BN achieves global consistency. Thereafter, the posterior probability distributions of the variables represented by the nodes of the BN can be obtained from the potential functions of the Cliques of the CT. Then, the conditional probability of the query variable $X_Q$ given the values of the evidence variables $X_E$ in the BN is equal to
$$P(X_Q \mid X_E) = \frac{P(X_E, X_Q)}{\sum_{X_Q} P(X_E, X_Q)}, \tag{12}$$
where $\sum_{X_Q} P(X_E, X_Q)$ is the marginal probability of $X_E$.
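The conditional probability of the query variable given the evidence can be illustrated by brute-force enumeration on a toy network: sum the joint distribution over the unobserved variables, then normalize by the marginal probability of the evidence. This is a deliberately simple stand-in that produces the same kind of answer on small networks; it is not the HUGIN message-passing scheme, and the structure (ALF → RUD ← NOC) and all numbers are hypothetical.

```python
from itertools import product

states = {"ALF": ["low", "high"], "NOC": ["low", "high"],
          "RUD": ["low", "high"]}

# Hypothetical parameters: priors and P(RUD = high | ALF, NOC).
p_alf = {"low": 0.5, "high": 0.5}
p_noc = {"low": 0.6, "high": 0.4}
p_rud_high = {("low", "low"): 0.3, ("low", "high"): 0.1,
              ("high", "low"): 0.8, ("high", "high"): 0.4}


def joint(a):
    """Joint probability of a full assignment a = {variable: state}."""
    ph = p_rud_high[(a["ALF"], a["NOC"])]
    p_rud = ph if a["RUD"] == "high" else 1.0 - ph
    return p_alf[a["ALF"]] * p_noc[a["NOC"]] * p_rud


def query(target, evidence):
    """P(target | evidence) by enumeration and normalization."""
    hidden = [v for v in states if v != target and v not in evidence]
    scores = {}
    for t in states[target]:
        total = 0.0
        for combo in product(*(states[h] for h in hidden)):
            a = dict(evidence)
            a[target] = t
            a.update(zip(hidden, combo))
            total += joint(a)
        scores[t] = total
    z = sum(scores.values())  # marginal probability of the evidence
    return {t: s / z for t, s in scores.items()}


post = query("RUD", {"ALF": "high"})
```

With these assumed numbers, the probability of high RUD given a high load factor is 0.6 × 0.8 + 0.4 × 0.4 = 0.64. Enumeration grows exponentially with the number of variables, which is why a Clique tree method is preferable on larger networks.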

Results and Discussions
Three types of route services in the study area X are examined here as examples of inference queries. The dashed line, the solid line, and the dot chain line shown in Figure 4 stand for route 26, route 28, and route 36, respectively. The red nodes represent the locations of terminal stations of bus routes in X. In this research, the locations of terminal stations in X are assumed to be fixed. The proposed BN makes it possible to find the maximum predicted probabilities of high bus service revenues and the corresponding states of the variables of route 26, route 28, and route 36, respectively. According to these states, the optimized layouts of the three routes can be designed, as shown in Figure 5. The dashed line, the solid line, and the dot chain line stand for the optimized layouts of route 26, route 28, and route 36, respectively. The red nodes represent the locations of terminal stations of bus routes in X.
Table 2 provides the maximum predicted probabilities of high bus service revenues and the values of variables for the existing routes and the optimized routes. The LEN, the NOC, and the ASS are calculated directly from the configuration of the optimized routes. The ADP is obtained according to the split flow of passenger transport demands of the existing bus routes which partly overlap the optimized routes. Then, the FOV is given based on the passenger transport demands of the optimized routes. The ALF is calculated by simultaneously considering the passenger demands and the fitted-out vehicles. The TRS is calculated from the traveling speed of the existing bus routes which partly overlap the optimized routes.
The passenger demands of the west lines of route 26 are influenced by the split-flow effect, and the trend of the east lines of route 26 is inconsistent with the primary passenger flow. The comparatively low daily passenger transport volumes and the unreasonable layout of route 26 reduce the probability of high bus service revenues for the existing route 26 to 0. Route 26 is adjusted by selecting roads with comparatively few bus routes and stops with relatively large demands in order to increase the passenger transport volumes. Meanwhile, the number of operating vehicles is increased to satisfy the increased passenger demands. As a result, the optimized route 26 attains its most probable high revenues.
Route 28 mainly runs along the expressway, on which more than 40 bus routes are concentrated. The concentrated routes split the passenger transport volumes of route 28. Moreover, route 28 is a relatively long and circuitous route. As a result, the probability of high bus service revenues for the existing route 28 is 0. In order to achieve the highest revenues, route 28 is optimized by selecting roads with comparatively few bus routes and stops with relatively large demands to ensure larger passenger demands, while slightly improving route directness to ensure a lower nonlinear coefficient and increasing the fitted-out vehicles to satisfy the increased passenger demands and keep the average load factor within a reasonable range.
The concentrated routes cause low passenger demands on route 36 through the split-flow effect and make the traveling speed excessively low. Moreover, the route is more detoured, which increases the bus operation cost. The unreasonable layout of route 36 causes the probability of high bus service revenues for the existing route 36 to be 13.22%. Route 36 is modified by selecting roads with comparatively few bus routes to increase the passenger demands and improve the traveling speed, while increasing the number of operating vehicles to satisfy the increased passenger demands and keep the average load factor within a reasonable range. Moreover, eliminating the circuitous alignment ensures a lower nonlinear coefficient. With this modification, route 36 attains its most probable high revenues.

Conclusions
Using the area X in Beijing as the study area, a BN approach has been developed and applied to forecast the probabilities of different revenues of various route services, from the perspectives of both satisfying passenger demand and decreasing bus operation cost. In this way, the existing bus routes in the studied area can be optimized for their most probable high revenues. In the future, combining other methodologies with the BN is worth exploring for a more rational estimation of the bus operating revenues. Extending the proposed solution methodology to other transport route design problems is another direction for future research.

Figure 1: The existing layout of bus routes in X.

Figure 3: The process of CT.

Figure 4: The existing layouts of route 26, route 28, and route 36 in X.

Table 1: The criteria of various states of different variables.
Figure 2: The structure of the BN developed for the study area.

Table 2: Maximum probabilities of bus service revenues.