Analysis and simulation of intervention strategies against bus bunching by means of an empirical agent-based model

In this paper, we propose an Empirically-based Monte Carlo Bus-network (EMB) model as a test bed to simulate intervention strategies to overcome the inefficiencies of bus bunching. The EMB model is an agent-based model which utilizes the positional and temporal data of the buses obtained from the Global Positioning System (GPS) to constitute: (1) a set of empirical velocity distributions of the buses, and (2) a set of exponential distributions of inter-arrival time of passengers at the bus stops. Monte Carlo sampling is then performed on these two derived probability distributions to yield the stochastic dynamics of both the buses' motion and passengers' arrival. Our EMB model is generic and can be applied to any real-world bus network system. In particular, we have validated the model against the Nanyang Technological University's Shuttle Bus System by demonstrating its accuracy in capturing the bunching dynamics of the shuttle buses. Furthermore, we have analyzed the efficacy of three intervention strategies: holding, no-boarding, and centralized-pulsing, against bus bunching by incorporating the rule-set of these strategies into the model. Under the scenario where the buses have the same velocity, we found that all three strategies improve both the waiting and travelling time of the commuters. However, when the buses have different velocities, only the centralized-pulsing scheme consistently outperform the control scenario where the buses periodically bunch together.


I. INTRODUCTION
The pervasiveness [1][2][3][4][5] of bus bunching has been a longstanding problem in public transportation systems. Bus bunching occurs when the distance between two or more buses reduces to near zero. In traffic studies [1][2][3][4][6][7][8][9][10][11][12][13][14][15][16][17][18][19][20][21], bus bunching is often examined through the headway between two buses, i.e. the forward distance or time of a vehicle to its leading vehicle. Bunching occurs naturally as a consequence of the interaction between buses and passengers [2,4,5] after discounting the effects of other traffic. When a leading bus continually spends more time to pick up passengers than a trailing bus, distance between the two buses reduces as the trailing bus travels faster than the leading bus. This causes the buses to bunch. A sustained bunching leads typically to an increased average inter-arrival time between buses and an under-utilised capacity of the trailing bus. Indeed, when two or more buses bunch and move together, they serve essentially as a single unit. This increases the average inter-arrival time between the buses.
In a bid to improve the performance of bus systems, researchers and traffic planners have been proposing intervention methods to alleviate bus bunching. In the 70s, Refs. [1], [6] and [7] formed the first theoretical traffic is conveniently modelled as stochastic noise. In the case of a bus system, buses are represented as agents travelling with a cruise velocity. The velocity often follows a mean value measured from the real system and fluctuates within the measured standard deviation. Generality of such models comes at a cost of being less specific. While velocity of a bus does vary stochastically due to the random existence of other vehicles, certain slow down or speed up are however not random. Typically, buses slow down consistently on hairpin turns, upstream of junctions and before traffic lights. Moreover, certain traffic conditions can be specific to a particular bus route. More than often, such specificity plays an important role in influencing the system's dynamics that an intervention cannot be evaluated accurately by omitting the specificity. One can of course mark down and include all such special segments in the modelling of a fixed bus route, however, velocity of a bus depends also on other microscopic details such as driver's behavior and condition of the road. Inclusion of all these microscopic details increases complexity at the cost of generating key insights from the model.
In this paper, we explore the application of Monte Carlo approach to the agent-based transportation modelling. A Monte Carlo process provides an alternative approach to the modelling of velocities of buses on a fixed route. In the model, velocities of buses at different segments of the route are first collected from the real-world system of interest. Buses are then simulated as agents travelled at velocities drawn randomly from those empirical distributions of velocities at the corresponding segments of the route. In this way, the collective influence of the system's microscopic details is captured together with the stochasticity of the traffic.

II. EMPIRICALLY-BASED MONTE CARLO BUS-NETWORK (EMB) MODEL
The proposed Monte Carlo model is a data driven model which begins with a data collection process on details of the route and dynamics of the transportation system of interest. Here, we are interested in a closed loop bus service in a university's campus, specifically, Nanyang Technological University's shuttle bus system (NTU-SBS). At a resolution of ∆x = 7.5m, the bus route is mapped into a 1-dimensional space-discretized array of length L = 688 under periodic boundary conditions (see Fig. 1). M = 12 of the cells in the array are characterized as 'station' cells to model the 12 unevenly-spaced stations where the campus buses service. Positions of the stations are x = 62, 99, 152, 205, 256, 307, 353, 416, 479, 548, 606 and 667. We mark these as stops 1-12 respectively.
Buses move along the route and stop to pick up passengers at each bus stop. Depending on the time of the day, there could be one to seven buses serving the route at one time. Global Positioning System (GPS) coordi-nates of the buses are collected from 23 Aug 2018 to 11 Nov 2018 through the applet published in the web portal: https://baseride.com/maps/public/ntu/. These raw data were filtered and transformed into 1-D spatiotemporal trajectories i.e. x i,t , which are integers within [1,688] for the position of bus i at time t. Here, the data are collected at a temporal resolution of ∆t = 12s. Details of the data processing methodology and the spatiotemporal resolution can be found in Appendix A. Data from weekends and public holidays were excluded as they are not reflective of an average day of shuttle bus operation.
Velocities of the buses are calculated from the equation: v i,t = (x i,t+1 − x i,t )/∆t. For each of the 688 cells, we collect velocities of buses which pass by the cell. This gives 688 empirical distributions. The average velocity at which buses cruise on the particular segments of the road can be evaluated from each of the unique distribution at a cell. Figure 1 plots the average velocities on different positions along the route. The decrease of average speeds near the bus stops and road junctions are consistent with our heuristic understanding of the system. Figures 2a and 2b plot the empirical velocity distribution P x (v) at the road position where x = 542 and 61, which reflect road segments with the highest and lowest average velocities respectively. Based on our analysis, the average velocity of buses at position x = 542 is 31km/h, which is a reasonable average speed for a bus travelling within a campus. The collected distributions of buses' velocities are to be used in the Monte Carlo process to determine the instantaneous velocities of the buses. Specifically, for each time step t, bus i samples a velocity v ′ i,t from the empirical distribution which corresponds to the bus's position. To account for heterogeneity in the drivers' behaviors, v ′ i,t is multiplied by a dimensionless constant K i which scales the average speed of bus i. With that, for each time step t, bus i moves with an instantaneous velocity of v i,t = K i v ′ i,t . A systematic method of deriving K i from empirical data from NTU-SBS will be presented in Sect. III of the paper.
In our model, buses can occupy the same position on the route, allowing them to overtake freely. When there are passengers at a bus stop, passing buses will stop at the bus stop to allow passenger boarding. More than one bus is allowed to board passengers at a single bus stop simultaneously and when there are two or more buses, the passenger load is shared equally. When passengers on-board a bus have reached their destinations, the bus will stop to allow for alighting. If there is nobody on the bus who wishes to alight and no passengers at the bus stop waiting to board, the bus will continue travelling towards the next stop. In order to ensure that buses depart from the stop, there is vanishing probability of zero velocity in the empirical velocity distribution for bus departure at the bus stops. Details of the methodology can also be found in Appendix A.
As the data collected from the web portal does not include passenger boarding or alighting information, these information had to be estimated from the bus trajectories. It is hypothesized that passengers arrive at each bus stop infrequently (ranging from 20s to 3mins per passenger). As such, passenger arrival would follow a Poisson process, where the inter-arrival time of passengers would follow an exponential distribution of rate λ which describes the rate of passenger arrival. In our Monte Carlo model, passengers are injected to each bus stop j by a Poisson process of mean λ j . Each passenger injected at bus stop j will have a randomly assigned destination bus stop i such that i = j. After injection, each passenger remains at their respective bus stops until the arrival of a bus. When a bus arrives, passengers at a bus stop board on a first-come-first-serve basis. When the approaching bus stop is the destination of any on-board passengers, the bus will stop at the bus stop for passenger alighting. Alighting completes the journey for the passengers. The rate at which boarding and alighting takes place is determined by the parameter l which we set at 1s. Boarding and alighting are set to take place concurrently, which models how most buses have different doors for boarding and alighting. With this, the stoppage time at a bus stop is to be divided by the loading rate l to yield the number of boarding passengers.

III. MODEL VALIDATION
In order to validate our model, we split a day into 4 segments with breakpoints at 1000h, 1400h and 2020h. The reason for this split is that bus drivers in NTU shuttle services work in shift which typically start and end at these breakpoints. Such splitting also corresponds to the different passenger arrival rates throughout the day.
The inter-arrival time of passengers at bus stop j, ∆t arr j , can be deduced from the time-series of bus arrivals at the bus stop. This requires inputs on the time of arrival of bus i at bus stop j, T arr i,j , and the consequential stoppage time τ i . The departure time of this bus at  bus stop j is thus T dep i,j = T arr i,j + τ i . Analogously, the departure time of the previous bus, bus i − 1, at bus stop j is T dep i−1,j = T arr i−1,j + τ i−1 . Assuming a uniform passenger arrival rate, we expect where l is the loading rate and τ i l being the total number of passengers boarding bus i. From the ensemble of bus trajectories, we then form the empirical distribution of ∆t arr j for each bus stop j at each of the four time periods. We observe that the empirical distribution takes the form of an exponential distribution. The mean passenger arrival rate λ j is determined from fitting the respective empirical distribution to the exponential distribution. The results are tabulated in Table I. Note that this methodology assumes that all bus stoppages are solely due to passenger movement and that buses moves off immediately after all passengers have completed boarding and alighting. In reality, this is not true. As buses typically stop at bus bays, buses would require a clear traffic on the road before they are able to continue their journey. Also, we have observed that passengers who alight from buses typically cross the road in front of the stationary buses, further increasing its delay in moving off. As such, estimating the number of boarding passengers based on stoppage time results in an over-estimate in the derived λ j . The derived passenger arrival rates were analyzed against heuristic understanding of NTU-SBS. Temporally, the decrease in passenger arrival rate after 2020h is consistent with empirical observations as the majority of students and staff do not reside on campus. Spatially, bus stop 4 seems to consistently have higher arrival rates than other bus stops. This bus stop is located at the North Spine Plaza, where there is an aggregation of a variety of amenities. Notably, this bus stop is also located in the vicinity of the biggest library in NTU, and there is usually a significant number of library users up till its closing time at 2200h. This could lead to a relatively high arrival rate, even after 2020h. The highest recorded λ j occurs at bus stop 12 in the morning. This bus stop serves one of the largest residential halls on campus, which could explain the large number of passenger arrival rates in the morning.
From 13 weeks of segmented data, we select 256 sets of trajectories which start and end when either one of the buses in service ends its shift or a new bus joins the service. These trajectories last from 30 to 84 minutes, involving a fixed number of buses which ranges from 2 to 7. Note that none of the 256 sets of trajectories cross any of the breakpoints. With these, we are able to examine the validity of the EMB model under the conditions when the number of buses are constant, and the mean passenger arrival rates can be described by a set of λ j . As an example, one of the trajectories happen on 11 Sept, 14:04h -15:16h with three buses of average velocities 19.5, 19.7 and 21.3 (in units of km/h). These average velocities V i are used to estimate the corresponding K i via K i = V i /V , where V is the average velocity and it is determined from the empirical distributions across the entire route. The λ j to be used to simulate the trajectory is based on the start-time of the scenario (in this example, 14:04h). This start-time is matched against Table I to determine the corresponding passenger arrival rate λ j to be employed.
We simulate the 256 trajectories using the EMB model. Each simulation begins with N buses at their respective initial positions, with no passengers on-board. There are also no passengers at any of the M bus stops at this point. Preliminary analysis done by varying the number of initial passengers found similar simulation outcomes even when we start with a reasonable number of passengers on-board or at the bus stop. We then examine the phenomenon of bus bunching by analyzing the relative positions among the N trajectories. Specifically, we adapt the idea of oscillators synchronisation to characterize the phenomenon of complete bus bunching by means of the degree of synchronisation [4,27]: where θ i (t) is the angular position of oscillator i. When applied to our bus system, θ i (t) is the angular position of bus i. This quantity r 2 has values from 0 to 1, inclusive. The maximum r 2 = 1 implies complete bunching, where all the buses are bunched as a single platoon. On the other hand, r 2 = 0 describes a state where all the buses are randomly spread out across the whole bus route. Notably, r 2 = 0 does not necessarily describe an absence of bus bunching. Consider for example a case in which four buses bunch up in pairs. If these two pairs occupy antipodal positions, we have r 2 = 0. In that regard, r 2 = 1 serves as a condition for complete bus bunching, whilst r 2 = 0 can imply no bunching at all; or subsets of bunched buses where these subsets are spread out. Nonetheless, qualitative comparison between simulated passengers' total travelling time and the r 2 of the system (presented in Sect. IV) shows that r 2 is indicative of the performance of a bus system. In Fig. 3, we plot the simulated trajectories from a single realization of the EMB model for comparison against the corresponding empirical trajectories of 11 Sept, 14:04h -15:16h. As expected of a Monte Carlo model, the simulated trajectories do not match the empirical trajectories exactly. However, the simulated tra-jectories do capture a similar trend of complete bunching as the empirical trajectories, which is clearly illustrated by the simulated r 2 (r 2 emp ) and empirical r 2 (r 2 emp ) respectively.
In our comparison between the model and empirical data, we selectively aggregate the empirical r 2 (t) based on their initial r 2 , i.e. r 2 emp (0). In other words, we sort the 256 trajectories into 10 different bins based on their initial r 2 , i.e. r 2 emp (0) ∈ 0.0, 0.1 , . . . , r 2 emp (0) ∈ 0.9, 1.0 . Doing so reduces the statistical fluctuation in the empirical trajectories and highlights the general trend on how different states of bunching evolve over time from their initial values. The aggregated empirical trajectories (denoted as r 2 emp ) (Fig. 4, in blue) shows a general trend of r 2 increasing over time, especially for low initial r 2 . This demonstrates the propensity of the buses to bunch over time. This trend breaks down, however, for the cases of large initial r 2 . On average, r 2 emp trajectories which start with buses in an almost completely bunched state tend to unbunch over time.  (Table II) [18,28]. With exception of the r 2 emp (0) ∈ 0.8, 0.9 bin, the distance of all aggregated trajectories was found to be within the empirical horizon. This reflects a close correspondence between the dynamical behaviour found in the empirical system and that captured by the EMB model. In this section, the EMB model is used as a test bed for three classes of intervention strategies, with the aim of gaining insights into their effectiveness on alleviating bus bunching. Simulations are initialized with the buses at the equal-headway state and all simulations are only measured at steady state, where the statistics of the system remains stationary in the long time limit. The measurements include the r 2 (t) of the buses, the average waiting time and the average travelling time (defined as the sum of the waiting time and the total time spent in the bus) of the passengers. These measurements are compared against a control, which is a similar system that does not involve any interventions. To minimize the effects of statistical variation, the same set of passenger arrival rates are used across all the simulations.
Depending on the passenger arrival rates λ j and the velocities of the buses, we consider four different scenarios: 1. Lull-same -Low passenger arrival rate and buses have the same intrinsic velocity.
2. Lull-different -Low passenger arrival rate and buses have different intrinsic velocities.
3. Busy-same -High passenger arrival rate and buses have the same intrinsic velocity.

Busy-different -High passenger arrival rate and buses have different intrinsic velocities.
From the 256 empirical dataset, we identify 3 scenarios which match the descriptions above: • Lull-same -01 Oct 2018, from 21:41h to 22:33h; 2 buses with the same velocity of 15.6 km/h; average arrival rates across the bus stops is 0.011.
• Note that as there is no empirical scenario of a busy phase with buses travelling with the same intrinsic velocity, the busy-same scenario is a hypothetical variant of the busydifferent scenario which uses all the parameters of busydifferent scenario, but with all the buses having the same intrinsic velocity. A. Strategy formulation and its effects on the same-velocity scenarios

Intervention: Stop-based Holding
A simple holding strategy involves holding buses at a specified control point for a set amount of time, based on its current time-headway [22]. As a trailing bus tends towards bunching with its leading bus, this strategy holds the trailing bus such that it extends its headway from the leading bus. This produces a negative feedback which opposes the positive feedback mechanism of bus bunching. In a naive headway holding strategy, the holding time of bus i at time t is given by: where h t is the current time-headway of a bus, and h * t is a predetermined scheduled headway. For the purposes of minimizing bus bunching in our closed-loop system, the equal time-headway state is chosen as the scheduled headway in our study. This is given by h * t = H t /N , where H t is the average period of a bus around the loop, and N is the number of buses. We note that scheduling buses for holding strategy is an ongoing research area [28,29] and using a fixed time interval of H t /N is a simplification.
In most holding strategies, control points are set at some -if not all -of the bus stops [2,19,22,30]. In these strategies, h t is calculated by the time difference between the current time (t curr ) of the bus and the latest departure time at bus stop j, i.e. max(t dep,j ), before t curr . Mathematically, it is h t = t curr − max(t dep,j ). By applying this measurement in the naive holding strategy, buses only leave the bus stops when its headway is equal or greater than H t /N , i.e. h t ≥ h * t = H t /N . As this manner of measuring headway is based on when a bus leaves the bus stops, Ref. [18] refers to it as stop-based headway. In that regard, we refer to this holding strategy as stop-based holding strategy or s-holding.
Reference [13] modified the naive headway approach by multiplying the holding time by a constant α to compensate for unstable headway dynamics found in the real world. With that, holding time of bus i at time t would be:  Table II where α is a dimensionless constant. While [13] employs this parameter as part of a real-time control strategy, of which α is a dynamical variable which changes based on the current demand, we will limit our scope of study to static values of α. A holding strategy with α > 1 over-reacts to its current headway e.g. if α = 2, a bus which is 1 minute behind its scheduled headway will hold for 2 minutes. Conversely, α < 1 models an under-reaction to its headway. At the α = 0 limit, the bus has no reaction to its headway and does not hold for any amount of time. In this case, the system behaves as the control, where there is no intervention in place. On the individual level, holding strategies increase the travelling time of passengers on-board the held bus. However on a system level, the negative feedback would keep the bus system close to the equal-headway state, which would give rise to a lower average travelling time.
Results of lull-same scenario in Fig. 5a shows that the s-holding strategy is able to reduce the average waiting and travelling time of the passengers. In the absence of interventions (α = 0), the r 2 shows the buses sustaining a completely bunched state, resulting in a long waiting time of around 12 minutes per passenger. As α increases, the state of complete bunching decreases with both the waiting and the travelling time, with waiting time decreasing up to 50% and travelling time decreasing up to 20%. The difference in savings between waiting time and travelling time can be attributed to the mechanism of the holding strategy. When a bus is being held, the time passengers spend on-board the bus increases. However, this mechanism does not directly increase the waiting time of the passengers at any of the bus stops. In fact, it only decreases them, on average, as the buses are more evenly spread out.
As the s-holding strategy increases the travelling time of some of the passengers, one might expect that excessive holding -with large α -might result in a decrease in the system efficiency. Interestingly, s-holding performs well over a large range of α. By investigating the dynamics of the system at large α, we found that excessive holding time also results in a large extension of the bus headway. As s-holding strategy only triggers on buses with small headway, buses which are held with large α gain a large headway allowance, which leads to more infrequent holding. This trade-off between longer but more infrequent holding results in the system maintaining the average travelling time over a range of α. This relationship however, fails at extreme values of α, where one bus is held for a time longer than the effective period of the other bus. As this value represents a completely impractical strategy, it is not considered in our study. While larger values of α do produce an improvement in the average travelling time, traffic planners do need to consider the practicality of holding a bus for an extended period, especially when it affects the sentiments of the passengers. With that consideration, we recommend the smallest α in which buses do not bunch.
On the other hand, Fig. 5b plots the results of the s-holding strategy in the busy phase. From the control cases of α = 0, buses with the same velocity exhibit complete bunching at equilibrium, similar to the lull phase. However for α > 0, the waiting time of the passengers increases linearly with α, representing a complete failure of the s-holding strategy. Furthermore, the comparison between the uncertainty of the r 2 between α = 0 and α > 0 highlights a marginal increase in the degree of bunching for α > 0. This suggests that instead of alleviating bus bunching, the s-holding strategy in the busy phase could induce buses to bunch even more. This phenomenon will be further discussed in Sect. IV B, when we look at the effects of s-holding in the busy-different scenario.

Intervention: Continuous-time Holding
In the previous stop-based holding strategy, the timeheadway was calculated by time difference utilizing the departure times from the bus stops. Arguably, it is not the only way to define the time-headway of the vehicle. In fact, the concept of time-headway (or any vehicular headway in general) is not well-defined in the literature. In this regard, Ref. [18] proposed another definition of time-headway as the time taken for a vehicle to reach the current position of the leading vehicle. This was referred to as continuous-time time-headway.
The intricacy of using continuous-time time-headway is that it requires prediction to be made of a vehicles trajectory. As such, it has not been used as a viable measurement for most studies on holding strategy. Ref. [18] tested four different headway prediction methods for its continuous-time holding strategy, finding similar performance result between linear regression and extrapolation, kernel regression and extrapolation, artificial neural networks and autoregressive models. In this study, we propose another headway prediction method for continuoustime holding strategy, which is based on the input parameters (i.e. the empirical distributions) of the EMB model. Excluding the bus stops, the average time taken for a bus to traverse a site can be derived from the inverse of the average velocity at the site. This can be easily calculated from the empirical distributions. At the bus stops, the time spent would be equal to the stoppage time due to passenger alighting and boarding. As the number of passengers at a bus stop is dependent on its headway, by assuming all buses are in an equal time-headway state, i.e. h t = H t /N , the stoppage time at bus stop j is given by: where we have used H t = T 0 (1 + M j=1 λ j /l) and T 0 = L/v. Note that T 0 is the intrinsic period of the bus, L is the length of the route, andv is the average intrinsic velocity of the bus deduced from the empirical velocity distributions. Also, recall that M = 12 bus stops for NTU-SBS.
With that, Fig. 6 plots the time taken to traverse each site for different number of buses in the system. Based on the current position of both the departing bus and its leading bus, Fig. 6 can be used to determine the time taken for the departing bus to reach the current position of the leading bus. As with s-holding, the resultant headway is used to compute the holding time of the bus based on Eq. (4). As this strategy uses the continuous-time time-headway, it will be referred to as the continuoustime holding strategy, or c-holding for short. Figure 7 plots the effects of the c-holding strategy on the lull-same and busy-same scenarios. In both scenarios, we see that c-holding performs better than the s-holding strategy. Notably, the biggest improvement over the sholding strategy is that the c-holding strategy is able to keep the buses unbunched in the busy phase. The major difference between the two holding strategies is that the headway measurement in c-holding factors in the specific dynamics of the bus system (passenger arrival and bus velocity). This suggests that an effective holding strategy has to account for the specific dynamics of the buses, which is absent in the s-holding strategy. This hypothesis will be further analyzed after comparing the effects of both holding strategies in the busy-different scenario in Sect. IV B

Intervention: No-boarding
Unlike holding strategy, which effectively slows down a bus when it draws too near to the leading bus, a noboarding strategy serves to speed up buses which have their trailing buses too near [31], i.e, the backward headway h † d (or the trailing bus's h d ) is small. As the act of boarding passengers requires buses to stop and dwell at the bus stops, they can be 'sped up' by not boarding the passengers at the bus stops. While these buses will still stop for passengers alighting, this strategy reduces the additional dwell time due to passenger boarding, thereby allowing the bus to pull away from the trailing bus. In the ensuing analysis, the concept of 'speeding up' the buses by the no-boarding strategy and 'slowing down' buses by the holding strategy will be important.
In a recent work, Ref. [31] studied the effects of noboarding with extensive simulations, and provided an-  is greater than the threshold, the waiting and travelling times of the passengers increase drastically. Consider the no-boarding strategy in a system of 2 buses. If h † d,c ≥ L/2 with L being the distance of the whole route, boarding will be permanently disabled on at least one of the buses regardless of their configuration. In fact, if the buses are in an equal-headway state, both buses will not be boarding passengers. In general, the upper-bound of the threshold was found to be strictly less than L/N . Based on this, we limit the parameter h † d,c ∈ [0, L/N ] in our simulation studies.
As the distance-headway does not take into account the time taken to traverse different segments of the roads (as described by Fig. 1), using distance-headway might introduce significant inaccuracy to the no-boarding strategy. A more accurate approach could be to use the timeheadway of the buses. Similar to the distance-based approach, the time-based no-boarding strategy will disable boarding on buses that have a backward time-headway h † t smaller than h † t,c . The upper-bound is set at h † t,c = H t /N where H t is the period of an average bus. With that, Figs. 8 and 9 plot the simulation results of both the distancebased and time-based no-boarding strategies respectively.
By comparing Figs. 8 and 9, no significant difference was found between the distance-based and time-based no-boarding strategy. This refutes our hypothesis that there will be significant inaccuracy brought forth by using distance-headway. In our simulation methodology, the parameter h † d,c was studied by varying in step sizes of 0.05L/N . Based on the relationship between distance and time-headway from Fig. 6, the inaccuracy generated by the road heterogeneity accounts for 0.015L/N , which is less than the simulation step-size. This represents insignificant inaccuracy as compared to the parameter stepsize, which leads to the resulting similarity between the two strategies. Based on this, further study of the noboarding strategy will focus only on the distance-based no-boarding. Fig. 8a plots the effects of the no-boarding strategy in the lull-same scenario. Fundamentally, no-boarding is able to keep the buses in a staggered configuration (described by r 2 ≃ 0 in Fig. 8a). Nonetheless, if the threshold headway is too large (such as h † d,c = L/N ), no-boarding becomes too frequent and the consequence is a larger waiting time, even as the buses are kept staggered. The minimum waiting time occurs with a threshold headway of h † d,c = 0.82L/N , with a 37% savings in average waiting time and a 20% savings in average travelling time.
On the other hand, the busy-same phase in Fig 8b shows similar dynamics to that of the lull phase, such that the no-boarding strategy is able to improve the waiting and travelling time of the passengers. One interesting feature of the results is that there is an improvement (over the control) in waiting and travelling times of the passengers for h † d,c = L/N . This suggest that the upperbound of the threshold is larger than the theoretically predicted L/N .
The ensuing analysis of the simulated trajectories reveals that there is consistently a certain degree of localized bunching when the theoretical upper-bound is exceeded. As an example, we have Buses 1 and 2 of the 7 buses being locally bunched at angular position 0 • , Buses 3 to 7 taking 60 • , 120 • , 180 • , 240 • , 300 • respectively. As there are 7 buses, the threshold headway (in degrees) is 360 • /7 = 51.4 • . Since Buses 3 -7 have backward headway larger than the threshold, they are able to board passengers most of the time, and maintain their headway by executing no-boarding when their trailing buses pulls too close. Due to the presence of stochasticity, Buses 1 and 2 will likely not be in the exact same position. In the case where Bus 1 precedes Bus 2 by a small distance, Bus 1 will have a backward headway of approximately zero and Bus 2 will have a backward headway of 60 • (with reference to Bus 3). As such, while Bus 1 will not be able to pick up any passengers, Bus 2 will. This causes Bus 1 and Bus 2 to behave as a single bus, thereby resulting in a system where there are 6 active buses, maintaining a mutual headway (in distance) of L/6 which is modulated by a no-boarding strategy of threshold headway L/7. This results in a more efficient system when compared to the control (no intervention strategies) where all 7 buses will eventually bunch into a single platoon.
This finding is not a contradiction to the theoretical work presented in Ref. [31], which is based on buses having the same constant velocity. In the case where two buses of the same constant velocity are bunched, both buses will have (and constantly maintain) a backward headway of exactly zero. This causes both buses to be in a perpetual state of no-boarding. Consequently, these two buses will catch up with a preceding bus, where the minimized headway causes that bus to also execute noboarding. The cycle repeats with all other buses until none of the buses are boarding. The existence of stochasticity in our simulation study allows the system to continually exist in a localized bunched state, which allows high-threshold no-boarding strategy to perform better than otherwise predicted. This exemplifies a system where the specificity plays an important role in determining the system dynamics.

Intervention: Centralized-pulsing
In the theory of synchronization, self-oscillators can be synchronized by a periodic pulse [32]. As an example, clocks these days need not be expensively accurate. In-stead, a high-accuracy centralized clock can periodically send out electromagnetic waves to less expensive radiocontrolled clocks to periodically entrain them, such that the latter's accuracy is maintained by being synchronized with the former. This example suggests the idea of "entraining" the buses by a periodic force to prevent them from bunching together [4].
The new strategy is to subject the buses to a system of periodic driving forces that keep them in a staggered configuration. While the two previous strategies deal with the buses' interaction with passengers, this intervention involves the intrinsic behaviors of the buses. The strategy involves actuating each bus periodically (every ∆t p time steps) to either speed up or to slow down, based on their current headway. Specifically, each bus with a backward time-headway greater than the equal time-headway state of h * t = H t /N will be forced to slow down, and buses with h t < H t /N will be sped up. As with the previous two strategies, this effect acts as a negative feedback for the buses to maintain an equal time-headway state. In our model, speeding up is carried out by restricting the sampling of velocity from the upper median of the corresponding empirical velocity distribution of the sites the buses are at. Conversely, slowing a bus down involves sampling only velocities below the median of the distribution.
By varying the parameter ∆t p at which this mechanism is applied, we are able to vary the effect of the intervention strategy. To be consistent with the other strategies, ∆t p → ∞ represents a control, in which the buses are never nudged to move faster or slower. The maximum ∆t p = 100 represents actuating the buses to move above/below its average speed only 1% of the time. Conversely, ∆t p = 1 models buses being sped up/slowed down all the time. The practicality of this will be dis-cussed below.
The results presented in Fig. 10 agrees with our intuition that with a higher actuating frequency, the buses stay unbunched. Correspondingly, the interventions with large ∆t p have no effect on either the r 2 or the waiting/travelling times. As there are similar dynamics between the same-velocity and the different-velocity scenarios, the detailed analysis of the centralized-pulsing strategy will be presented with the results in the next subsection.

B. Effects of intervention strategies on different-velocities scenarios
Reference [4] discussed that in the lull-different scenario, the buses exhibit periodic bunching. This is the case where the fast bus periodically overtakes the slower bus and, as such, the buses are unable to maintain sustained bunching. A preliminary analysis found that a bus system exhibiting periodic bunching is a relatively efficient system. This can be seen in the simulation results of this section when α = 0 (e.g. in Fig. 11a) where the average waiting time of passengers is 7.4 minutes compared to 12 minutes when the bus velocities are the same (in Fig. 5a).
On the other hand, unlike the lull-different scenario, the high passenger arrival rate in the busy-different scenario does result in some degree of sustained bunching. However, due to the velocity differences, the dynamics are neither completely sustained bunching nor periodic bunching. The system takes on a complex dynamical state of buses bunching locally in smaller platoons, with the size and number of platoons fluctuating. An im-portant feature is that these "bus platoons" are scattered randomly along the bus route. The lower value of r 2 (α = 0 of Fig. 11b) is a result of platoons of bunched buses spanning the bus route and does not imply a complete absence of bus bunching. This is an example where r 2 = 0 cannot be interpreted as an equal-headway state when N > 2. Nonetheless, this complex dynamical state serves as a more efficient state than the completely bunched state, with waiting time averaging at 4.9 minutes (see Fig. 11b) instead of 10 minutes (see Fig. 5b). phase), the strategy backfires with the control performing better than any α > 0. At steady state, only the faster moving bus will have a small headway as it closes towards a slower bus. As such, only the faster buses will be held by the holding strategy. At equilibrium, the holding strategy effectively slows the faster bus by a fixed amount every round, with the amount depending on α.
With small α, the holding strategy slows down the faster buses by a little each round, but not enough to prevent periodic bunching. As there is no effect on the state of bunching, the sole effect of the holding strategy is in increasing the passengers waiting and travelling time. The least efficient system exists when α is large enough such that the effective velocity of the faster buses match that of the slower buses. With the buses having the same effective velocity, the bunching mechanism takes over and causes the buses to approach a bunched state (r 2 ≈ 0.8).
This results in the point where waiting and travelling time is at the maximum. The average waiting time of 11 minutes is comparable to the completely bunched state found in α = 0 of Fig. 5a.
By studying both the r 2 and the steady state dynamics, the holding strategy is able to completely nullify the mechanism of periodic bunching with large α, resulting in the buses maintaining a staggered configuration. The waiting time in this state is comparable to that of the control, suggesting that the periodic-bunching state and the staggered configuration are both equally efficient in keeping the waiting time of the passengers low. However, the travelling time at large α is marginally worse than the control. This can be attributed to the increase in travelling time of the passengers on-board the held buses. Similar findings were reported in Ref. [33] where there was an improvement in waiting time due to holding the faster bus to match the slower bus, albeit at the expense of an increase in average total travelling time.
In the busy phase, as with the results shown in Fig.  5b, the results of the busy-different scenario also show a linear increase in waiting time, which represents the complete failure of the system. With the no-intervention dynamics of the busy-different scenario averaging an r 2 = 0.37, it is apparent that the s-holding strategy with the increased passenger arrival results in a greater degree of bus bunching. While the mechanism of the onset holding-induced bunching is unclear, the monotonic relationship between α and waiting time can be explained by the steady state dynamics of the bunched buses.
The bunching dynamics starts with all 7 buses completely bunched at a single bus stop (also described by the r 2 ≈ 1). When there are no more boarding passengers, the leading bus (which is not held) leaves the bus stop while the other buses are held by the holding strategy.
The leading bus travels towards and stops at the second bus stop to pick up passengers. As all the 6 remaining buses are all bunched into a single platoon which trails the leading bus, the time between this leading buss arrival and the previous buss departure is very large. This results in a large number of boarding passengers at this second bus stop. As such, the leading bus takes a long time to board the passengers. Throughout our simulation studies, the holding time of the other buses is lower than the time taken for the leading bus to board all the passengers. Therefore, the other 6 buses would have finished holding at the previous bus stop and would then arrive at this second bus stop before the leading bus finished boarding all passengers. With 7 buses boarding concurrently, all the passengers would have boarded a bus and the leading bus would then depart for the next stop. This dynamics of one bus moving to the next stop and the remaining buses following after repeats at every single bus stop. As the speed of the platoon is dependent on the holding time, which is dependent on α, the waiting and travelling times of the passengers increase monotonically with α. The same set of dynamics occurs in both the busy-same and busy-different scenarios.
While the rules of any holding strategy would provide a negative feedback to the bus bunching behavior, the s-holding strategy in the busy phase results in a peculiar holding-induced bunching. Between the governing equation of the s-holding strategy and the simplistic scheduled headway, we conjecture that the holding-induced bunching can be attributed to its over-simplistic nature which does not take into account important attributes of the bus system. In the next part, we present a more dynamical holding strategy which takes passenger arrival rate and bus velocity into consideration. We will demonstrate that not only it achieves a more efficient bus system but also elaborate on how these attributes would interact with a holding strategy such that a failure to account for them could result in the holding-induced bunching.

Intervention: C-holding
In the lull-different case, the dynamics at low values of α < 1 is the same as s-holding, in which the c-holding strategy does not keep the buses in the staggered configuration. Unlike the s-holding strategy, Fig. 12a shows a marginal improvement in the efficiency at the range of α > 2, with a 14% reduction in waiting time, which is consistent with similar works found in Ref. [14,22,33].
For the busy-different scenario with the c-holding strategy, the r 2 in Fig. 12b suggests that the buses are kept at a lesser degree of complete bunching. However, this slight improvement in the state of bunching does not result in an improvement in the system, with average travelling time increasing slightly as α increases. While this implies an improvement over the s-holding strategy, the drop in efficiency suggests that the c-holding strategy does not quite achieve its function when the buses have different intrinsic velocities in a busy phase.
The improvement of the c-holding strategy over sholding can be attributed to the over-simplistic approach of the former. As the stop-based time-headway is measured as the difference of two successive buses departure time, it only ensures that the buses leave each bus stop at or after the scheduled headway. However, it does not consider how headway changes after the holding time. There are two mechanisms which can cause the headway of the buses to change: one involving the bus velocity and the other involving passenger arrival.
As discussed, when there is a difference in bus velocities, the strategy only holds the bus with the high velocity. Being the faster bus, it will always catch up with its leading bus after holding. The rate at which it catches up depends on the velocity difference between the two buses. To keep the buses exactly at the equal-headway state, a large velocity difference requires a longer hold-ing time than a small velocity difference. By considering the velocities of the buses, the c-holding strategy is able to keep the buses very close to the equal-headway state, suggested by the r 2 at large α in Fig. 12a. In contrast, the simplistic s-holding strategy in Fig. 11a cannot.
At the same time, a holding strategy possesses an inherent self-accentuating effect which is dependent on the passenger arrival rate. Holding a bus extends its distance from its leading bus, effectively increasing its timeheadway. An increased time-headway results in additional passenger arrival at the next stop, which in turn leads to a longer stoppage time. As with the velocity considerations, to keep the buses exactly at the equalheadway state requires a shorter holding time when the passenger arrival rate is high and a longer holding time when the arrival rate is low. As the continuous-time measurement of the c-holding strategy takes into account the velocities of the buses and the different arrival rates at different bus stops, it is able to consistently adjust towards an optimal holding time for the buses to converge to the equal-headway state. Conversely, as s-holding does not account for the effects of bus velocity and arrival rates, it is ineffective in maintaining the buses at an equal-headway state, which ultimately results in the buses bunching together.

Intervention: No-boarding
As with the holding strategies, the no-boarding strategy backfires in the lull-different scenario. However, there is a fundamental difference between the holding and noboarding strategies. In the holding strategy, the headway of the buses are modulated by 'slowing-down' the buses through holding. The parameter α determines how long each bus is held, which effectively scales the amount a bus is slowed down by the strategy. In contrast, the parameter h † d,c only affects the point at which no-boarding is invoked, but does not affect how much a bus is 'sped-up'. In the no-boarding strategy, the amount buses are spedup is dependent on the passenger arrival rate, i.e, the higher the number of passengers not picked up, the faster a bus will be relative to the trailing bus. Therefore, across the parameter range, the no-boarding strategy behaves like a holding strategy of low α such that the strategy increases the period of the periodic bunching, but does not modulate the buses enough to achieve a staggered configuration (see Fig. 13a). This is seen from the equilibrium dynamics of the slower bus being periodically overtaken by the faster bus, with no-boarding triggering when the latter is within its threshold headway. As the only change in the dynamics is the frequency of no-boarding taken by the slower bus, the waiting and travelling times of the passengers increase monotonically with h † d,i . The results of no-boarding in the busy-different scenario in Fig. 13b show marginal improvement in the waiting time of the passengers. This is consistent with the theory presented in [31] that a higher number of passen- ger arrival would result in greater degree of speed modulation required to keep the buses unbunched. While our simulation study only captures a marginal improvement in waiting time, Ref. [31] shows significant savings in waiting time when the passenger arrival rates are even higher (average k = 0.063, where k = λ l ). This result is consistent with our analysis and supports that no-boarding is a viable strategy for a very busy bus system, even with buses travelling at different velocities.
Intervention: Centralized-pulsing Comparing Figs. 10 and 14, it can be seen that the effects of centralized-pulsing across all four scenarios are very similar. A potentially important finding is the existence of a sharp transition between a completely unbunched and the completely bunched/periodically bunched state, which takes place consistently at ∆t p ≈ 30.
A further analysis was done by studying the system for ∆t p = 30 + . As discussed, our methodology involves initializing the system in the equal-headway state and measuring the system only in steady state. By investigating the transient state in the lull-same scenario, it is found that centralized-pulsing with ∆t p = 30 + maintains the initial equal-headway state for a long time before a sharp transition to the bunched state. This bunching persists in long time limit. This set of dynamics resembles physical systems which exhibit 1st order phase transition, where the system exhibits metastability within a particular range of parameter values [34][35][36][37][38]. At this range, the system could exist in the metastable phase for a long time. However, any perturbations would result in the system undergoing a phase transition into the stable phase. In this case, the equal-headway state is analogous to the metastable state and the bunched state is the stable state.
While centralized-pulsing appears to be a very promising intervention strategy, the practicality of implementation has to be considered carefully. As ∆t p is analogous to the 'inter-actuation time', ∆t p = 1 models buses consistently moving strictly above or below its average speed based on its current headway. For a bus with slow intrin- sic velocity, it has to move above its average velocity at all times. Based on the road or traffic conditions, this may not always be possible. In fact, the issue with actuating frequency relates back to the theory of the entrainment of self-oscillators. Discussed in [32], the onset of synchronisation by a periodic driving force is dependent on both the period and the magnitude of the driving force.
In this simulation study, the frequency is set by ∆t p and the 'magnitude of the force is modelled by the velocity selection rule. However, these values might not correspond to the real world. In fact, a reasonable nudging frequency and 'magnitude of the force' would not only be driverdependent, but also spatially dependent, i.e. there will be segments of the road that the driver can choose to drive a little faster or slower. To gather sufficient information to refine our implementation into a realistic and practical centralized-pulsing strategy, an experimental study measuring the bus velocities when prompting drivers to speed up and slow down should be carried out.

C. Comparison of the various intervention strategies
Having modelled the four different intervention strategies, Table III tabulates the effectiveness of each strategy by the fractional savings in the waiting time and travelling time over a control. Similar works in literature have found savings in waiting times ranging from 5% to 80% [8-11, 16, 20, 31, 33], which are in agreement with our findings. Among the three classes of strategies, a recurring finding is that centralized-pulsing stands out as the best-performing strategy. With this happening over all scenarios, the comparison between the holding and noboarding strategies will first be presented.
In the lull phase, both strategies generally only work when the buses have similar intrinsic velocities. While both holding strategies outperform the no-boarding strategy in the lull-same scenario, this marginal outperformance (compared against the uncertainty of the measurement) does not represent any significant benefit in choosing holding strategies over no-boarding for this scenario. In the lull-different scenario, none of the   strategies (except centralized-pulsing) provide significant improvement in the system, with c-holding strategy performing marginally better than the control case. By comparing the control case of the lull-same and lull-different scenarios, having buses with different velocities produce an average waiting and travelling time comparable to the no-boarding strategy. This implies that, unlike sustained bunching, periodic bunching is relatively efficient for a bus system. Conveniently, forcing buses to travel with different velocities during lull phase could be another intervention strategy to be undertaken by the bus operators. Alternatively, as smaller buses have a tendency to move faster than larger buses, bus operators could also employ buses of different sizes to increase the tendency of periodic bunching. In the busy phase, the holding strategies backfire in most cases. While c-holding could reduce bunching in the busy-same scenario, it requires a relatively long holding time (implied by the large α), which would affect its practicality. In contrast, the effectiveness of the noboarding strategy increases with the number of passengers; in the busy-same scenario, waiting time is reduced by 67%. While we find the no-boarding strategy to be less effective where there is a large variation in buses' velocity, it was discussed in Ref. [31] that improvement in efficiency in a busy-different scenario occurs when the passenger arrival rate is much higher. This shows that the no-boarding strategy is potentially an effective strategy in the busy phases.

V. CONCLUSION
In this paper, we propose the Empirically-based, Monte Carlo Bus-network (EMB) model, which is an empirical agent-based model designed as a test bed for potential intervention strategies to reduce bus bunching. Through that, we have studied three classes of intervention strategies. It was found that traditional intervention strategies (holding and no-boarding) only perform well under specific scenarios. Nonetheless, the no-boarding strategy seems to be viable when the passenger arrival rate is high. Of the five strategies that was studied, the centralized-pulsing strategy seems to be the most promising strategy as it was found to be effective across all the different scenarios.
As our theoretical studies have found potential solution strategies which would improve their respective systems, future work should strive to bring the solution from a simulation environment to reality. While applications of the EMB model have been presented exclusively on the NTU-SBS, the formulation is generic and can be applied to any other bus system. There are two benefits of applying the modelling framework to other empirical systems. As we have shown, modelling an empirical system using the EMB model allows the users to study and gain insights about the system, facilitating the development of intervention strategies. More importantly, applying the modelling framework to other empirical bus systems can facilitate the continual development of the EMB model. As an example, the interaction between buses and the other cars on the road have not been considered in the current formulation, due to the apparent infrequency of vehicular traffic in NTU's campus. However, if the EMB model is used to model a busy bus route, e.g. one which runs through the city center, the interactions between buses and vehicles become highly significant in determining the dynamics of the bus system. With added features, the EMB model may yield new insights into the modelling framework and even the phenomenon of bus bunching in general.  Table. IV illustrates the tabulated raw data collected from https://baseride.com/maps/public/ntu/. This includes a date/time stamp, vehicle registration number, and the longitude/latitude of each bus as it moves along a designated route. While the data only updates every 9-12s, the script was set to record the data in a 3s interval. This redundancy results in about 60% of repeated data which was duly processed and removed appropriately as indicated below. The 2-D latitude and longitude of bus i at time t is defined asx i,t . As the buses only move along a fixed periodic route, the positions will be mapped onto a discretized 1-D array, where x i,t is an integer which describes the position of the bus along the route. The following paragraphs will describe the methodology to transformx i,t → x i,t , and the subsequent filtering applied to the data.
The lower limit of spatial resolution was first determined by calculating the minimum δx i,t+1 , where δx i,t+1 = ||x i,t+1 −x i,t || forx i,t+1 =x i,t , from all the individual trajectories throughout the sampling period. The spatial resolution of the GPS data was determined by this minimum δx i,t+1 , which is 7.9m in length. Since a spatial resolution of 7.5m is typically used in discretespace traffic models [39][40][41] to reflect the length of an average vehicle and the interaction distance between vehicles, we employ the spatial resolution of 7.5m (instead of the empirically derived 7.9m) to allow for future modularity of the simulation, specifically when the simulation is expanded to include the effects of inter-vehicle interactions.
All the spatial coordinates collected in the sample period are plotted in a scatter, as illustrated in Fig. 15. A smooth line fitted along the scattered coordinates traces out the bus route. This line is then segmented at regular intervals equivalent to the designated spatial resolution Preliminary studies on the collected data found inaccuracy in the tick-by-tick update of the bus positions, even when taken at 12s intervals. Fig. 16 plots the occurrence of stoppages, i.e, δx = 0, where δx i,t = ||x i,t+1 − x i,t ||, along the bus route. While the probability of stoppage peaking at road junctions and bus stops are as expected, the median probability of stoppages is 0.2. This value corresponds to buses being stationary 20% of the time, even when traversing a seemingly uninterrupted road. Empirical observation during field work found that this does not reflect the average motion of buses. Further examination on the velocity distribution at single sites (as observed in Fig. 16) also shows the unrealistic propensity of zero-velocities occurring in the middle of a long stretch of road. A plausible explanation for this artefact is that the positioning data sent by the buses was interrupted for a significant period of time while the bus was in motion. As such, there is a possibility that the collected bus positions may not reflect the actual tick-by-tick positions of the buses. This ensures that buses move off when there are no more boarding or alighting of passengers.