Expression and Validation of Online Bus Headways considering Passenger Crowding

School of Automobile, Chang’an University, Middle Section of Nan’er Huan Rd., Xi’an 710064, Shaanxi, China
Department of Transportation Engineering, Tongji University, Caoan Rd. #4800, Shanghai 201804, China
School of Civil and Transportation Engineering, Ningbo University of Technology, Fenghua Rd. #201, Jiangbei District, Ningbo 315211, Zhejiang, China
School of Automobile Engineering, Shaanxi College of Communication Technology, Wenjing Rd. #19, Weiyang District, Xi’an 710019, Shaanxi, China


Introduction
Standee density refers to the number of standing passengers per unit of effective area. It is an important indicator of whether the selected bus matches the line and whether the headway is rational. The limit load of a city bus in Europe and the United States is 5-6 pax/m² [1,2], while the number of standees approved in China is 8 pax/m² [3]. According to surveys, the standee density in buses in Xi'an during peak hours reaches 9-10 pax/m² and often exceeds the threshold of 8 pax/m²; in actual operation, this reduces ride comfort and overloads buses.
The standee density at various positions within the carriage is actually uneven, and sparsely populated areas at the front and in the rear aisle might distort the true standee density characteristics and impair sensitivity to changes in the standee flow. Therefore, the number of standees cannot truly reflect the maximum passenger crowdedness. The position preference of passengers on each bus line can be determined. According to the passengers' preference for selecting a standing position, this study proposed a set of important areas from which to synthesize a standee density algorithm that sensitively expresses the most crowded areas in buses, so that the headway can be scheduled in real time [4].
Currently, to meet passengers' travel requirements, public transport enterprises in China fix the drivers' workload on a bus line for a period of time, notwithstanding the operational cost. Likewise, the total scheduling frequency is assigned based on the fixed workload on the line, but the headway varies; the fixed workload means that the number of departures of each bus is constant so that operational demands are met. Public transport enterprises in China are entirely state-owned. If they run into debt, local governments pay it by the end of the year, as long as the necessary services are guaranteed. Consequently, varying the headway within a day has only a slight impact on the operational cost. In the bus scheduling station, the dispatchers are unable to obtain information on the real-time passenger flow. As a consequence, many problems arise, such as subjective personal judgment in dispatching and headways that are not rational for the bus line [5,6], and overcrowding of passengers often occurs. The main problem, therefore, is the allocation of online headways in the trough, off-peak, and peak hours.
Similar studies considered the operational cost and passenger waiting time as a balance index for determining the offline bus headway. The main objective of the present study was to determine the online bus headway without the operational cost; correspondingly, the passenger waiting time would be the only remaining index. As the arrival time of each passenger at the bus stop is random and the passenger waiting time is hard to determine online, this study proposed introducing passenger crowding instead of the waiting time. For online scheduling, first, the online passenger flow data obtained by the collector installed at the front and rear doors of the bus were used. Second, the number of standees was obtained in real time, which represented passenger discomfort during the ride. The aforementioned problem therefore led to another problem: determining the online headway according to the standee density.
Because standee density is uneven across positions, the overall standee density in a city bus cannot truly reveal the most crowded area. In the present study, the number of standees on the bus floor was allocated to each specified area to evaluate the most crowded area. A method for determining the bus headway was established based on the areas of higher standee density on the bus. This model was a pragmatic approach to improving the efficiency of bus transportation and increasing the bus mode share on bus lines with large passenger groups in cities across China.

Literature Review
In recent years, models of scheduling frequency for public transport have attracted increasing interest. Many bus scheduling models have been established based on offline passenger flow data, resulting in positive effects on the public transport quality of service. In terms of multidimensional analysis of passenger crowdedness, Tirachini optimized the scheduling frequency of subway vehicles considering passenger demand and the supply and operation of public transport [7]. From the standpoint of cost, passenger travel and operational costs were integrated into the newsboy model by Herbon and Hadas [8], who reported simulation results for the scheduling frequency of subway vehicles. The standee density is a multipurpose indicator used in pricing strategy, seat capacity, and scheduling arrangement [9]. A route planning and scheduling model has also been proposed based on passenger density and travel distance [10]. In particular, Jara-Díaz proposed an extension of Jansson's model for a single period based on the effect of vehicle size on operational costs and that of crowdedness on the value of time [11].
The impact of standee density on bus design and travel cost has been evaluated from different perspectives. Tirachini developed a social welfare maximization model with externalities of crowdedness, exposing the interplay between congestion and crowdedness in the design of bus systems [12]. The concept of passenger crowdedness involved both sitting passengers and standees; it was a coordinating algorithm for the number of passengers in the carriage. In addition, the crowdedness cost was related to passenger crowdedness by estimating the willingness of passengers to choose a moderately relaxed trip at different standee densities [13]. However, with a larger scale passenger flow, the standee density was related to the serviceability of the subway. Therefore, a model for calculating the standee density was established, and conclusive recommendations for its standard were proposed [14].
Furthermore, studies discussed the formation mechanism of standee density and key influencing factors in terms of bus door position and passenger preference in choosing a standing area. In addition, a crowd behavior control model was established, simulation studies were conducted at various crowd densities, and the results were used in the decision support tool of crowd control systems [15]. A follow-up survey proposed that door crowdedness was affected by multiple bus design parameters, including door placement, aisle length, presence of a front seating area, and service type [16].
However, the number of standees during morning and evening peak hours is significantly greater than that of sitting passengers in China, and standees have little chance of getting a seat on the buses. Passengers can get on and off a subway through the same door, whereas on almost all buses in China they are allowed to get on only through the front door and off only through the rear door. Hence, the passing flow on board is difficult to determine, which is the root cause of unevenness in standee density [17,18]. A train mock-up was specially constructed to examine the impact of door width, seat type, platform edge doors, and horizontal gap on the time taken by passengers to board and alight [19].
Batarce explained that the public transport selection preference showed the application requirements of crowdedness cost, and a random discrete selection probability model was established [20]. Moreover, a baseline-category logit model for selecting standing areas was created considering the travel distance of passengers and the standee density in subways; it was also closely related to the door position [21].
In summary, many studies have reported the characteristics of standee density and offline headway. However, few studies have used the unevenness of standee density to define the most crowded area with the aim of establishing an online bus headway model. Most of the aforementioned studies proposed a calculation method for standee density, determined its threshold, and analyzed travel mode selection and cost-benefit issues based on passenger crowdedness [22]. Thus, these studies considered the operational cost and passenger waiting time to modify the offline bus headway. The present study proposed a model to overcome these challenges and determine the online headway in trough, off-peak, and peak hours according to the standee density.

Online Data Collection.
For better efficiency of getting off a bus, the usual operation mode is to pay the bus fare in cash or by a prepaid card without limit. Consequently, the data for the number of passengers getting off are lost. To overcome this problem, a passenger flow data collector was introduced. It automatically collected the number of passengers getting on and off the bus at every bus stop to determine the online scheduling arrangement. The data collector consisted of an analyzer and two binocular camera sensors (Figures 1(a) and 1(b)). It used the human head calibration algorithm. The cameras installed at the front and rear doors of the bus collected video images (Figures 1(c) and 1(d)). The number of passengers getting on and off the bus was processed and transmitted to the monitoring host through the CAN system and then to the information processing platform via 3G/4G wireless communication [23,24]. The accuracy of the passenger flow reported by the data collection system must be verified manually. After the system started to run, the data collector was checked in 12 surveys on the bus, although 28 bus stops existed on the surveyed line (Table 1). The accuracy of data collection decreases slightly when the passenger flow is dense, but it can still reach 98.8%. A bus line in Xi'an has 20 buses equipped with data collectors, and an increasing number of bus enterprises in China have adopted binocular camera sensors to monitor passenger traffic.

Manual Survey Data Collection.
A manual survey was adopted because the standee density is uneven across areas and the passenger flow data collectors cannot count the standees in each area of a bus carriage. This approach featured high precision but involved a high labor cost [25]. According to the stipulation in Xi'an, each bus is operated by 2 drivers for 6 round trips per day, and each bus line is equipped with at least 20 buses. When the bus reaches the highest speed between two bus stops, the standee density is relatively stable. Two investigators were assigned to each bus to count the standees in the designated areas [26]. The survey yielded 79 round-trip passenger flow data points for this study. The standees moving toward the rear door were distributed over essentially four areas (Table 2).
Since Areas 2 and 3 were close to the rear door, the changes in the standee density in both areas were more sensitive than those in Areas 1 and 4. Although the cost of a manual survey was high, the number of standees in each area of the bus could be captured flexibly [27]. Each area of the bus floor was measured during the manual survey process. The standing area within the wheelbase, Areas 2 and 3, was spacious.

Headways Based on the Standee Density.
The headway is closely related to the time of the first and last buses, the number of buses available, the scheduling task, and the trough, off-peak, and peak hours [28,29]. Generally, the public transport company's operation workload is fixed to ensure the operational needs of the bus line and the annual review of the vehicle production task [30]. For example, public transport companies in Xi'an stipulate that each bus runs six round trips per day, three per driver; this number of round trips per bus is recorded as $C$. If a bus line has $m$ buses and the ratio of the number of buses being repaired and rested per day to the total number of buses is $d$, then the scheduling frequency available to the dispatcher per day is $Cm(1 - d)$ [31].
Suppose that the time difference between the first and last buses of the line is $T_d$ (min) and that the parking time of the first and last buses is $T_c$. The first and last buses are dispatched outside the interval $T_d - T_c$, so the scheduling frequency available to the dispatcher within it is $Cm(1 - d) - 2$. Therefore, the headway of the bus line during off-peak hours $\eta_0$ is

$$\eta_0 = \frac{T_d - T_c}{Cm(1 - d) - 2} \tag{1}$$

The result obtained in equation (1) is actually the average headway of a day, which is the headway of off-peak hours. However, the actual online scheduling arrangement cannot be implemented with this value alone; it also needs to be adjusted based on the standee density, which means that $\eta_0$ will vary with the standee density. If the standee density is high during peak hours, $\eta_0$ should be reduced appropriately, shortening the headways. Conversely, $\eta_0$ should be increased appropriately to reduce the scheduling frequency in trough hours. The problem is that the standee density is uneven across areas, and a key indicator is needed to determine the online headway according to the density of the most crowded areas.
The areas on the bus floor designated for passenger seating and standing are limited. Standing in a spacious area is an instinctive response of passengers. The wheelbase area is spacious and is often more crowded than the other areas. The standing area is divided into four areas with different densities of standees [32]. Figure 2 shows the division of the standing area according to the manual survey.
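As a worked check of the average headway, the following minimal Python sketch assumes that equation (1) has the form $\eta_0 = (T_d - T_c)/(Cm(1 - d) - 2)$, which reproduces the figures quoted in the field validation (1080 min scheduling length, $C = 6$, $m = 20$). The function and argument names are illustrative and are not taken from the original study.

```python
def average_headway(span_min: float, trips_per_bus: int, buses: int,
                    out_of_service: float) -> float:
    """Average headway in minutes.

    Assumed form of equation (1): the scheduling time length divided by the
    number of departures left to the dispatcher, C*m*(1 - d) - 2.
    """
    departures = trips_per_bus * buses * (1.0 - out_of_service)
    if departures <= 2:
        raise ValueError("not enough departures to schedule a headway")
    return span_min / (departures - 2)


# Field-validation values: 1080 min, 6 round trips per bus, 20 buses.
eta_0 = average_headway(1080, 6, 20, 0.15)    # 85% of buses in operation
eta_min = average_headway(1080, 6, 20, 0.0)   # all buses in operation
eta_max = average_headway(1080, 6, 20, 0.30)  # 70% of buses in operation

print(round(eta_0, 2), round(eta_min, 2), round(eta_max, 2))
```

Under this assumption the three values come out near 10.8, 9.15, and 13.17 min, in line with the 10.8 min off-peak value and the 9 min and 14 min limits reported after rounding.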
Defining the interstop as the section of the bus line between every two stops, $i$ ($i = 1, 2, \ldots, n$) refers to the interstops, and $j$ ($j = 1, 2, 3, 4$) refers to the designated areas. The premise of the standee density algorithm is the number of standees present in the interstop, which is described as follows:

Journal of Advanced Transportation
where $Q_i$ is the number of standees; $Q_{ij}$ is the number of standees in area $j$; $\rho_{ij}$ is the standee density in area $j$; $Q_{ui}$ and $Q_{di}$ are the number of passengers getting on and getting off the bus, respectively; $Q_a$ is the seating capacity of the bus, including one driver's seat; $S$ is the total standing area supplied; and $S_j$ is the standing area supplied in area $j$, which comes from field measurement.
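The bookkeeping behind these quantities can be sketched as follows. This is an illustrative reading only, not the authors' formula: it assumes that the on-board load is the running sum of boardings minus alightings, that every passenger sits while a passenger seat is free, and that an area density is the standee count in the area divided by its measured floor area. The argument `seats` is taken here as the number of seats available to passengers; all names and the sample counts are hypothetical.

```python
from itertools import accumulate


def standees_per_interstop(boardings, alightings, seats):
    """Standees on each interstop from door counts at the stop that opens it.

    Assumption: passengers stand only when all passenger seats are taken.
    """
    if len(boardings) != len(alightings):
        raise ValueError("boardings and alightings must have equal length")
    net = [up - down for up, down in zip(boardings, alightings)]
    on_board = accumulate(net)  # running passenger load after each stop
    return [max(0, load - seats) for load in on_board]


def area_density(standees_in_area, area_m2):
    """Standee density of one area in pax per square metre."""
    if area_m2 <= 0:
        raise ValueError("area must be positive")
    return standees_in_area / area_m2


# Hypothetical door counts for four consecutive stops.
q = standees_per_interstop([30, 20, 10, 0], [0, 5, 15, 40], seats=37)
print(q)                     # standees on each interstop
print(area_density(9, 4.5))  # 9 standees on a 4.5 m2 area
```

How the standees are split among the four areas cannot be derived from door counts alone, which is why the study relies on the manual survey for $Q_{ij}$.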

A Suitable Standee Density for Scheduling.
The standing areas suitable as the key indicator for scheduling buses need to be defined. According to the manual survey, changes in $\rho_{ij}$ were asynchronous. Figure 3 reveals the propensity of passengers to choose each standing area with a gradual increase in $Q_i$. This result indicated that if less standing space was available, passengers would seek another area to stand. When $Q_i$ did not exceed 18, Area 2 diverted the passenger flow of Area 3. When it was more than 45, the flow tended to divert to Areas 3 and 4. Therefore, Areas 2 and 3 were critical positions, and both were in the wheelbase area. Consequently, $Q_i$ could not truly express the degree of crowdedness of the most crowded area of the carriage, as Areas 1 and 4 were disruptive factors, and factors unrelated to the standee density in the wheelbase area (SDWA) were excluded.
According to the manual survey, the surveyed discrete values of $\rho_{i2}$ and $\rho_{i3}$ were positively correlated with $Q_i$, but the growth trends varied in different $Q_i$ ranges. After nonlinear regression, the change rule of the surveyed discrete values of $\rho_{i2}$ and $\rho_{i3}$ can be described as follows:

The Pearson correlation coefficient was introduced to test the goodness of fit of $\rho_{i2}$ and $\rho_{i3}$ with the discrete value $\rho_{sj}$ of the corresponding area. The number of samples was $N$, and the correlation between $\rho_{ij}$ and $\rho_{sj}$ was expressed by the product-moment correlation coefficient $R_j$ as follows:

The closer the absolute value of $R_j$ is to 1.0, the greater the correlation [33]. For $Q_i \in (0, 9]$, $Q_i \in (9, 45]$, and $Q_i \in (45, +\infty)$, the goodness-of-fit values tested were 0.997, 0.905, and 0.951 in Area 2 and 0.996, 0.996, and 0.970 in Area 3, indicating that the correlation of $\rho_{i2}$ and $\rho_{i3}$ with the discrete values of the survey was extremely good and that the goodness-of-fit test results were significant.
$\rho_{i2}$ and $\rho_{i3}$ were not mutually independent indicators, and neither could fully reflect the true level of the SDWA. Although a weighted algorithm could be introduced to synthesize $\rho_{i2}$ and $\rho_{i3}$, both of them were segmented functions of different distribution types. The difference between the SDWA values calculated with $\rho_{i2}$ and $\rho_{i3}$ conformed to the trend of the logarithmic curve when $Q_i \in (9, 45]$, and the SDWA algorithm was continuous at the turning points. As a result, based on the two indicators, the SDWA indicator was established as follows:

where $\alpha$ and $\beta$ are the weight coefficients of $\rho_{i2}$ and $\rho_{i3}$, respectively, with $\alpha + \beta = 1.0$, and $c$ is the allocation factor of $Q_i$. If $c = 0$, according to the definition of the continuity of the segmented function, considering $Q_i = 9$ and $Q_i = 45$ in the first derivative of equation (5), two sets of weight coefficients were obtained, with $\alpha_1 = 0.68$, $\beta_1 = 0.32$, $\alpha_2 = 0.72$, and $\beta_2 = 0.28$.
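The goodness-of-fit check can be illustrated with the textbook definition of the Pearson product-moment correlation coefficient. The sketch below is a generic implementation, not the authors' code; the function name and the sample values are hypothetical, and the surveyed densities themselves are not reproduced here.

```python
from math import sqrt


def pearson_r(fitted, observed):
    """Pearson product-moment correlation between two equal-length samples."""
    n = len(fitted)
    if n != len(observed) or n < 2:
        raise ValueError("need two samples of equal length >= 2")
    mean_f = sum(fitted) / n
    mean_o = sum(observed) / n
    cov = sum((f - mean_f) * (o - mean_o) for f, o in zip(fitted, observed))
    var_f = sum((f - mean_f) ** 2 for f in fitted)
    var_o = sum((o - mean_o) ** 2 for o in observed)
    if var_f == 0 or var_o == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / sqrt(var_f * var_o)


# Hypothetical fitted versus surveyed densities for one area (pax/m2).
fitted = [0.5, 1.2, 2.0, 3.1, 4.4]
surveyed = [0.6, 1.1, 2.2, 3.0, 4.5]
print(round(pearson_r(fitted, surveyed), 3))
```

A value close to 1.0 in absolute terms indicates that the regression curve tracks the surveyed values closely, which is the sense in which the coefficients reported for Areas 2 and 3 are read.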
However, $\rho_{i0}$ calculated with the two sets of weight coefficients was not continuous at $Q_i = 45$. As the difference between $\rho_{i2}$ and $\rho_{i3}$ conformed to the logarithmic curve, $c \neq 0$ when $Q_i \in (9, +\infty)$. Moreover, requiring the first derivative of $\rho_{i0}$ to be continuous at $Q_i = 45$ gave $c = 0.08$. In summary, $\rho_{i0}$ could be expressed as follows:

Figure 4 shows the plotted curve of $\rho_{i0}$, used to test and compare the numerical stability of $\rho_{i0}$ and its sensitivity to changes in passenger flow.
For $Q_i \in (0, 45]$, setting $\rho_{i0} = Q_i/9$ gave $Q_i = 39$; moreover, $\rho_{i0} > Q_i/9$ in the range $Q_i \in (0, 39)$. In the range $Q_i \in (0, 45]$, owing to the influence of Areas 1 and 4 on passenger flow, the larger the value of $\rho_{i0}$, the more crowded the areas. In the range $Q_i \in (0, 15]$, the first derivative was greater than 0, indicating that the index had great volatility. When $Q_i \le 15$, passengers were basically free to select positions and had a higher propensity for Areas 2 and 3. In summary, the judgment performance of $\rho_{i0}$ as an indicator was better than that of $Q_i$.

Judgment Logic of the Status on the Bus Line.
As the Transit Capacity and Quality of Service Manual mentions, it is suitable for a new public transport system in America to define the peak hours by a passenger density of 2 pax/m². However, the code for the design of metros in China recommends that the passenger crowding density should be within 5 pax/m² and that the proportion of interstops (the sections between every two bus stops) with a passenger crowding density exceeding 5 pax/m² should be controlled within 20% of the total, based on ergonomics [3]. As a result, a statistical indicator was introduced. The proportions $\lambda_k$ of the number of interstops falling into $\rho_{i0} \le 1$, $1 < \rho_{i0} \le 5$, and $\rho_{i0} > 5$ to the total number of interstops were taken as the statistical indicator to define the peak, off-peak, and trough hours online and to avoid subjective personal judgment:

where $k = 1, 2, 3$ refers to $\rho_{i0} \le 1$, $1 < \rho_{i0} \le 5$, and $\rho_{i0} > 5$, respectively, and $a_{ik}$ is the number of interstops based on $k$.
Based on the 79 round-trip passenger flow data points of the manual survey, collected under large passenger traffic during off-peak hours in Xi'an, the interstops were classified by $\rho_{i0}$ and their proportions were calculated (Table 3).
When $\rho_{i0} \in (1, 5]$, the corresponding proportion of interstops $\lambda_2$ fluctuated around 50%, indicating the index properties of the SDWA in the off-peak hours. $\lambda_2 = 50\%$ was set as the state judgment threshold. The real-time judgment logic of the passenger flow data collection system was proposed (Table 4).
Importantly, the threshold $\lambda_2 = 50\%$ is a reference value; it must be determined according to the specific conditions of each bus line. The aforementioned judgment result only applies to the bus executing its task during the operation period and does not indicate the extent to which $\eta_0$ should be increased or decreased. Hence, it is also necessary to determine the online headways for the trough and peak hours.
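The classification of interstops and the resulting judgment can be sketched in a few lines of Python. The bands ($\rho_{i0} \le 1$, $1 < \rho_{i0} \le 5$, $\rho_{i0} > 5$) and the 20% and 50% thresholds are those stated in the text and in the judgment logic of Table 4; the function names and the sample densities are hypothetical, chosen only so that the counts match the 24-interstop field case (3, 12, and 9 interstops).

```python
def interstop_proportions(densities):
    """Percentages of interstops with rho <= 1, 1 < rho <= 5, and rho > 5."""
    n = len(densities)
    if n == 0:
        raise ValueError("at least one interstop is required")
    low = sum(1 for rho in densities if rho <= 1)
    high = sum(1 for rho in densities if rho > 5)
    mid = n - low - high
    return tuple(100.0 * count / n for count in (low, mid, high))


def operating_period(lam2, lam3, peak_threshold=20.0, off_peak_threshold=50.0):
    """Judgment logic: lambda_3 is checked first, then lambda_2."""
    if lam3 >= peak_threshold:
        return "peak"
    if lam2 >= off_peak_threshold:
        return "off-peak"
    return "trough"


# Hypothetical densities: 3 low, 12 medium, and 9 high interstops out of 24.
rho = [0.5] * 3 + [3.0] * 12 + [6.0] * 9
lam1, lam2, lam3 = interstop_proportions(rho)
print(lam1, lam2, lam3, operating_period(lam2, lam3))
```

Because $\lambda_3$ is tested before $\lambda_2$, a run with many heavily crowded interstops is judged as peak even when $\lambda_2$ sits exactly at the 50% reference value.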

Online Headways Based on the Standee Density.
According to the judgment logic of $\lambda_k$ and $\eta_0$, the division of peak, off-peak, and trough hours was performed based on the key threshold of the proportion of interstops.
As $\lambda_1 + \lambda_2 + \lambda_3 = 100\%$, the proportion of interstops in trough hours, $\lambda_1 = 30\%$, could be derived. However, the calculated $\eta_0$, $\eta_{\text{peak}}$, and $\eta_{\text{trough}}$ were not integers. Hence, the rounding function $\mathrm{INT}(\eta)$ was used to convert the original noninteger headways into integers, giving the headways during the trough and peak hours. To achieve better passenger flow dissipation during peak hours and higher transportation efficiency during off-peak and trough hours, $\eta_{\text{peak}}$ was rounded down, and $\eta_0$ and $\eta_{\text{trough}}$ were rounded up.
During the peak hours, the headway $\eta_{\text{peak}}$ was shortened, resulting in $\eta_{\text{peak}} < \eta_0$, but $\eta_{\text{peak}}$ could not be shortened without limit. Equation (1) shows that the minimum value was taken when all buses were put into operation ($d = 0$), that is, when no buses were being repaired or rested. Hence, the online headway during peak hours $\eta_{\text{peak}}$ was as follows:

where $\eta_{\min}$ is the minimum headway and $X_0$ and $X_{\text{peak}}$ are the rounded decimal places of $\eta_0$ and $\eta_{\text{peak}}$, respectively, with values greater than zero.
The headway was prolonged during trough hours, resulting in $\eta_{\text{trough}} > \eta_0$, but to meet the passengers' travel requirements and bus operation tasks, $\eta_{\text{trough}}$ could not be extended without limit. Equation (1) shows that the maximum value was taken when only 70% of the buses were put into operation ($d = 30\%$), that is, when repaired and rested buses accounted for 30% of the buses on the line. Hence, the online headway during the trough hours $\eta_{\text{trough}}$ was as follows:

$\mathrm{INT}(\eta_{\text{trough}}) = \eta_{\text{trough}} + X_{\text{trough}}$,

where $\eta_{\max}$ is the maximum headway and $X_{\text{trough}}$ is the rounded decimal place of $\eta_{\text{trough}}$, with a value greater than zero.

Field Validation.
The bus line with the largest passenger flow in Xi'an was considered as an example. It was used to verify the feasibility of the headway model and measure the parameter range of the headway model [34]. The first and last buses on the line departed at 6:00 a.m. and 12:00 midnight; the number of buses available on the line was 20; and the number of bus interstops was 24 in total. The bus had 37 seats, and $S$ was 8.96 m² [35,36]. Moreover, 85% of the buses operated during off-peak hours. The passenger flow data were collected during the peak hours of a working day (Table 5). The calculation result of the SDWA was obtained from equation (6).
When $Q_i$ increased from 0 to 36, the maximum deviation rate of $\rho_{i0}$ relative to $Q_i/S$ was 40.52%, which was due to the passengers diverting from Areas 1 and 4 to Areas 2 and 3. However, at this time, $Q_i$ increased slowly and was not sensitive enough to the standee flow. When $Q_i$ exceeded 36, the maximum deviation rate was −6.14% because Areas 1 and 4 diverted the passenger flow from Areas 2 and 3, alleviating the crowdedness of the SDWA. $\rho_{i0}$ embodied the SDWA after the passenger flows of Areas 2 and 3 were diverted, so that $\rho_{i0}$ was slightly lower than $Q_i/S$. Therefore, it was more desirable to reflect the crowdedness of standees by using $\rho_{i0}$ rather than $Q_i$.
The scheduling time length was 1080 min. As the number of buses during off-peak hours was 85% of the total number of buses, $\eta_0$ calculated using equation (1) was 10.8 min. Therefore, the headway in the off-peak hours was 11 min. According to the values in Table 5, $\lambda_1$, $\lambda_2$, and $\lambda_3$ were calculated from the definition of $\lambda_k$ to be 12.5%, 50.0%, and 37.5%, respectively. As $\lambda_3$ was the preferred judgment index and exceeded 20.0%, the period was judged to be the peak hours of passenger flow, and this conclusion was consistent with the judgment logic. At this time, the bus line dispatched all the buses into operation, and the minimum headway of the evening peak hours was 9 min.

Value Analysis.
Xi'an public transport enterprises have clear regulations on the daily running tasks of buses and the number of buses available for scheduling. Each bus runs six round trips per day, which are completed by two drivers. The number of buses available for scheduling should be maintained at more than 70% of the total buses. According to the aforementioned provisions, the minimum and maximum headways of buses can be calculated.
To avoid dispatch overload on the bus line, that is, when $d = 0$, the lower limit $\eta_{\min}$ of $\eta_{\text{peak}}$ was calculated from the data provided by the bus line to be 9 min. Let $\mathrm{INT}(\eta_{\text{peak}}) = \mathrm{INT}(\eta_{\min})$. As the number of buses during off-peak hours was 85% of the total number of buses, the peak headway was obtained from equation (8).
When $\lambda_3$ approached 25% from 20%, that is, approached the scheduling load in the peak hours, the headway was scheduled by $\mathrm{INT}(0.2\eta_0/\lambda_3)$.
When $\lambda_3$ exceeded 25%, the headway was still scheduled at 9 min to reach the bus scheduling load, and all buses ran on the line.
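The peak-hour rule described above can be sketched as follows. The expression $0.2\eta_0/\lambda_3$ and the 9 min floor come from the text; treating INT as rounding down for the peak headway, feeding the unrounded $\eta_0 = 10.8$ min into the expression, and clamping with the rounded-down $\eta_{\min}$ are assumptions of this sketch, and the function name is hypothetical.

```python
from math import floor


def peak_headway(eta_0, lam3, eta_min, threshold=0.20):
    """Peak-hour headway in whole minutes.

    Scales the off-peak headway by threshold / lambda_3, rounds down
    (assumed meaning of INT for the peak case), and never goes below
    the rounded-down minimum headway.
    """
    if lam3 < threshold:
        raise ValueError("lambda_3 below the peak threshold: not a peak period")
    scaled = threshold * eta_0 / lam3
    return max(floor(scaled), floor(eta_min))


# Field-validation inputs: eta_0 = 10.8 min, eta_min about 9.15 min.
for lam3 in (0.20, 0.22, 0.375):
    print(lam3, peak_headway(10.8, lam3, 9.15))
```

With these inputs the headway drops from 10 min at $\lambda_3 = 20\%$ to the 9 min floor, and it stays at 9 min for the measured $\lambda_3 = 37.5\%$, which matches the 9 min peak headway reported in the field validation.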
To meet the passengers' basic travel requirements and the bus operation tasks, that is, when $d = 30\%$, the upper limit $\eta_{\max}$ of $\eta_{\text{trough}}$ was calculated from the data provided by the bus line to be 14 min. Let $\mathrm{INT}(\eta_{\text{trough}}) = \mathrm{INT}(\eta_{\max})$; as the number of buses in the off-peak period was 85% of the total number of buses, the trough headway was obtained from equation (11).
When $\lambda_1$ exceeded 42%, the headway was scheduled to be 14 min according to $\mathrm{INT}(\eta_{\max})$.

Table 4: Real-time judgment logic.
Operational period    Judgment logic                                   Headway control
Peak hours            $\lambda_3 \ge 20\%$                             Shorten $\eta_0$
Off-peak hours        $\lambda_3 < 20\%$ and $\lambda_2 \ge 50\%$      Continue $\eta_0$
Trough hours          $\lambda_3 < 20\%$ and $\lambda_2 < 50\%$        Increase $\eta_0$

The aforementioned calculations assumed the number of buses available on the line to be 20. If the number of buses allocated to the line could be increased on this basis, the admissible range of $\lambda_k$ would be relaxed.
By examining the real-time passenger flow data obtained from the collector, $Q_i$ of each interstop was given in real time. Taking $Q_i$ as the independent variable, a more sensitive indicator, $\rho_{i0}$, was obtained. At the end of each bus run, the proportion of the number of interstops was calculated using the model. Then, the proportion of interstops was used to determine the headway of the next bus to be dispatched. The headway model was simple, and it made the automatic arrangement of the headway easy to realize. The model was based on the standee density algorithm, which was more suitable for bus lines with variations in passenger flow.

Conclusions
The present study proposed the SDWA for determining the online headway; additionally, the feasibility of the method was verified by numerical examples.
First, after accounting for the unevenness of the standee density on the bus floor, the SDWA was capable of sensitively reflecting the variation in standee flow. Passengers were more likely to choose the wheelbase area for standing if no seat was available. If the number of standees did not exceed $5S$, the proportion of passengers who chose to stand in the wheelbase area exceeded 80%.
Second, the proportion of interstops with an SDWA exceeding 5 pax/m² should be given priority. Taking the proportions of interstops as the evaluation criterion for determining the headway of the next bus to be dispatched was objective and feasible.
Finally, as the arrival time of each passenger at the bus stop was hard to determine, using the proportion of interstops of the preceding bus to determine the headway of the buses to be scheduled online eliminated accidental factors. This method might be of great benefit to bus headway setting, passenger evacuation, seat layouts, and emergency security.
Further studies should concentrate on evaluating the matching degree of seat layout and standee density to determine the criteria for guiding bus selection for public transport enterprises.

Data Availability
The data used to support this study are available from the corresponding author upon request.

Conflicts of Interest
The authors declare that they have no conflicts of interest to this work and do not have any commercial or associative interest that represents conflicts of interest in connection with the work submitted.