Identifying Key Bus Stations Based on Complex Network Theory considering the Hybrid Influence and Passenger Flow: A Case Study of Beijing, China

In the bus network, the failure of a key bus station can interrupt transfer lines, which lowers the effectiveness of the whole network, especially during peak hours. Thus, identifying key stations in the bus network before an emergency occurs is of great significance for improving the response speed. In this paper, we propose a new method that considers station hybrid influence and passenger flow to identify key stations in the whole bus network. The method measures the influence of bus stations by combining the topological structure of the bus network with the dynamic passenger flow of bus stations. The influence of bus stations is calculated from the local structure of the network, which avoids finding the shortest paths, an operation with high computational complexity. To evaluate the performance of the method, we use the efficiency of the network and the average vehicle speed at the station to examine its accuracy. The results show that the new method ranks the influence of bus stations more accurately and more efficiently than other complex network methods such as degree, H-index, and betweenness. On this basis, the key stations of the bus network of Beijing, China, are identified and their distribution characteristics are analyzed.


Introduction
A well-developed public transport system can effectively alleviate urban traffic congestion. The bus network is a crucial component of the public transport system and is of great significance to urban traffic. With rapid development, especially in big cities such as Beijing, China, and New York, USA, the bus system has become too complicated to evaluate easily. Taking the bus network of Beijing, China, as an example, there are 886 lines and 25424 buses, and the total length of all bus lines is 19290 km. The annual bus passenger volume has reached 3.36 billion, and the mean daily passenger volume is 9.73 million. In such a highly complex transit network, how to quickly find influential stations and analyze the topology of the bus network has become a hot research topic.
Since the small-world network [1] and the scale-free network [2,3] were proposed, the study of complex systems has developed considerably. The main function of a complex network is to transform the relationships among the elements of a real system into nodes and edges, emphasizing the topological structure of the network [4]. Therefore, the structural characteristics of the network play an important role in understanding the function of a complex system. For instance, the topological structure of a social network affects the spreading of rumors [5]. In sum, complex network theory is a powerful tool in various fields, such as biological networks [6][7][8], social networks [9][10][11], and transportation networks [12][13][14].
For transit networks, complex network theory is an important way to obtain the topological structure. In the past few decades, many scholars have applied complex network theory to urban public transportation networks. Meanwhile, several indices have been proposed to measure topological characteristics, including degree [15], clustering coefficient [16], and betweenness [17,18]. Sienkiewicz and Hołyst [19] analyzed the topological structure of public transportation networks in 22 cities of Poland and found that all networks have small-world characteristics and that the degree distribution obeys a power-law or an exponential distribution. Ping et al. [20] studied the bus networks of 10 cities in China and found that, on a global scale, the bus network is robust to random attacks and fragile to malicious attacks. These studies are useful for finding macroscopic properties that are not yet recognized in real complex systems. The scale-free property and the traffic concentration problem of the network are among the basic scientific issues in complex networks.
Moreover, once key nodes in the network are damaged, the failure quickly spreads across the whole network [21,22]. For example, Jovica proposed that a failure initiated in one critical system could cause cascading effects within the whole system of systems in a coupled infrastructure system [23].
Thus, more and more studies have examined the efficiency of the whole network when some components of the system fail [24][25][26]. However, the transportation system is complex and uncertain, and it is vulnerable to bad weather, road construction, and traffic congestion [27,28]. Bus stations are elementary components of transit networks, and key bus stations play an important role in the operation of the network. The failure of key bus stations could cause large-scale traffic congestion and even paralyze the traffic network. Therefore, identifying and analyzing the key nodes is critical to the whole transit network, especially during peak hours.
Recently, many studies have focused on identifying key nodes of transit networks. Chen et al. proposed the normalized average distance (NAGD), normalized average minimum speed (NAMS), and normalized largest component order (NLCO) algorithms to evaluate the vulnerability of networks and help identify their key nodes [29]. Using the supernode graph structure of London, Shanmukhappa proposed a spatial amalgamation to identify influential nodes in bus and metro transport networks [30]. Sun et al. used the z-score and participation coefficient to analyze the influence of nodes in the Dublin bus network [31]. Zhang et al. studied an approach to extract the hub network from the urban subway transit network, revealing that the hub network is hierarchical and that transfer stations play a critical role in key networks [32]. Xu et al. created a weighted network for the public transit of Beijing based on key property theory and analyzed its scaling laws and correlations [33,34]. Other approaches for identifying key nodes include proximity prestige [35], structural holes [36], degree centrality [37], K-core [38], and betweenness centrality [39].
To the best of the authors' knowledge, studies on identifying key bus stations mainly focus on the topological structure, and the topological evaluation indices, such as degree, H-index, and K-core, are treated as unrelated or are used singly. In reality, the bus network is a typical dynamic complex network, which has not only the general characteristics of a complex network but also distinctive features such as the travel behaviour of passengers. Generally speaking, passengers tend to travel in areas with active socioeconomic activities, such as the CBD and transfer hubs. Hence, in addition to the connectivity characteristics of the physical structure, the travel behaviour of passengers and the passenger flow demand of stations should be considered in the bus network. Meanwhile, the passenger flow of bus stations presents the following features [40]: (i) there are two rush hours, in the morning and in the evening, on each working day; (ii) the passenger flow of bus stations on the weekend is clearly lower than that on a working day. Therefore, in this paper, we select the rush hours in the morning and evening on each working day and weekend as our time periods. Figure 1 shows the passenger flow during different time periods at Si-hui station, Liu-li-qiao station, Beijing west railway station, and Guo-mao station. The passenger flow clearly differs greatly between time periods. Therefore, in this paper, we first explore the heterogeneous relationship between the degree and the H-index of a node using complex network theory. Then, based on this relationship and considering the dynamic passenger flow of the bus station, we construct a new local hybrid influence model (called SHIP) representing the importance of the station in the whole bus network. To test the effectiveness of the proposed approach, we use network efficiency as an evaluation metric and compare the approach with single complex network indices such as degree, H-index, and betweenness.
In addition, this paper identifies the key stations of the Beijing bus network based on this approach and analyzes their spatial distribution characteristics. The proposed method aims to quickly identify key bus stations before emergencies happen, which is essential for improving the response speed and for preventing secondary disasters. The main contributions of this research are twofold: (i) a new index is proposed that combines the conventional complex network indices of degree and H-index to better represent the connectivity of the bus network; (ii) the approach is a local algorithm for mining and identifying the key bus stations based on the topological structure and the actual travel demand of the station, which avoids calculating the shortest path between any two stations, so the computational complexity is greatly reduced. This paper is divided into four sections. Section 2 gives an identification model considering the hybrid influence of degree, H-index, and passenger flow. Section 3 presents a case study and verifies the model. We then use the new model to analyze the bus network of Beijing, China; the distribution characteristics of key bus stations are given in Section 3. Section 4 presents the conclusions and future work.

Identification Model
In the bus network, station failure can cut off transfer lines, which lowers the effectiveness of the whole network. To enhance the robustness and improve the response speed of the bus network, in this section, we construct a key station identification model. The model includes two stages. Stage one constructs the bus network and calculates the characteristics of its topological structure, while the identification model and its evaluation are presented in stage two. The solution framework is shown in Figure 2.

Problem Description.
Previous methods mainly focus on the topological structure when identifying key nodes in the whole network. The simplest way to measure the importance of a node is to determine its degree, that is, to count its linked neighbors. The Hirsch index (i.e., H-index) was originally used to measure the citation impact of a scholar or a journal. In this paper, the H-index of a node is defined as the maximum value h such that the node has at least h neighbors of degree no less than h [41][42][43]. Nodes with a large degree and H-index therefore play more important roles in the whole network. A large degree indicates that the node has a large influence. Nevertheless, the H-index is better than the degree or coreness in quantifying nodes' influence in some cases [44].
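The two definitions above can be illustrated with a minimal Python sketch. The toy edge list and the helper names (`build_adjacency`, `degree`, `h_index`) are assumptions introduced here for illustration and are not taken from the paper.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Return an undirected adjacency map {node: set of neighbors}."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return dict(adj)


def degree(adj, node):
    """Degree = number of linked neighbors."""
    return len(adj[node])


def h_index(adj, node):
    """Largest h such that at least h neighbors have degree >= h."""
    neighbor_degrees = sorted((len(adj[n]) for n in adj[node]), reverse=True)
    h = 0
    for rank, deg in enumerate(neighbor_degrees, start=1):
        if deg >= rank:
            h = rank
        else:
            break
    return h


# Hypothetical toy network (not the network of Figure 3).
edges = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c"), ("c", "d"), ("d", "e")]
adj = build_adjacency(edges)
for node in sorted(adj):
    print(node, degree(adj, node), h_index(adj, node))
```

Both quantities use only a node's immediate neighborhood, which is why an index built from them can be computed without any shortest-path search.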
Based on the above, we consider both the degree and the H-index in identifying the key bus stations. Figure 3 gives a visual representation of the hybrid of degree and H-index, where D denotes the degree and H denotes the H-index. Nodes a1 and c1 have the same degree (namely D = 3); however, c1 has a larger H-index than a1, owing to its larger maximal connected subgraph. Nevertheless, the H-index alone cannot determine the maximal connected subgraph. For example, a1 and b1 have the same H-index, but their degrees differ (D = 2 for a1 and D = 4 for b1), so b1 has a larger maximal connected subgraph than a1. Hence, neither the degree nor the H-index alone can fully represent the maximal connected subgraph of nodes in networks. We also find that the product of the degree and H-index can distinguish the maximal connected subgraph. For instance, D = 4 and H = 2 for b1, and D = 3 and H = 3 for c1; the products for b1 and c1 are 8 and 9, respectively, whereas the sum is 6 in both cases. Intuitively, c1 has a larger maximal connected subgraph than b1. Therefore, we construct the identification model considering the hybrid influence of the degree and H-index.

Construction of the Network.
The bus network is a typical complex network that consists of bus stations and bus lines. Stations and lines of the bus network can be represented by nodes and links: nodes represent stations, and links represent the lines between two stations. Therefore, we construct the bus network using complex network theory. In this section, we describe the construction method of the bus network.
The bus network can also be abstracted as a graph. In this paper, we use a graph $G(V, E)$ to represent the bus network, where $V$ is the set of nodes, that is, the station set of the bus network, and $E$ is the set of edges. $G$ is stored as an $N \times N$ matrix of 0s and 1s, where $N$ is the number of stations. If there is an edge between stations $i$ and $j$, $e_{ij} = 1$; otherwise $e_{ij} = 0$. Let A and B be two different stations in the bus network. If a bus can travel from station A to station B, it can also travel from station B to station A. Therefore, an urban bus network can be regarded as an undirected network. There are two ways to describe the bus network, depending on how stations are connected: space L and space P [45][46][47]. Space L is used to describe the topological properties of the network, and space P is used to represent the transfer properties of the network. Since space L reflects the geographical connection between bus stations, we build the bus network using space L. Space L is illustrated in Figure 4 with a simple network.
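The space L construction can be sketched as follows: two stations are linked only if they are consecutive stops on at least one line, and the result is a symmetric 0/1 matrix. The function name `space_l_matrix` and the two toy lines are hypothetical and serve only as an illustration.

```python
def space_l_matrix(lines):
    """Build the symmetric 0/1 adjacency matrix of a bus network in space L.

    `lines` is a list of bus lines, each an ordered list of station names.
    Consecutive stations on a line are joined by an undirected edge.
    """
    stations = sorted({s for line in lines for s in line})
    index = {s: i for i, s in enumerate(stations)}
    n = len(stations)
    matrix = [[0] * n for _ in range(n)]
    for line in lines:
        for a, b in zip(line, line[1:]):
            if a != b:  # ignore accidental repeated stops
                i, j = index[a], index[b]
                matrix[i][j] = matrix[j][i] = 1
    return stations, matrix


# Two hypothetical lines that share the segment B-C.
lines = [["A", "B", "C", "D"], ["E", "B", "C", "F"]]
stations, matrix = space_l_matrix(lines)
for name, row in zip(stations, matrix):
    print(name, row)
```

In space P, by contrast, every pair of stations on the same line would be linked, so A and C would also be adjacent; the sketch above deliberately leaves that entry at 0.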

Influence of Stations.
The degree and H-index are critical characteristics of nodes. Combining the heterogeneity of the degree and of the H-index captures the maximal connected subgraph around a node. Using hybrid influence as a new metric to measure the importance of stations can better explore the potential influence of stations. The heterogeneity of the H-index and of the degree is given by equations (1) and (2). We define the hybrid influence of key bus stations as equation (3),
where $k_i$ and $h_i$ represent the degree and H-index of station $i$, $H_{k_i}$ and $H_{h_i}$ denote the heterogeneity of the degree and H-index, $|E|$ is the number of links, $\vartheta_{SHIP}$ is the hybrid influence, and $\Psi$ is the weight parameter.
The passenger flow of a station is a pivotal parameter that represents the importance of the station to some extent. Meanwhile, the passenger flow of stations varies widely with time and location. It is clear from Figure 2 that the passenger flow of bus stations differs by period. In reality, the failure of a transfer station causes many problems when it happens during peak hours. To obtain the distribution of key stations of the bus network at specific times, we select the morning peak (7:30-8:30) and off-peak (14:00-15:00) on a working day and the morning peak (7:30-8:30) on the weekend as the research periods. We use the normalized passenger flow as the weight, called $\Psi$. The weight parameter $\Psi$ can be expressed as $\Psi = \dfrac{x_i}{x_1 + x_2 + \cdots + x_i + \cdots + x_n}$, where $x_i$ represents the passenger flow of station $i$.
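As a small illustration of this normalization, the sketch below divides each station's passenger flow by the total flow over all stations in the chosen period. The station labels and flow counts are made-up placeholders, not data from the paper, and `passenger_flow_weights` is a hypothetical helper name.

```python
def passenger_flow_weights(flows):
    """Normalize per-station passenger flow so that the weights sum to 1.

    `flows` maps station -> passenger count for one research period.
    """
    total = sum(flows.values())
    if total <= 0:
        raise ValueError("total passenger flow must be positive")
    return {station: x / total for station, x in flows.items()}


# Placeholder counts for three hypothetical stations in one period.
flows = {"S1": 1200, "S2": 300, "S3": 500}
weights = passenger_flow_weights(flows)
print(weights)
```

Because the weights are computed per period, the same station can receive a different weight in the working-day peak, the off-peak, and the weekend peak.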

Evaluation Metrics
In this section, we use two metrics to evaluate the identification model: the average speed and the efficiency of the network. Key bus stations have good network connectivity and include many transfer stations. Meanwhile, there are many points of interest, such as catering areas, business districts, and medical services, around key bus stations, which leads to a concentration of bus lines, a large number of passenger transfers, and longer waiting times around them. Therefore, the average vehicle speed at key bus stations is lower than that at other stations with a low hybrid influence. In addition, Shanmukhappa et al. pointed out that, with increasing node weight (demand), the maximum speed near the bus station is reduced significantly [48]. To obtain the average speed of vehicles near the bus stations easily, we construct a simulation test network in the simulation software VISSIM. In the simulation, we place the speed monitors 50 meters before the bus station. The average vehicle speed at bus station $i$ is $\bar{v}_i = \dfrac{1}{n} \sum_{j=1}^{n} v_{i,j}$, where $v_{i,j}$ is the speed of vehicle $j$ at station $i$ and $n$ is the total number of vehicles at station $i$.

Efficiency of the Network.
We also compare the model with traditional algorithms using the efficiency of the network, which is given by equations (6) and (7). The efficiency of the network reflects the connectivity of the network: the higher the connectivity, the higher the efficiency of the network.
Here $\eta_{efficiency}$ is the efficiency of the whole network, $d_{ij}$ is the shortest distance between nodes $i$ and $j$, and $N$ is the number of nodes. When there is no path between nodes $i$ and $j$, $d_{ij} = \infty$.
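Equations (6) and (7) are not reproduced in this text, so the following is only a sketch under an assumption: it uses the commonly used global efficiency, the mean of $1/d_{ij}$ over all ordered pairs of distinct nodes, with hop counts from breadth-first search as distances. Unreachable pairs contribute 0, matching $d_{ij} = \infty$. The function names and toy graphs are illustrative.

```python
from collections import deque


def hop_distances(adj, source):
    """Breadth-first search: hop distance from `source` to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def network_efficiency(adj):
    """Assumed definition: (1 / (N (N - 1))) * sum over i != j of 1 / d_ij."""
    n = len(adj)
    if n < 2:
        return 0.0
    total = 0.0
    for i in adj:
        for j, d in hop_distances(adj, i).items():
            if j != i:
                total += 1.0 / d  # unreachable nodes never appear, so they add 0
    return total / (n * (n - 1))


# Toy graphs: a 3-node path and a 3-node triangle.
path = {"A": {"B"}, "B": {"A", "C"}, "C": {"B"}}
triangle = {"A": {"B", "C"}, "B": {"A", "C"}, "C": {"A", "B"}}
print(network_efficiency(path), network_efficiency(triangle))
```

This metric needs all-pairs shortest distances, which is the expensive step that a local index avoids when ranking stations.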

Algorithm Flow
In what follows, we present the detailed procedure of the solution algorithm for identifying the key bus stations. The solution algorithm has three parts: construction of the bus network, calculation of the hybrid influence, and identification of the key bus stations. The flowchart of the solution algorithm is shown in Figure 5.

Model Verification
To verify the accuracy of the identification model, we selected the Wang-jing district, Beijing, China, as the test bus network, which includes 5 lines and 35 stations. The distribution of stations and lines is shown in Figure 6. Based on the bus network of the Wang-jing district, we construct the simulation network in the VISSIM software to obtain the average speed of vehicles near the bus stations. In the test bus network, we set the same traffic flow throughout. We assume that the signal cycle of each intersection is 90 s; the detailed signal timing is shown in Figure 7.
The verification results of the model are shown in Figure 8 and Table 1. Figure 8(a) shows the relationship between average speed and key bus stations. By simulation, we found that, as the hybrid influence of stations increases, the average speed at the bus stations decreases significantly. The efficiency of the network is an index of the connectivity of the network, which is calculated from the shortest distance between two nodes. In this paper, we use a station failure process called deliberate attack. When a station is attacked, the edges connected to the failed station are removed under space L. Once a station has been attacked, the efficiency of the bus network changes. The stations are therefore attacked in turn according to their importance. The greater the decline in the efficiency of the network, the worse the connectivity of the network, indicating that the station is more important in the bus network.
The results of the comparison with traditional methods, namely the degree, H-index, and betweenness, are shown in Figure 8(b). Figure 8(b) shows that the curve of the new index (SHIP) declines fastest, indicating that the new method is more accurate in identifying key stations than the other methods. In addition, the computational complexity of the traditional method based on the shortest path is $O(N^2) = 1225$ for the test network, whereas that of the new approach is $O(N) = 35$, showing that the computational complexity is greatly reduced.
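The deliberate-attack procedure described above can be sketched as follows. Stations are attacked in the order given by some ranking, the edges of each attacked station are removed while the station itself stays in the network, and the efficiency is recorded after every step. The efficiency used here is the common mean of $1/d_{ij}$ over ordered node pairs, which is an assumption, and the star-shaped toy network and ranking are hypothetical stand-ins for the test network and for the rankings compared in Figure 8(b).

```python
from collections import deque


def efficiency(adj):
    """Mean of 1 / d_ij over ordered pairs i != j (assumed definition)."""
    n = len(adj)
    if n < 2:
        return 0.0
    total = 0.0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
                    total += 1.0 / dist[v]
    return total / (n * (n - 1))


def attack_curve(adj, ranking):
    """Efficiency before any attack and after each station in `ranking` fails."""
    adj = {u: set(vs) for u, vs in adj.items()}  # work on a copy
    curve = [efficiency(adj)]
    for station in ranking:
        for neighbor in adj[station]:
            adj[neighbor].discard(station)  # remove the edges of the failed station
        adj[station] = set()                # the station itself remains, isolated
        curve.append(efficiency(adj))
    return curve


# Hypothetical star network: hub H with three leaf stations.
star = {"H": {"A", "B", "C"}, "A": {"H"}, "B": {"H"}, "C": {"H"}}
hub_first = attack_curve(star, ["H"])
leaf_first = attack_curve(star, ["A"])
print(hub_first, leaf_first)
```

A ranking that places truly important stations first makes the curve fall faster, which is how the steepness of the curves is read as a measure of ranking accuracy.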

Data Description.
The bus network of Beijing, China, is selected as the case study in our work. We collected data on the bus network of Beijing, China, from November 1, 2017, to November 7, 2017. The data cover 5922 stations and 886 lines. The raw data include card ID, card type, trade type, trade time, line ID, bus ID, names of stations and lines, and the latitude and longitude of each station. The data format is shown in Table 2.

Bus Network Description.
Based on the method of space L, we construct the bus network, shown in Figure 6. The characteristics of the topological structure of the network can be described by metrics such as the degree and H-index. We calculate these metrics, and the results are shown in Figure 9. The abscissa represents the degree and H-index of the node. The ordinate represents the number of nodes in Figure 10(a) and the frequency of nodes in Figure 10(b). From Figure 10, the degree of nodes ranges from 1 to 18. The starting and ending stations of each bus line have the same degree, which is equal to 1; they account for 4.20% of all stations. Ordinary stations, whose degree is equal to 2, account for the largest proportion, 52.01%. The average degree of the network is 3.08, which indicates that each bus station has about three edges on average. Similarly, the H-index of nodes ranges from 1 to 8. Nodes with an H-index equal to 2 account for the largest proportion, 64.35%.

Application, Analysis, and Discussion
In this section, we use the proposed identification model which has been verified to analyze the distribution characteristics of key bus stations.

Regional Distribution.
The results show that the stations with a high hybrid influence are mostly distributed in Dongcheng, Xicheng, Haidian, Chaoyang, Shijingshan, and Fengtai. For the top 50 key stations, we count the number of key stations in each region, as shown in Figure 11. Chaoyang has the greatest number of key bus stations and Shijingshan has the fewest, on both working days and weekends. In addition, Haidian has more key bus stations on a working day, indicating that more people there take the bus to work.
Note. "SI" represents the station ID. "HI" is the hybrid influence of stations. "AS" is the average speed of stations.

Spatial Distribution.
The top 50 key bus stations in different periods are shown in Figures 12-14. We use red solid circles to represent the magnitude of influence: the larger the circle, the greater the influence. Based on the spatial distribution of key bus stations, we discuss the underlying reasons.
Most of the top 50 key bus stations are located within the 5th ring road. The number of key bus stations on the 3rd ring road is the largest, showing a trend along the loop road. Therefore, the loop roads play a significant role in the bus network of Beijing, China. The distribution of key bus stations at different periods is shown in Table 3.
Meanwhile, in all three periods, the key bus stations include passenger transport hubs such as Si-hui, Dong-zhi-men, Beijing west railway station, Liu-li-qiao station, and Guo-mao. In addition, Figures 12-14 show two clustering areas among the top key stations, namely, Liu-li-qiao and Guo-mao. Liu-li-qiao station is adjacent to Beijing west railway station and is the main station where railway passengers transfer to buses. Therefore, the Liu-li-qiao area attracts a large number of passengers. Guo-mao station is surrounded by commercial areas, including the International Trade Center, SOHO, and many traffic attractions.
There are a large number of restaurants, hotels, and business areas, which generate a great demand for buses. Hence, the identification results are of practical significance to public transportation.

Transfer Analysis.
During the morning peak and off-peak on a working day, 12 key bus stations are within 500 m of a subway station. During the morning peak on weekends, 11 key bus stations are within 500 m of a subway station, indicating that key bus stations are located close to subway transfer stations. Therefore, a reasonable placement of bus stations can provide a convenient connection between subway and bus transit.
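A count like the one above could be produced along the following lines. The paper does not state how distances were measured, so this sketch assumes great-circle (haversine) distance between station coordinates; the coordinates, identifiers, and function names are all hypothetical.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in meters


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def stations_near_subway(bus_stations, subway_stations, radius_m=500.0):
    """Return IDs of bus stations with at least one subway station within radius_m."""
    near = []
    for bus_id, (blat, blon) in bus_stations.items():
        if any(haversine_m(blat, blon, slat, slon) <= radius_m
               for slat, slon in subway_stations.values()):
            near.append(bus_id)
    return near


# Hypothetical coordinates, for illustration only.
bus_stations = {"B1": (39.900, 116.400), "B2": (39.950, 116.400)}
subway_stations = {"M1": (39.903, 116.400)}
near = stations_near_subway(bus_stations, subway_stations)
print(near)
```

Walking distance along the street network would generally be longer than the straight-line distance used here, so a count based on this sketch should be read as an upper bound on stations within a 500 m walk.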

Conclusions
The bus network is a globally inefficient network in which it is difficult to identify key bus stations quickly and accurately with traditional methods. Based on the passenger flow in different periods and the heterogeneous topological structure of the bus network, we used the SHIP model to analyze the spatiotemporal features of key bus stations for the whole network of Beijing, China. The results of this paper can be summarized as follows: (i) A new index is proposed that combines the conventional complex network indices of degree and H-index to better represent the connectivity of the bus network. To assess the accuracy of the identification method, we constructed a test network and evaluated the performance of the identification model with two metrics. The results indicate that our method identifies the key stations in the whole bus network more accurately and efficiently than other methods such as the degree and H-index. Meanwhile, the computational complexity is greatly reduced compared with the betweenness method. (ii) From the spatial and temporal perspective, the key bus stations have obvious distribution characteristics. The key bus stations are mostly distributed near rail transit and passenger terminals, especially in Liu-li-qiao and Guo-mao, which indicates that the key bus stations play an important role in travel convenience. In addition, the distribution of key bus stations differs greatly between working days and weekends.

Table 3: Distribution of the top 50 key bus stations at different periods.

| Location | Morning peak in the working day | Off-peak in the working day | Morning peak in the weekend |
| --- | --- | --- | --- |
| Inner 2nd ring road | 3 | 3 | 3 |
| 2nd ring road | 2 | 2 | 2 |
| Between 2nd and 3rd ring road | 9 | 11 | 9 |
| 3rd ring road | 23 | 21 | 21 |
| Between 3rd and 4th ring road | 7 | 8 | 6 |
| 4th ring road | 1 | 0 | 1 |
| Between 4th and 5th ring road | 3 | 3 | 3 |
| 5th ring road | 0 | 0 | 0 |
| Between 5th and 6th ring road | 2 | 2 | 5 |
| 6th ring road | 0 | 0 | 0 |
This paper provides a useful method for finding key bus stations in the bus network, which can help planners and managers improve the robustness of the whole bus network. However, the method in this study has some limitations that remain to be addressed. We will consider the transfer time and the number of lines of the bus network in the future. Apart from this, the subway and bus networks can be seen as a double-layer network, and the interlayer interaction is also a topic worth studying.

Data Availability
The data are from the official website of Beijing Metro Rail Operation Administration Corporation Limited.

Conflicts of Interest
The authors declare that they have no conflicts of interest.