Examining Built Environment Effects on Metro Ridership at Station-to-Station Level considering Circle Heterogeneity: A Case Study from Xi’an, China

Transit-oriented development (TOD) is described as a geographic unit with a multicircle structure. Most studies have analysed the impact of the built environment within station catchment areas on metro passenger flows from a macro perspective and have lacked analysis of the circle heterogeneity. The few relevant studies have investigated the impact of the built environment on the passenger flow in each circle independently and have neglected the systematic interaction between circles in the TOD area. In this study, the 800 m buffer from the station was equally divided into four circles. Based on the gravity model, the representative built environment features around the metro stations on both sides were extracted using the block attention module (BAM). Subsequently, Shapley Additive exPlanation (SHAP) was used to explore the influence of different built environment variables on passenger flow at each circle between the origin and destination stations. The results indicate the following: (1) the station-to-station passenger flow is significantly affected by the availability of transfers and the distance between the origin and destination stations; (2) the impact of different built environments on ridership varies significantly across circles; and (3) the built environment has a similar impact on average daily passenger flow on both sides. Therefore, this study proposes strategies to optimize the metro passenger flow by developing different land uses in different circles and updating the urban spatial structure.


Introduction
The metro system has been prioritized in China to address the problems of traffic congestion and environmental pollution caused by automobile-based transportation. Between 2015 and 2021, the number of cities with metros in China increased from 26 to 49, and the total length of the network exceeded 9000 km. Despite the impact of the antiepidemic policy, the average daily traffic intensity still reached 4800 persons/km. Meanwhile, transit-oriented development (TOD), which integrates the metro system and land use development, has been applied in many cities. Many studies in this area have focused on the relationship between the built environment and metro ridership at the station or station-to-station level [1-3].
Since its inception, the TOD catchment area has been considered a geographical unit with a multicircle structure intended to enhance the use of public transport [4]. The previous literature has highlighted the significant roles of density, diversity, and nonmotorized-friendly design in metro ridership [5,6]. However, one critical question that has surfaced is how the land use layout within the catchment area affects ridership. Exploring the impact of different land uses in different circles on ridership is key to addressing this question. Furthermore, while some studies on the delineation of the TOD catchment area have explored the circle heterogeneity of the built environment's influence on passenger flows, two gaps remain. On the one hand, these studies generally developed a separate model for each circle to explore the impact of its built environment on passenger flow, ignoring the mutual influence between different circles in the TOD area as an independent geographical unit. On the other hand, due to the black-box nature of deep learning models, most studies have had to give up their predictive power and revert to traditional statistical or machine learning models, which offer better explanatory capabilities but poorer predictive performance.
Against this background, this study aims to investigate the circle heterogeneity of the impact of different land use types on station-to-station passenger flow, considering the built environment factors of the station catchment area. To achieve this aim, the convolutional block attention module model was selected because it requires less sample data while ensuring prediction accuracy [7]. This attention mechanism was adapted to extract the combined characteristics of the origin and destination stations separately, and the average weekday metro passenger flow was modelled within the framework of a gravity model. Shapley Additive exPlanation (SHAP) was used to interpret the model results.
The main contributions of this study include the following aspects. First, this study strengthens the analysis of the impact of built environment factors on ridership at the station-to-station level and provides initial experience in using deep learning models to analyze and interpret, rather than just accurately predict, travel behavior. Second, in view of the TOD multicircle structure, the circle heterogeneity of the impact of the built environment on ridership is studied to provide more feasible suggestions for planners. The remainder of this paper is organized as follows. The next section reviews the relevant literature. Section 3 presents an overview of the study area and data collection. The framework of the model is shown in Section 4. Section 5 presents and discusses the results. Finally, Section 6 provides the major conclusions and proposes potential applications of the study.

Literature Review
In the last three decades, TOD has become a focus in the field of urban planning and transportation [8,9]. A better understanding of the impacts of station-area built environment factors on transit ridership can improve transit performance and inform land use planning within station catchments.
In terms of land use, almost all the existing studies have highlighted the influence of population [6,10-12] and employment densities [11,13,14] on ridership. The findings on the relationship between diversity and metro ridership are not consistent. The entropy index, a measure of diversity, positively influences metro ridership [15,16]. By contrast, Cervero [17] found that the land use mix has no evident impact on metro passenger flow. Several TOD-related studies also focused on land use variables. Many scholars investigated the effects of the difference in land use or the different types of POIs on metro ridership [11,18-24]. For example, An et al. [25] suggested that the commercial building is the most critical factor for metro ridership prediction and that the effect of residential factors was inconclusive. Li et al. [26] concluded that only common residences effectively improve metro ridership and suggested that the residential and scenic spots have nontrivial effects on ridership.
In terms of transit service, variables associated with ridership, such as road density, intersection density, number of bus stops (or lines), terminal station, and transfer station, were explored. Some studies highlighted the influence of intersection density on ridership [13,26]. However, a negative effect of intersection density was reported in other studies [25,27]. Some previous studies also concluded that the number of bus stops around the metro station could influence the metro passenger flow because the bus is one of the primary egress and access modes to metro stations [28-30]. E. Chen et al. [31] concluded that the effects of the built environment around the terminal or adjacent stations are more significant than those around the normal station.
Research on the station-to-station level can be considered an extension of that on the station level. The mentioned explanatory variables were calculated for both the origin and destination. Moreover, the station-to-station level ridership is affected by traffic impedance factors, including transfer times, detours, and route distance [32-36].
Most TOD development guides and related studies described the TOD as an area with a multicircle spatial structure (shown in Figure 1) and emphasized differentiated development according to the different circles of the TOD. Several studies have divided the station catchment into multiple buffer bands and modelled the impact of the built environment on passenger flows within each buffer band. Gutiérrez et al. [37] developed distance-decay weighted regression using the mobility survey to forecast the Madrid metro ridership. They found that the effect of built environment factors on ridership changed with different buffers. Similar experiments conducted by Manout et al. [38] in Lyon yielded the same conclusion. The built environment factors in the buffer zones of 0-300 m, 300-600 m, and 600-900 m were counted by Jun et al. [15], and geographically weighted regression models were calibrated to explore the impact of land use characteristics in these buffer zones on ridership at the station level. The results showed that only the population and land use mix significantly affect the ridership in the 0-300 m and 300-600 m buffers. Pan et al. [21] conducted a questionnaire survey of 33 sites and 11 neighborhood units in Beijing and collected 300 responses to analyze the influence of shopping facilities within seven circles on the willingness of residents to shop around the nearest metro station. The results from the model revealed that the influence of different shopping facilities on residents' shopping trip willingness varies with different circles in three aspects: significance, sign, and value.
Regarding the models that examine the relationship with ridership, the direct ridership models (DRMs) are popular owing to their minimal data requirements and easy application. Initially, the relationship between the built environment and metro ridership was assumed to be linear or log-linear [15,37,39]. With the development of data mining methods, the traditional linear model has gradually been supplemented by machine learning models. These models have better predictive and explanatory capabilities. For example, tree-based models have been developed in related studies because they are more adept at dealing with nonlinear relationships [28,34,40,41]. However, successful machine learning models require extensive expertise to engineer highly accurate features. For metro ridership, it is difficult to achieve high-quality feature engineering manually. Thus, deep learning models are widely used in transportation because of their excellent performance in feature extraction. For example, recurrent neural network (RNN) models, which are suitable for modelling the dynamic temporal dependency occurring in time series, were widely used to predict ridership in isolated metro stations or lines [42,43]. As convolutional kernels of different sizes can extract spatial dependencies of features by automatically learning from the data, convolutional neural networks are often used for ridership prediction at large spatial scales [44].
Although previous studies explored the relationship between built environment factors and ridership at the station or station-to-station level, they have some limitations. The influence of various built environment factors in the station catchment area on passenger flow is not independent; rather, it is the result of cross-influence within the TOD area as a system of independent geographical units. In other words, the spatial structure and land use layout of the catchment area also significantly impact passenger flows. Furthermore, the results of studies that focus only on global built environment factors are undoubtedly biased. Moreover, although deep learning models performed well in extracting the implicit features of these interactions, they are rarely used to explain passenger flow relationships owing to their black-box nature. Effective feature extraction combined with rational interpretation is extremely important for TOD planning and construction practice.
Overall, this study aims to fill the gaps in the literature by (1) quantifying the circle heterogeneity of the built environment's nonlinear effect on ridership and (2) using deep learning models to capture potential land use features within the station catchment and explaining the model results.

Dataset Description
As shown in Figure 2, the Xi'an metropolitan area was selected for the study. From 2011 to 2019, four metro lines and 57 stations were built and put into operation in Xi'an, ranking the city eighth among 40 Chinese cities with metro lines. The average daily station-to-station passenger flow on weekdays during November 2019 was computed for a total of 3192 station pairs using Automatic Fare Collection (AFC) data. Although the number of metro users is very large, the metro system still faces an uneven distribution of passenger flow. In particular, the Zhonglou-Xiaozhai route has the largest average daily passenger flow with 2836 riders, whereas the passenger flow for the Xinjiamiao-Daminggongbei route is one rider. To optimize travel demand management, it is important to study the influencing factors of OD passenger flow.
It should be noted that this study considered the land use factors and transit service within the 800 m buffers of both origin and destination stations. Moreover, considering the average scale of plots in Xi'an and the sample size of the dataset, the 800 m buffer was divided into four circles: 0-200 m, 200-400 m, 400-600 m, and 600-800 m. The number of bus stops (S) within each of the four circles of the origin and destination stations was counted as the transit service factor. The areas of the administration buildings (A), residential buildings (R), and commercial buildings (B) within the different circles were calculated to represent the land use factors of the origin and destination sides. The parks, squares, and scenic spots were classified into one category (G) to calculate the land use area. Some other studies also calculated building areas for industrial and warehousing logistics uses [23]. However, most land of these two types is located outside the study area.
To preserve the location information of different circles, a matrix was constructed to represent the five features of the four circles, considering the built environment of the origin or destination stations (shown in Figure 3). In addition, the number of transfers and the interval (the number of stops) between the origin and destination stations were used to determine the impact of travel impedance factors on the metro ridership. Table 1 describes the details and source of the variables.
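The construction of the per-station matrix described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: each feature record is reduced to a single distance from the station (for example, a building centroid), circle boundaries are treated as half-open intervals, and the matrix is laid out as features by circles. The function name `circle_matrix` and the record format are hypothetical.

```python
import numpy as np

# Circle boundaries in metres: 0-200, 200-400, 400-600, 600-800.
CIRCLE_EDGES = np.array([0.0, 200.0, 400.0, 600.0, 800.0])
# Feature codes used in the text: A, R, B (building areas), G (green land area), S (bus stops).
FEATURES = ["A", "R", "B", "G", "S"]


def circle_matrix(records):
    """Build a 5 x 4 (feature x circle) matrix for one station.

    `records` is an iterable of (feature_code, distance_m, value) tuples, where
    value is an area in m^2 for A, R, B, G and a count of 1 per bus stop for S.
    Assumption: a record belongs to the circle [lower, upper) that contains its distance.
    """
    mat = np.zeros((len(FEATURES), len(CIRCLE_EDGES) - 1))
    for code, dist, value in records:
        if not 0.0 <= dist < CIRCLE_EDGES[-1]:
            continue  # outside the 800 m buffer
        col = np.searchsorted(CIRCLE_EDGES, dist, side="right") - 1
        mat[FEATURES.index(code), col] += value
    return mat


# Hypothetical records for one station.
example = [("R", 150.0, 1000.0), ("R", 250.0, 500.0), ("S", 50.0, 1), ("S", 60.0, 1), ("B", 900.0, 300.0)]
station_matrix = circle_matrix(example)
```

One such matrix would be built for the origin station and one for the destination station of each OD pair; keeping the circles as separate columns is what preserves the location information.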

Method
As shown in equation (1), the original gravity model is expressed in the multiplicative form to examine the relationship with station-to-station ridership.
where $T_{OD}$ denotes the station-to-station ridership. $X_i^O$, $X_i^D$, and $X_i^C$ represent the ith independent variable of the station O, the station D, or the traffic impedance C, respectively. $z$, $z^O$, $z^D$, and $z^C$ are coefficients that could be estimated via logarithmic transformation.
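A multiplicative gravity model consistent with the symbols defined above, together with its logarithmic transformation, can be written as follows. This is a reconstruction from the surrounding definitions rather than a verbatim copy of equation (1); in particular, the per-variable subscript $i$ on the exponents $z_i^O$, $z_i^D$, and $z_i^C$ is an assumption.

```latex
\begin{align}
T_{OD} &= z \prod_{i} \left(X_i^{O}\right)^{z_i^{O}}
            \prod_{i} \left(X_i^{D}\right)^{z_i^{D}}
            \prod_{i} \left(X_i^{C}\right)^{z_i^{C}} \\
\ln T_{OD} &= \ln z + \sum_{i} z_i^{O} \ln X_i^{O}
            + \sum_{i} z_i^{D} \ln X_i^{D}
            + \sum_{i} z_i^{C} \ln X_i^{C}
\end{align}
```

Taking logarithms turns the products into sums, so the coefficients can be estimated with a linear regression on the log-transformed variables.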
Different from the original gravity model, the block attention module-gravity (BAM-Gravity) model proposed in this study uses the block attention module to extract built environment features, where $T_{OD}$ and $I_C$ denote the ridership and the traffic impedance variables from the origin station to the destination station, respectively. $I_O$ and $I_D$ represent the built environment variables of the stations on both sides. $A_O(\cdot)$ [45].
Shapley Additive exPlanation (SHAP), a method from coalitional game theory, is, like permutation feature importance, an inspection technique that can be used for any model [46]. The SHAP values measure a feature's importance by calculating the average of its effect on the prediction under different circumstances. Specifically, $I_i$ represents the importance of the ith built environment variable on the station-to-station ridership and $n$ is the number of samples. $\theta_i^j$ denotes the effect of the ith variable of the jth sample on ridership; in its definition, $\setminus x_i^j$ refers to the variable combination that does not contain the ith variable of the jth sample, $p$ is the number of variables, and $F(\cdot)$ represents the trained BAM-Gravity model.
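The SHAP quantities described above can be illustrated with a brute-force sketch. It enumerates every variable combination that excludes variable $i$, weights the marginal effect of adding $i$ by the standard Shapley weight $|S|!\,(p-|S|-1)!/p!$, and then averages absolute effects over samples. Several points are assumptions made for illustration: absent variables are replaced by a fixed baseline vector, the per-variable importance is taken as the mean absolute effect, and the names `shapley_values` and `mean_abs_importance` are hypothetical. Exact enumeration is only feasible for a handful of variables; practical SHAP tools rely on approximations.

```python
from itertools import combinations
from math import factorial

import numpy as np


def shapley_values(predict, x, baseline):
    """Exact Shapley effects of each variable for one sample x.

    Assumption: a variable that is absent from a combination takes its baseline value.
    """
    x = np.asarray(x, dtype=float)
    baseline = np.asarray(baseline, dtype=float)
    p = len(x)
    theta = np.zeros(p)
    for i in range(p):
        others = [k for k in range(p) if k != i]
        for size in range(p):
            # Shapley weight for combinations of this size.
            weight = factorial(size) * factorial(p - size - 1) / factorial(p)
            for subset in combinations(others, size):
                idx = np.array(subset, dtype=int)
                without_i = baseline.copy()
                without_i[idx] = x[idx]
                with_i = without_i.copy()
                with_i[i] = x[i]
                theta[i] += weight * (predict(with_i) - predict(without_i))
    return theta


def mean_abs_importance(predict, samples, baseline):
    """Importance of each variable as the mean absolute Shapley effect over samples."""
    thetas = np.array([shapley_values(predict, x, baseline) for x in samples])
    return np.abs(thetas).mean(axis=0)


# Toy stand-in for a trained model: a linear function of three variables.
weights = np.array([2.0, -1.0, 0.5])
toy_model = lambda v: float(weights @ v)
effects = shapley_values(toy_model, [1.0, 2.0, 3.0], np.zeros(3))
```

For a linear model with a zero baseline, each effect reduces to the weight times the variable value, which makes the toy case easy to check by hand.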

Results and Discussion
To detect overfitting, 25% of the total samples were randomly selected as a test subset, and 5-fold cross-validation was used during the training process. Before applying the BAM-Gravity model, all variables were standardized. After hyperparameter tuning, with a learning rate of 0.005 and 1200 training iterations, the $R^2$ of the BAM-Gravity model was 0.88.
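The evaluation protocol described above (a random 25% hold-out, 5-fold cross-validation on the remainder, standardized inputs, and $R^2$ as the score) can be sketched as below. The helper names, the random seed, and the choice to fit the standardization statistics on the training portion only are assumptions; the text does not specify them.

```python
import numpy as np


def split_and_folds(n_samples, test_fraction=0.25, n_folds=5, seed=0):
    """Randomly hold out a test subset and split the rest into cross-validation folds."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)
    n_test = int(round(test_fraction * n_samples))
    test_idx, train_idx = order[:n_test], order[n_test:]
    folds = np.array_split(train_idx, n_folds)
    return train_idx, test_idx, folds


def standardize(train, other):
    """Scale both arrays with the mean and standard deviation of the training data."""
    mean = train.mean(axis=0)
    std = train.std(axis=0)
    std[std == 0] = 1.0  # leave constant columns unscaled
    return (train - mean) / std, (other - mean) / std


def r_squared(y_true, y_pred):
    """Coefficient of determination."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


# 3192 OD pairs, as in the dataset description.
train_idx, test_idx, folds = split_and_folds(3192)
```

With 3192 station pairs, this split leaves 798 pairs for testing and 2394 for training, divided into five folds of roughly 479 pairs each.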

Importance of the Independent Variables.
The contributions of the independent variables to station-to-station ridership are shown in Table 2. Overall, two interesting findings should be noted. First, the contribution of the commercial/business buildings (374.66) to the passenger flow is the largest, followed by the residential buildings (368.22) and traffic impedance (276.83). This result is consistent with most studies: most metro trips on weekdays are to and from residential and commercial/business buildings that provide primary residence, daily shopping, and job opportunities [6,14,18,20,24].
Second, the contribution of the built environment variables in the third circle to ridership (403.32) is the highest, followed by the second (353.22), fourth (298.43), and first circles (258.57). To explain this counterintuitive result, the relationship between built environment variables in each of the circles and ridership is discussed and analyzed in detail in the following subsections.

Relationship between Ridership and the Residential Buildings.
Figure 7 displays the relationship between station-to-station ridership and the residential buildings in different circles of the origin and destination sides. Except for the first circle, residential buildings in the other circles were found to significantly contribute to passenger flow. In particular, as shown in Figures 7(a
Almost all the studies claimed that residential buildings or population density promote growth in passenger flow [25,47,48]. However, several further findings are worth noting; the residential buildings in the first circle do not significantly   ). Several studies demonstrated that residential buildings or population densities within an appropriate threshold have a significant impact on passenger flow [34,41,47]. The smaller amount of residential buildings in the first circle does not fall within the effective threshold for influencing ridership. Moreover, the relationships between residential buildings in different circles and passenger flow are slightly different, exhibiting exponential, logarithmic, and linear growth in the second, third, and fourth circles, respectively. The combination of distance to the metro station and the area of land available for residential development leads to these results.

Relationship between Ridership and the Business/Commercial Buildings.
As shown in Figure 8, the business/commercial buildings have a significant impact on ridership in the first (32.51
The results of this study differed slightly from those of previous studies [25,49], and this difference is attributed to a combination of reasons. First, the development of large commercial facilities mainly for leisure and entertainment has a certain agglomeration effect and is mainly concentrated in the first circle, whereas the commercial facilities scattered in the fourth circle are mostly small retail businesses providing daily services within walking distance for residents; this not only reduces the dependence of surrounding residents on the metro but also makes the area less attractive to outside residents. Second, the first circle has higher land prices owing to its transportation advantages, and most of the large commercial or business office facilities within its boundaries are geared towards the middle- and upper-income groups. The well-equipped parking facilities and the preference of this group to travel by private car result in a less-than-expected contribution of the commercial buildings in the first circle. As a result, business/commercial buildings have the most significant negative effect on ridership in the fourth circle, while in the first circle, they do not have the expected degree of impact.

Relationship between Ridership and the Public Green Land Area/Public Administration Buildings.
As shown in Figures 9(a)-9(d), there are significant negative correlations between public green space and passenger flow in the second (48.39 for O, 66.48 for D) and third (27.53 for O, 30.51 for D) circles, regardless of whether it is at the origin or destination stations.
Although Du et al. [47] and similar studies [25,50] concluded that public green spaces would attract more passengers and that the impact has no significant difference between weekdays and weekends, the results of this study are significantly different. Despite the fact that Xi'an is a well-known tourist destination with many places of interest, the corresponding AFC data analyzed for this study show that during low tourist seasons such as November, when      exhibition halls are hardly attractive to residents with busy daily work schedules.

Relationship between Ridership and the Traffic Impedance/Bus Stops.
The number of bus stops in the first circle at the origin station has a different impact on ridership from that at the destination station. Figure 10(a) shows that when the number of bus stops in the first circle at the origin station is 3, the maximum number of passengers is reached, exceeding the average by approximately 70 persons. Figure 10(b) shows that the more bus stops there are in the first circle on the destination side, the higher the passenger flow. The results are attributed to three reasons. First, the existing metro system does not yet fully meet Xi'an's travel demand because a significant proportion of trips are made using the metro-bus connection or the bus-metro connection. Second, travelers prefer the metro-bus connection to the bus-metro connection. Eighty random interviews on the preference for metro-bus or bus-metro connections were conducted to explore the underlying reasons for this result. The majority of responses cited the better punctuality and speed of the metro compared with the bus and the guaranteed subsequent travel arrangements after taking the metro first. Finally, some people would consider abandoning the metro if there was a direct bus to their destination, or if they do not intend to save time. Therefore, an optimal number of bus stops on the origin side can promote an increase in ridership; however, overly developed bus routes can lead to a decrease in ridership.
Other studies claimed that the number of bus stops on both sides positively affects passenger flow [11,26,41]. However, Figures 10(c) and 10(d) show that the impact of the number of bus stops in the second circle on ridership is negative, regardless of whether they are on the origin or destination side. Bus stops in the second circle compete with the metro, with a greater number of stops leading to a higher chance of abandoning the metro for the bus.
The number of transfers (228.42, Rank 1) is the most important feature that affects OD flows. This corresponds to the findings of other studies [34,36]; people are more likely to abandon the metro because of transfers (Figure 10(e)). Figure 10(f) shows that the relationship between the intervals and passenger flow is a parabolic curve. In particular, when the interval between two stations is in the range of 0-2 or 10-18, the OD passenger flow is lower than average. Furthermore, when it is in the range of 3-9, ridership is above average. This is consistent with the results of Gan et al. [34], as both studies conclude that most metro trips cover medium distances and that very short travel distances cannot reflect the advantages of metro reliability, which leads travelers to choose alternative travel modes.

Conclusion
To investigate the influence of built environment factors in different circles of the origin and destination stations on passenger flow, five types of built environment variables (administration buildings, residential buildings, commercial/business buildings, land area of the parks/squares/scenic spots, and the number of bus stops) in four circles (0-200, 200-400, 400-600, and 600-800 m) of the origin and destination stations and the travel impedance variables (transfer times and intervals) were used for modelling the metro station-to-station ridership. The BAM-Gravity model was employed to detect the relationship with ridership, and SHAP was used to explain the modelling results. The results of this study are expected to provide planners with actionable guidance. First, the transfer is the most critical factor affecting the metro station-to-station passenger flow. The result indicates that the coupling between urban spatial structure planning and rail transit network planning should be strengthened. In other words, it suggests that policymakers can improve connective efficiency by directly connecting two important areas or gradually reducing the functional connection between the two regions in the process of urban renewal. Moreover, the result shows that people are more inclined to ride the metro for medium distances. Based on this result, planners should pay more attention to the spatial connection between the two areas at this scale in optimizing the urban structure.
Second, the bus stops within the first circle at the origin and destination exhibit a parabolic relationship and a positive correlation with the ridership, respectively. They exhibit negative correlations in the second circle, regardless of the origin or destination station. The results show that the existing metro network in Xi'an is not adequate to cover the city's travel demands. The results also highlight the role of integrated development of bus and metro systems in regulating the metro passenger flow. When the utility of metro travel needs to be increased, the connection between the metro and the bus systems should be strengthened in the first circle on the destination side and the number of bus stops (at most 3) on the origin side should be appropriately increased. When the metro passenger flow pressure needs to be relieved, the number of bus stops within the second circle should be increased. Furthermore, transportation planners and operators can adjust metro ridership by optimizing the bus schedules of the first and second circles on both sides.
Finally, and most importantly, the results of this study reveal the differences and similarities in the impact of different land use factors within different circles on passenger flow. The findings provide some reference for the optimization of land use around the metro station, whether it is around the origin or destination station. The residential buildings contribute to noticeable improvements in passenger flow, and the level of improvement ranks in the following order: third, second, and fourth circles. This suggests that planners should focus more on the circle-level heterogeneity of the impact of residential buildings on ridership and rationally develop residential land in each circle. Similarly, planners can adjust the passenger flow by planning commercial or business office buildings in different circles. Although the land use of parks and squares has an inhibitory effect on the passenger flow in the second and third circles, this does not mean that the development of land use such as parks should be prohibited. Planners should make the most of fragmented spaces by arranging public open spaces along major pedestrian routes and enhancing the landscaping of pedestrian spaces.
Although this study enriches the existing research on TOD, some aspects should be further explored. First, the results show that the built environment has essentially the same impact on average daily ridership at the origin and destination stations, but there is no discussion regarding the impact on passenger flows at different periods of the day. Second, this study does not investigate the competition between other travel modes and the metro in terms of traffic impedance. Third, based on the density of the road network in Xi'an, 200 m was used to delineate the station catchment area circles; however, a smaller buffer band would have led to more detailed results. These limitations of this study can be addressed in the future.
) and 7(b), the residential buildings in the second circle of the origin and destination stations show a similar trend of exponential contribution to passenger flow. When the residential buildings are below 50,0000 m², the ridership is below the average and shows flat values in the 0-50 range. With the increase in residential floor area, ridership exceeds the average by a maximum of 150 persons. The residential buildings in the third circle of the origin (Figure 7(c)) and destination (Figure 7(d)) stations promote ridership with logarithmic trends. A gradual, slow increase in ridership is observed when residential buildings are greater than 125,0000 m². Figures 7(e) and 7(f) depict the increase in ridership with increasing number of residential buildings in the fourth circle on both sides. When the residential buildings are less than 100,0000 m², the ridership is below average.

Figure 6: The structures of the fully connected layers.

For reasons of space, the top 24 variables (marked with *) are considered to significantly affect ridership.
temperatures are cooler, fewer people exercise in parks or visit scenic areas. Therefore, the larger public open space compresses the size of other facilities, leading to a reduction in ridership. The public administration buildings significantly affect ridership in the first circle only (33.64 for O, 31.77 for D). The effective thresholds for public administration buildings on passenger flows at the origin (Figure 9(e)) and destination

Figure 7: The relationship between ridership and the residential buildings.

Figure 8: The relationship between ridership and the business/commercial buildings.

Figure 9: The relationship between ridership and the public green land area/public administration buildings.

Figure 10: The relationship between ridership and the bus stop/travel impedance.

Table 1: Description of the independent and dependent variables.
for O, 31.08 for D), third (60.48 for O, 35.62 for D), and fourth circles (96.98 for O, 85.05 for D). In particular, the business/commercial buildings in the first and third circles are positively correlated with ridership, whereas in the fourth circle, ridership is inhibited, regardless of the origin or destination station.

Table 2: The importance of variables (mean of SHAP values).