Passenger Flow Scale Prediction of Urban Rail Transit Stations Based on Multilayer Perceptron (MLP)

Accurately predicting passenger flow at rail stations is an effective way to reduce operation and maintenance costs and improve the quality of passenger travel while meeting future travel demand. Improved data acquisition capability allows fine-grained, large-scale built environment data to be extracted. Therefore, this paper investigates the relationship between the built environment around a station and the station's passenger flow and discusses whether built environment data can be applied to station passenger flow prediction. First, an evaluation system of the factors influencing station passenger flow is built from multisource data. The relationship between built environment factors and station passenger flow is then investigated using Pearson correlation analysis. Based on this, a multilayer perceptron (MLP)-based passenger flow prediction model is developed to predict the passenger flow at key stations. The results show that built environment factors affect station passenger flow and that the MLP prediction model has better prediction accuracy and applicability. The results can be applied to predict the passenger flow scale of rail stations without historical passenger flow data and are thus also applicable to new rail stations.


Introduction
With rapid economic development, urban transportation demand is growing and motor vehicle ownership is increasing. Because urban construction area is limited, the resulting imbalance between transportation supply and demand causes traffic congestion and safety problems in large cities, which reduces the quality of travel for residents. Large cities have therefore chosen to vigorously develop rail transportation to solve these problems.
Urban rail transit has high operation and maintenance costs and cannot be expanded indefinitely. Usually, passenger flow prediction for existing stations relies on their historical passenger flow data. Once the land use or population distribution around a station changes, the station's historical passenger flow data no longer plays a decisive role in predicting the future passenger flow scale, so it is necessary to establish a direct relationship between built environment data and passenger flow. As big data technology matures, it becomes possible to acquire refined, large-scale data on building attributes, population characteristics, and other features, which provides new ideas for station passenger flow prediction. Exploring the relationship between the built environment around a station and its passenger flow, and predicting the station's passenger flow from built environment data, is a way to accurately grasp the scale of the station's passenger flow and reduce operating costs as far as possible.
In recent years, researchers have conducted in-depth research on urban rail transit passenger flow prediction, mainly focusing on the factors influencing rail transit station passenger flow and on the objectives and methods of station passenger flow prediction.
At present, research on the factors influencing rail passenger flow mainly focuses on three aspects: population characteristics, transportation facilities, and location factors. The current status of research on passenger flow influencing factors is shown in Table 1.
In analyzing the factors affecting rail passenger flow, researchers have paid most attention to location factors, mainly land use, and have established geographically weighted regression models using GIS technology, which is the current hotspot in this area [6]. However, current studies rarely consider all three types of factors (population characteristics, transportation facilities, and location factors) at the same time and usually consider only one or two of them. They therefore cannot comprehensively reflect the reasons why passengers choose rail transportation, which leads to deviations in the prediction of station passenger flow.
Metro station passenger flow prediction determines the construction scale and internal structure of the station, so its accuracy affects the operation of the station and the travel experience of passengers. At present, rail transit station passenger flow prediction methods mainly include the four-stage method and the direct prediction method [7].
The four-stage method is an aggregate prediction method consisting of four components: traffic generation, traffic distribution, traffic mode split, and traffic assignment. It was first proposed in the Chicago Regional Transportation Study and is one of the most commonly used methods for passenger flow prediction at present [8]. However, this method counts and analyzes the travel behavior of travelers by traffic zone, and when applied to the more microscopic scenario of a rail station, it cannot accurately capture the relationship between the built environment around the station and the passenger flow, so the prediction accuracy cannot be guaranteed [9]. In addition, the method itself is not highly accurate, and even very small errors are passed on to the next step, which may cause greater deviations in the prediction results [10].
The direct demand model (DDM) takes rail stations as its object, considers the economy, the built environment around the station, and other elements, and explores the quantitative relationship between these elements and passenger flow to predict the future traffic scale. Compared with the four-stage method, DDM is simpler to operate and more accurate in prediction [11] and can be implemented with linear regression models [12], geographically weighted regression (GWR) [13], artificial neural networks (ANN) [14], and k-nearest neighbor (KNN) [15]. The linear regression model characterizes the correlation between variables through a linear functional relationship [16]. Currie et al. [17] conducted a linear regression analysis of the relationship between rail passenger flow and occupational density, car ownership, station level of service, and fares to establish a predictive model for rail passenger flow; the results showed that station level of service is the main driver of passenger travel. GWR is a relatively microscopic spatial analysis method that can elucidate local travel characteristics within the study area [18]. Cardozo et al. [19] used a geographically weighted regression model to predict inbound passenger flow at rail stations in Madrid and compared the results with those of ordinary least squares (OLS). They found that GWR has higher prediction accuracy than OLS, indicating that spatial analysis techniques can be applied as a direct prediction model for passenger flow at rail stations. The k-nearest neighbor method classifies each record in a dataset according to its nearest neighbors. Building on KNN, Bai et al. took the trend factor and time interval factor of passenger flow into consideration to reduce the risk arising from the original method's limited evaluation criteria in the matching process [20]. The multilayer perceptron (MLP), a commonly used artificial neural network with features such as adaptive and real-time learning, can be applied to traffic prediction [21][22][23]. Lin et al. [24] considered the spatial correlation between passenger flows from the perspective of single stations and of the whole network, respectively. Metro and bus passenger flow were then predicted using MLP.
As the metro network expands, the attributes of metro stations and the nature of the surrounding land also change, so passenger flow and land use interact. Existing studies rarely explore in depth the degree of influence of different variables on station passenger flow, which may affect prediction accuracy as land use changes. For new urban rail transit stations, it is difficult to adjust the station scale and internal layout once they are built. If the passenger flow of a new station is overestimated, construction costs are wasted and operation and maintenance costs are high; if it is underestimated, the station cannot meet the travel demand of surrounding passengers and becomes congested, resulting in long queuing times, lower passenger satisfaction, and a poor service level. In addition, few studies make comprehensive forecasts of passenger flow across different time dimensions. Therefore, for rail operators to accurately predict the passenger flow scale of newly opened stations without historical data, a direct passenger flow prediction model based on built environment data is needed. The main work of this paper is as follows:
(i) To obtain data on land use, residential and working population, and the number of other transportation connections around the station, build an evaluation system of the factors affecting station passenger flow, and analyze the impact of each factor on the passenger flow of rail transit stations.
(ii) To build a multilayer perceptron-based passenger flow prediction model for rail transit stations, to predict the average daily passenger flow and peak hourly passenger flow of stations under two scenarios (weekdays and nonweekdays), and to compare
The rest of this paper is organized as follows: Section 2 presents the data sources of this study and a correlation analysis of the factors influencing urban rail traffic. Section 3 develops a station prediction model based on a multilayer perceptron. Section 4 discusses the feasibility of the prediction model. Section 5 summarizes the contributions and limitations of this paper, as well as the outlook for future work.

Influence Factor Selection.
Drawing on current domestic and international research, four criterion layers (land development intensity, station connectivity characteristics, population characteristics around the station, and other transportation connections) are selected, and the influence of each factor on station passenger flow is explored through seven calculated indicators.

Land Development Intensity.
Land development intensity is the ratio of the total area of building land to the area of a given region. The higher the land development intensity of a region, the more strongly the region attracts population, and therefore the greater the passenger flow of the rail stations in that region.

Station Connectivity Characteristics.
This refers to the degree of access a station has to the entire rail network. Commonly used indicators include betweenness centrality, closeness centrality, and the number of connected stations. The number of connected stations is the number of other stations connected to a station. It is generally believed that interchange stations are more accessible than noninterchange stations because passengers can change lines and directions there. Therefore, their passenger flow is generally higher than that of noninterchange stations.

Population Characteristics around the Station.
Population density and residential or working population density are mainly considered. The residential and working populations are groups with stable commuting needs who travel between two places on working days, which generates a certain amount of rail traffic demand.

Other Transportation Connections.
The density of shared bicycle connections and the density of bus stop connections around the station are mainly considered. It is generally believed that the more shared bikes or buses stop within an acceptable walking range of the station, the more convenient the station is for passengers; otherwise, passengers are likely to get off at other metro stations with more convenient connections.

Model Construction Ideas.
The prediction targets of the model are the average daily passenger flow and the peak hourly passenger flow of the station; the data content is shown in Table 2. The independent variables, which serve as the model's input parameters, are the built environment factors affecting passenger flow; the data content is shown in Table 3. Two hidden layers are used to build the MLP-based prediction model of urban rail transit station passenger flow, and the model structure is shown in Figure 1. The built environment data and passenger flow of all stations are used to train the model and predict station passenger flow.

Model Principle and Parameter Setting.
The model has four layers: the input layer, hidden layer 1, hidden layer 2, and the output layer. There are 7 neurons in the input layer and 1 neuron in the output layer. The number of neurons in the hidden layer is calculated according to the empirical formula in (1) and is set to 16.
where $p$ is the number of neurons in the hidden layer and $n$ is the number of neurons in the input layer. The formula gives $p = 15$; since $p$ is generally taken as an integer power of 2 according to experience, the value is set to 16. The 7 neurons in the input layer correspond to the 7 independent variables, and the input vector of the $m$th rail station is $X_m = (x_{m1}, x_{m2}, x_{m3}, x_{m4}, x_{m5}, x_{m6}, x_{m7})$, with 288 sets of input vectors. The weight of the connection between the $i$th node in layer $k-1$ and the $j$th neuron in layer $k$ is $w_{ij}^{(k)}$, and the threshold of the $i$th neuron in layer $k$ is $b_i^{(k)}$. The model initializes the weights and thresholds from a normal distribution, with the weight values in the range (0, 1). $(a_1^{(k-1)}, a_2^{(k-1)}, a_3^{(k-1)}, \ldots)$ is the output of layer $k-1$ (for layers other than the input layer). The neurons of each layer compute a weighted sum of the data input from the previous layer, as shown in equation (2); the result is then passed through the activation function $f$. The activation function is generally a nonlinear function, which adds a nonlinear transfer to the model. In this paper, the Sigmoid function is chosen as the activation function, as shown in equation (3), and $a_j^{(k)}$ is the output of the $j$th neuron in the $k$th layer.
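Equations (1)-(3) are not reproduced in this text. The following is one plausible reading under stated assumptions, not the authors' confirmed formulas: the empirical rule is taken to be $p = 2n + 1$ because it yields the stated value of 15 for $n = 7$, and the threshold is assumed to be added to the weighted sum. Only the Sigmoid definition itself is standard.

```latex
% (1) Assumed empirical rule; n = 7 gives p = 15, rounded up to 2^4 = 16.
p = 2n + 1

% (2) Assumed weighted sum feeding neuron j of layer k.
z_j^{(k)} = \sum_i w_{ij}^{(k)} \, a_i^{(k-1)} + b_j^{(k)}

% (3) Sigmoid activation applied to that sum.
a_j^{(k)} = f\bigl(z_j^{(k)}\bigr), \qquad f(z) = \frac{1}{1 + e^{-z}}
```

Because the Sigmoid output lies in (0, 1), a network with a Sigmoid output neuron would need the passenger flow targets scaled into that interval before training.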
The MLP model is usually trained with the error backpropagation (BP) algorithm, which adjusts the weights and thresholds of each neuron in the model. The principle of the BP algorithm is to calculate the error of the output layer according to equation (4), propagate the error backward to all the neurons in each layer, and iteratively update the thresholds and weights through (5) and (6). These operations are repeated until any of the following termination conditions is satisfied, at which point the result of the output layer is the final output:
(1) The training time exceeds 20 minutes
(2) The number of iterations exceeds 600
(3) The error $E$ of the training set is within 10%
where $n$ is the number of samples, $y_i$ is the true value, and $y_i'$ is the output of the MLP model. $b_i^{(k)}$ and $w_i^{(k)}$ denote the threshold and weight after correction; $\eta$ is the learning rate, taking values in (0, 1), and in this paper $\eta = 0.0001$.
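The training procedure and its three termination conditions can be sketched as follows. This is a minimal illustration, not the authors' implementation. Several points are assumptions: the error $E$ is taken to be the mean relative error on the training set, the loss is the squared error, the initial values are normal draws with an assumed mean and spread clipped into (0, 1), and targets are assumed to be scaled into (0, 1). All function names are hypothetical.

```python
import math
import random
import time

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def init_layers(sizes, rng):
    """One (weights, thresholds) pair per layer; weights[j][i] links input i to neuron j."""
    def draw():
        # Assumption: normal draw clipped into (0, 1); the text gives no mean or variance.
        return min(max(rng.gauss(0.5, 0.15), 1e-6), 1.0 - 1e-6)
    return [([[draw() for _ in range(n_in)] for _ in range(n_out)],
             [draw() for _ in range(n_out)])
            for n_in, n_out in zip(sizes, sizes[1:])]

def forward(x, layers):
    """Return the activations of every layer, starting with the input itself."""
    acts = [list(x)]
    for weights, thresholds in layers:
        prev = acts[-1]
        acts.append([sigmoid(sum(w * a for w, a in zip(row, prev)) + b)
                     for row, b in zip(weights, thresholds)])
    return acts

def backprop_step(x, y, layers, eta):
    """One gradient step on the squared error 0.5 * (output - y)**2."""
    acts = forward(x, layers)
    delta = [(out - y) * out * (1.0 - out) for out in acts[-1]]
    for k in range(len(layers) - 1, -1, -1):
        weights, thresholds = layers[k]
        prev = acts[k]
        # Error signal for the layer below, computed before this layer's
        # weights change (it is simply unused when k == 0).
        below = [sum(weights[j][i] * delta[j] for j in range(len(weights)))
                 * prev[i] * (1.0 - prev[i]) for i in range(len(prev))]
        for j, row in enumerate(weights):
            for i in range(len(row)):
                row[i] -= eta * delta[j] * prev[i]
            thresholds[j] -= eta * delta[j]
        delta = below

def mean_relative_error(samples, layers):
    # Assumed definition of the training error E; targets must be nonzero.
    return sum(abs(forward(x, layers)[-1][0] - y) / abs(y)
               for x, y in samples) / len(samples)

def train(samples, layers, eta=0.0001, max_iters=600, max_seconds=20 * 60, tol=0.10):
    """Repeat BP passes until one of the three stopping conditions is met."""
    start = time.monotonic()
    error = mean_relative_error(samples, layers)
    for iteration in range(1, max_iters + 1):
        for x, y in samples:
            backprop_step(x, y, layers, eta)
        error = mean_relative_error(samples, layers)
        if error <= tol:
            return iteration, error, "error within tolerance"
        if time.monotonic() - start > max_seconds:
            return iteration, error, "time limit"
    return max_iters, error, "iteration limit"
```

For the structure described here, the layers would be created with `init_layers([7, 16, 16, 1], random.Random(0))`, and `train` would be called on pairs of a scaled 7-element input vector and a scaled target.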

Comparison Model Parameter Settings.
The MLP model is compared with three benchmark models: RBF, KNN, and multiple linear regression. Their parameter settings are as follows.
The RBF model is constructed in a similar way to the MLP model, but it has only one hidden layer with 16 neurons and uses a radial basis function (RBF) as the transfer function; the input and output layers are set in the same way as in the MLP model.
The hyperparameter n_neighbors of the KNN model is tuned by iterating over the values 1 to 10 and keeping the best one, and the Euclidean distance is used to calculate the distance between samples. Each neighbor is weighted by the reciprocal of its distance when predicting the target.
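The KNN settings above can be sketched in a few lines. This is an illustrative sketch only: the text does not say how the optimal n_neighbors is scored, so a leave-one-out mean absolute error is assumed here, and the function names are hypothetical.

```python
import math

def knn_predict(train, query, k):
    """Inverse-distance-weighted mean of the k nearest targets (Euclidean distance).

    train is a list of (feature_tuple, target) pairs.
    """
    nearest = sorted((math.dist(x, query), y) for x, y in train)[:k]
    for d, y in nearest:
        if d == 0.0:
            return y  # exact match: avoid dividing by zero
    weights = [1.0 / d for d, _ in nearest]
    return sum(w * y for w, (_, y) in zip(weights, nearest)) / sum(weights)

def best_n_neighbors(train, candidates=range(1, 11)):
    """Pick k by leave-one-out mean absolute error (assumed selection criterion)."""
    def loo_mae(k):
        errors = []
        for i, (x, y) in enumerate(train):
            rest = train[:i] + train[i + 1:]
            errors.append(abs(knn_predict(rest, x, k) - y))
        return sum(errors) / len(errors)
    return min((k for k in candidates if k < len(train)), key=loo_mae)
```

Features should be brought to a common scale before distances are computed, since indicators such as population density and the number of connected stations differ by orders of magnitude.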
The multiple linear regression model uses the principle of least squares to find the regression equation, and the regression equations for the different scenarios are shown in equations (7)-(10).
For the working day scenario, when the forecast target is the average daily traffic,
$y_1 = -29.758x_1 + 8677.585x_2 - 0.591x_3 + 0.773x_4 + 0.829x_5 + 17.287x_6 + 100.898x_7 - 9248.278.$ (7)
For the working day scenario, when the forecast target is peak hour traffic,
For nonworking day scenarios, when the forecast target is average daily traffic,
For nonworking day scenarios, when the forecast target is peak hour traffic,
where $x_1$ is the building density around the station, $x_2$ is the number of connected stations, $x_3$ is the total population density around the station, $x_4$ is the residential population density around the station, $x_5$ is the working population density around the station, $x_6$ is the shared bicycle connection density around the station, and $x_7$ is the bus stop connection density around the station.
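As a small illustration of how equation (7) is applied, the sketch below evaluates it for one station. The coefficients and intercept are those of equation (7); the function name is hypothetical, and any input values passed to it are the caller's own.

```python
# Coefficients of x1..x7 and the intercept, copied from equation (7).
COEFFICIENTS = (-29.758, 8677.585, -0.591, 0.773, 0.829, 17.287, 100.898)
INTERCEPT = -9248.278

def predict_workday_daily_flow(x):
    """Evaluate equation (7) for one station's indicator vector (x1, ..., x7)."""
    if len(x) != len(COEFFICIENTS):
        raise ValueError("expected 7 indicator values")
    return sum(c * v for c, v in zip(COEFFICIENTS, x)) + INTERCEPT
```

The large positive coefficient on $x_2$ means that, with the other indicators held fixed, each additional connected station raises the predicted working day average daily flow by about 8,678 passengers.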

Beijing Metro Station Passenger Flow Forecast
The metro network in 2017 was selected as the study case, and 90% of the station samples were randomly selected as the training set and 10% as the prediction set. The prediction targets for the stations in the prediction set are derived from their independent variables. Each model is trained three times, and the run with the best training accuracy is selected as the prediction model. For the prediction results of individual stations, a few typical representative stations in the prediction set are selected for a detailed description. This study counted the daily passenger flow of the whole metro network from April 16 to April 22, 2017. As can be seen in Figure 5, the network-wide passenger traffic from Monday to Thursday did not vary much, remaining around 5.48 million, and Friday's traffic increased compared to the previous four days, to around 5.74 million. The network-wide passenger flow on weekends decreased significantly compared to weekdays, by 3.7 million and 3.35 million, respectively. It can be seen that the passenger flow at rail stations on working days has different characteristics from that on nonworking days, indicating that commuter travel is the main service target of Beijing's rail transit system.
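The sampling and model selection steps above can be sketched as follows. This is an illustration under assumptions: the seed, the rounding down of the 10% share, and the helper names are hypothetical. With 288 stations, rounding down gives a prediction set of 28 stations.

```python
import random

def split_stations(station_ids, test_fraction=0.10, seed=0):
    """Randomly hold out a share of stations as the prediction set."""
    rng = random.Random(seed)
    n_test = int(len(station_ids) * test_fraction)  # 288 stations -> 28
    test = set(rng.sample(station_ids, n_test))
    train = [s for s in station_ids if s not in test]
    return train, sorted(test)

def best_of(train_once, runs=3):
    """Call train_once(seed) -> (model, training_error) several times; keep the lowest error."""
    return min((train_once(seed) for seed in range(runs)), key=lambda result: result[1])
```

Here `train_once` stands for any routine that fits one of the four models from a given random seed and reports its training error.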
Passenger flow also has different characteristics from the station's perspective. For example, the size of the residential population around a residential station affects the scale of passenger flow [26], and the intensity of land development around the station also makes a difference to the station's attractiveness to passengers [27]. The comparison of the average daily passenger flow between interchange and noninterchange stations is shown in Figure 6, where the average daily passenger flow of interchange stations is generally higher than that of noninterchange stations.

Population Data.
Baidu Wise-Eye Population Data is a commercial geographic intelligence data platform launched by Baidu Maps based on its massive location, geographic, and road condition big data sources, which can be mined to obtain a highly accurate, wide-coverage distribution of the residential and employed population. The processed population distribution of Beijing in 2017 is shown in Figures 7-9. On the whole, the resident population of Beijing is concentrated in the central urban areas and is more sparsely distributed in the suburban areas. Chaoyang District and Haidian District have the largest resident populations, 2581997 and 2336486, respectively, while Yanqing County and Mentougou District have the smallest populations, neither exceeding 220,000; Xicheng District has the largest population density, 17862.60774 persons/km².

Calculation Results of Impact Factor Indicators.
To facilitate statistics and calculations, the station's influence range is represented as a circle with a uniform radius centered on the rail station's location. Because passengers generally reach their destination on foot after arriving at a rail station, or walk to find shared bikes and bus stops, only the acceptable walking range is considered in selecting the radius, and the value is generally taken as 800 m based on survey results [28,29]. The overlapping parts of the circular buffers are divided using Thiessen polygons, so that the final influence ranges do not overlap. The 800-meter influence range of Beijing rail stations is shown in Figure 10. The building density around the stations is shown in Figure 11. The calculation results show that building density in the central city is generally higher than in the suburbs; the stations with the highest building density are Jinsong Station, Unity Lake Station, Wangjing Station, and Futong Station, where the land is mainly used for commercial service facilities.
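The buffer-plus-Thiessen rule has a simple point-wise equivalent: a location belongs to a station's influence range if that station is its nearest station and lies within 800 m. The sketch below counts features (for example, POIs or bus stops) per station on that basis. It assumes planar coordinates in meters, and the function names are hypothetical.

```python
import math

def assign_to_station(point, stations, radius_m=800.0):
    """Return the nearest station's name if it is within radius_m, else None.

    stations maps a station name to its (x, y) position in meters.
    """
    name, position = min(stations.items(), key=lambda item: math.dist(point, item[1]))
    return name if math.dist(point, position) <= radius_m else None

def count_by_station(points, stations, radius_m=800.0):
    """Count the points falling in each station's non-overlapping influence range."""
    counts = {name: 0 for name in stations}
    for point in points:
        name = assign_to_station(point, stations, radius_m)
        if name is not None:
            counts[name] += 1
    return counts
```

Dividing each count by the area of the clipped influence range would give a density indicator; computing that area requires a polygon library and is left out of this sketch.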

Correlation Analysis.
Te results of the Pearson correlation analysis are shown in Figure 16.
The correlation analysis shows that the correlation coefficients between each indicator and passenger flow are all above 0.3, so every indicator is correlated with passenger flow, and the correlation between the indicators and passenger flow is generally higher on weekdays than on nonworking days. The correlation coefficients of each indicator with the daily average passenger flow and with the peak hour passenger flow are relatively close.
In both scenarios, the indicator with the strongest correlation is the density of shared bicycle connections around the station, indicating that more travelers use shared bicycle connections. The higher the station traffic, the more people enter and exit the station and the more shared bicycles are needed near the station; conversely, the number of shared bicycles affects passengers' choice of rail station. In the working day scenario, the lowest correlations are those of residential population density and bus stop connection density around the station. In the nonworking day scenario, the lowest correlation is that of working population density.
Overall, the high correlation among the three calculated indicators of population characteristics is due to the way the data are calculated, which introduces a certain linear relationship among them. The number of connected stations has the lowest correlation with every other indicator, and there is no strong correlation among the remaining indicators. Therefore, these nine indicators are consistent with the multiplicity of factors.
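The screening step described in this section can be sketched with a plain implementation of the Pearson coefficient. The 0.3 threshold is the one quoted above; the function names and any data passed in are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def correlated_indicators(indicators, flow, threshold=0.3):
    """Names of indicators whose |r| with passenger flow exceeds the threshold."""
    return [name for name, values in indicators.items()
            if abs(pearson(values, flow)) > threshold]
```

The same function applied to pairs of indicators, rather than to an indicator and the passenger flow, gives the indicator-to-indicator correlations discussed in the last paragraph.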
The training accuracy of the four models is shown in Tables 5 and 6; the results of the 10% prediction group (28 groups of data in total) were used to analyze the model training accuracy.
Comparing the training accuracy of the four models, the MLP model has a better training effect under the four conditions. The MLP model has the best performance in predicting average daily ridership, and the multiple linear regression model has the best performance in predicting peak hour ridership. Although the advantage of the MLP model in MAE, RMSE, and MAPE is not obvious, its R² is better than that of the other models, indicating that the MLP model has a good fitting effect.
Comparing the training accuracy for the two prediction targets, average daily passenger flow and peak hour passenger flow, the error of the four models in predicting peak hour passenger flow is slightly higher than that for average daily passenger flow, but the difference is not obvious. Therefore, the four models are applicable to both prediction targets.
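The four accuracy measures named above are not defined in this text. The sketch below uses their conventional definitions, which is an assumption about what the authors computed; the function names are hypothetical.

```python
import math

def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent; true values must be nonzero."""
    return 100.0 * sum(abs(t - p) / abs(t) for t, p in zip(y_true, y_pred)) / len(y_true)

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - residual sum of squares / total sum of squares."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot
```

MAE and RMSE are in passengers, so they grow with station size, while MAPE and R² are scale-free; this is one reason a model can look similar to its competitors on the first two and still differ on R².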
Comparing the training accuracy in the working day and nonworking day scenarios, the training accuracy of the RBF model is slightly higher in the nonworking day scenario than in the working day scenario, the multiple linear regression model performs better in the working day scenario, and the MLP and KNN models show no significant difference in accuracy between the two scenarios. This indicates that multiple linear regression is more suitable for predicting working day scenarios, where the data are more regular, while the other models are suitable for both scenarios. The absolute error (AE) and relative error (RE) between the predicted and real values are used to evaluate the accuracy of the model, and the calculation formulae are shown in (15) and (16).

Typical Site Prediction
where $n$ is the number of samples, $y_i$ is the true value, and $y_i'$ is the output of the MLP model. The comparison of the prediction accuracy of the four models for rail stations is shown in Tables 7 and 8. Overall, the MLP model has a good training effect in all four cases. The MLP model performs better overall in the prediction of each type of station; the prediction accuracy of the RBF and KNN models is highly volatile and slightly less stable; and the multiple linear regression model has significantly higher prediction accuracy for weekdays than for nonworking days.
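Equations (15) and (16) are not reproduced in this text. Assuming the conventional per-station definitions, which may differ in detail from the authors' formulas, the absolute and relative errors would read:

```latex
% Assumed form of (15): absolute error for station i.
AE_i = \left| y_i - y_i' \right|

% Assumed form of (16): relative error for station i, in percent.
RE_i = \frac{\left| y_i - y_i' \right|}{y_i} \times 100\%
```

Under these definitions, the percentage errors quoted for individual stations in the following paragraphs correspond to $RE_i$.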
Comparing the prediction accuracy of passenger flow under the two scenarios, weekday and nonworkday, the prediction accuracy of the four models is generally higher in the weekday scenario than in the nonworkday scenario. This result is related to commuting behavior on weekdays, when passengers have more definite travel demand and travel to and from their residence and workplace more regularly, while travel on nonworkdays is more random, so the passenger flow predicted from built environment data may deviate somewhat.
Comparing the accuracy for the two prediction targets, daily average passenger flow and peak hour passenger flow, all four models have slightly higher errors in predicting peak hour passenger flow than daily average passenger flow. This is because individual stations do not necessarily show peaks in the time variation of their passenger flow, so the selected peak hours may not be representative.
The comparison of prediction results for Wangjing Station is shown in Figure 17. Among the prediction results, the MLP model performs best in terms of stability and prediction accuracy, with errors below 30%; the prediction results of the multiple linear regression model and the KNN model are less accurate but more stable; the prediction accuracy and stability of the RBF model are poor, with prediction errors ranging from 16.77% to 70.34%.
The comparison of prediction results for Xuanwumen Station is shown in Figure 18. In the prediction results, the MLP model and the RBF model perform better, with prediction errors below 40%, while the remaining two comparison models are less stable, with errors greater than 50%. The four models have their largest errors in predicting nonworking day peak hour passenger flow.
The prediction results for Wangfujing Station are shown in Figure 19. Among the prediction results, the MLP model has the highest accuracy, with all errors below 20%. However, no model has a significant advantage in predicting
The prediction results for Sanyuanqiao Station are shown in Figure 20. In the prediction results, the prediction accuracy of the MLP model for the daily average passenger flow is better than that of the comparison models; for the peak hourly passenger flow, the prediction results of the MLP model and the comparison models differ, and the prediction error of the MLP model is at a medium level.
The prediction results for Sun Palace Station are shown in Figure 21. In the prediction results, the prediction accuracy of the MLP model is the best in the weekday scenario and slightly inferior to that of the comparison models in the nonweekday scenario, but the overall prediction errors are all below 15%. The prediction accuracy for this station is also the highest among all stations. The average daily passenger flow of Sun Palace Station is about 40,000-50,000, and its passenger flow is "bimodal," a pattern that is common in Beijing's rail transit system, so there are more samples of this type during training and the prediction accuracy is higher.

Discussion
In the decision-making stage before a line or station is put into construction, the forecast of the average daily passenger flow can be used as a reference for the scale of station construction and line use models, and the impact of new lines or stations on existing lines or stations can subsequently be analyzed on this basis. The peak hourly passenger flow reflects the number of passengers a station needs to accommodate in a short period, and its prediction provides a theoretical basis for designing station facility parameters such as escalator width, so as to avoid crowding when the actual passenger flow exceeds the designed passenger flow and wasted resources when the actual passenger flow is much smaller than the designed passenger flow, to achieve the goal of reducing rail
(1) A total of seven indicators in four aspects, namely, land development intensity, station connectivity characteristics, population characteristics around the station, and other transportation connections, are examined for their degree of influence on rail transit station passenger flow. The results show that the correlation between each indicator and passenger flow is generally higher on weekdays than on nonworking days, and the indicator with the highest correlation is the density of shared bicycle connections around the station, indicating that more travelers use shared bicycle connections. The higher the passenger flow of the station, the more people enter and exit the station and the more shared bikes are needed near the station; conversely, the number of shared bikes affects passengers' choice of rail station.
predictions shows that there are differences in the prediction accuracy of different stations. The Wangjing and Sun Palace stations, which have the highest prediction accuracy, both have an average daily passenger flow of 40,000 to 50,000 people, with obvious "bimodal" passenger flow characteristics, which are common in Beijing's rail transit system, so the training set has more samples of this type and the prediction accuracy is higher.

Conclusion
However, owing to limits of time, expertise, and conditions, the paper has some shortcomings: the prediction target is a single station, and the impact of a new station on the whole network is not considered.

Figure 1: Prediction model of passenger flow of subway stations based on multilayer perceptron.

Figure 3: POI distribution by administrative regions in Beijing.

Complexity
Results. In this paper, among the 28 Beijing rail transit stations in the prediction set, typical representative stations are selected according to the scale of passenger flow, surrounding land use, and population distribution of each station as the objects for verifying the accuracy of the prediction model, namely, Wangjing Station, Xuanwumen Station, Wangfujing Station, Sanyuanqiao Station, and Sun Palace Station.

Figure 16: Pearson correlation coefficient analysis of each index and average daily passenger flow. (a) Weekdays; (b) nonworking days.

Figure 18: Comparison of prediction results for Xuanwumen station.

Figure 19: Comparison of predicted results of Wangfujing Station.

Figure 20: Comparison of prediction results of Sanyuanqiao station.
Based on mining Beijing POI data, Beijing rail transit AFC data, and Baidu Wise-Eye population data, the article establishes a passenger flow prediction model based on a multilayer perceptron for predicting the average daily and peak hourly passenger flow of stations and verifies the accuracy of the prediction results by taking Wangjing Station, Xuanwumen Station, Wangfujing Station, Sanyuanqiao Station, and Sun Palace Station as examples. The prediction results of the RBF model, KNN model, and multiple linear regression model are compared to verify the effectiveness of the MLP model. The main research results and conclusions are as follows:

(2) The MLP-based rail station passenger flow prediction model is constructed, and three comparison models (the RBF model, the KNN model, and the multiple linear regression model) are set up to analyze the prediction results of typical stations. The training results of the RBF model fluctuate at individual stations and have a large degree of dispersion, which also causes poor stability in station prediction.
(3) The prediction of typical stations with representative

Table 1: Current status of research on factors influencing passenger flow.

Table 3: Model input data content.

Table 2: Model output data content.

Table 4 .
There are eight working days and three nonworking days in the selected 11 days, and the average number of daily data entries is about 6 million for working days and 4 million for nonworking days.
Data Sources

POI Data.
Point of interest (POI) usually refers to geographical objects that can be abstracted as point markers. This study uses 2017 rail transit data. The heat map of the POI data distribution in Beijing after processing is shown in Figure 2, and the distribution of POI by administrative districts in Beijing is shown in Figure 3. Overall, the POI points in Beijing are concentrated in the central urban areas and are more sparsely distributed in the suburban areas. Chaoyang District has the highest number of POI points, with 296117 points, and Mentougou District has the lowest, with only 14997 points. The density of POI points is largest in the east and west urban areas, at 1592.329243/km² and 1557.074033/km², respectively, while the density of POI in Mentougou District, Yanqing County, Huairou District, and Miyun County is lower, at less than 20/km².

AFC Data.
AFC data has the advantages of fine granularity, large scale, and fast system updates and has been widely applied in research on passenger flow prediction and residents' travel behavior patterns [25]. The data used in this paper are the AFC data for a total of 11 days from April 12 to 22, 2017. A schematic diagram of the metro network in 2017 is shown in Figure 4, with 19 lines and 288 operating stations. The data recorded by the AFC system have a total of 19 fields, and the commonly used key fields are shown in Table 4.

Table 4: Commonly used key fields in AFC data.
OFF_STATION_NAME: name of outbound station (example: Magnetic device mouth)
OFF_STATION_TIME: the outbound time (example: 2017/4/12 8:24:10)

Yanqing County has the lowest population density, fewer than 2000 persons/km².

Table 5: Model training accuracy analysis (average daily passenger flow).

Table 6: Model training accuracy analysis (peak hour passenger flow). Bold values are the smallest errors in the prediction results for each model under the different scenarios, that is, the cases where the model has the least training error.

Table 7: Accuracy analysis of passenger flow forecast at key stations (average daily passenger flow).

Table 8: Accuracy analysis of passenger flow forecast at key stations (peak hour passenger flow). Bold values are the smallest errors in the prediction results of the four models for each station under the different scenarios.
Figure 17: Comparison of prediction results of Wangjing station.