Prediction of Daily Entrance and Exit Passenger Flow of Rail Transit Stations by Deep Learning Method

The prediction of entrance and exit passenger flow at rail transit stations is a key research focus in intelligent transportation. Based on big data from rail transit IC cards (Public Transportation Cards), this paper analyzes the data of the major dynamic factors that affect entrance and exit passenger flow at rail transit stations (weather data, atmospheric temperature data, holiday and festival data, ground index data, and elevated road data) and calculates the daily entrance and exit passenger flow of individual stations by data reduction. Furthermore, based on the historical passenger flow data and the relevant influence factors, it applies a deep learning method: the relatively optimal number of hidden layer nodes is chosen by the cut-and-try method, the input data and labeled data are set up, the activation function and loss function are selected, and the Adam Gradient Descent Optimization Algorithm is used for iterative global convergence. The results verify that this method accurately predicts the daily entrance and exit passenger flow of rail transit stations with a prediction error of less than 4.1%. Finally, the proposed model is compared with a linear regression model.


Introduction
Urban rail transit systems, with their high transportation capacity, fast speed, punctuality, low unit energy consumption, low environmental pollution, safety, and reliability, are an important part of urban public transportation. Monitoring urban rail transit passenger flow is part of the critical daily work of rail transit operation units. Accurate passenger flow predictions are the basis of reasonable resource allocation for the operation units and of ground public transportation dispatching for public transportation units, so that the travel requirements of citizens can be met effectively, their travel experience can be improved, and sudden events of high passenger flow can be prevented to protect public safety. Thus, the accuracy and scientific rationality of passenger flow prediction are critical to the operation and management of urban rail transit.
Many factors affect rail transit passenger flow, and domestic and foreign scholars have studied them extensively. Li et al. [1] established a multivariable regression model and quantified the effect of weather factors such as temperature and rainfall on the random fluctuation of daily public transportation passenger flow. Xu et al. [2] used rail transit data from Beijing and found that passenger flow on holidays and festivals varied more significantly than on ordinary days. With a questionnaire survey of rail transit passengers in Beijing, Zheng [3] found that passengers interchanging to rail transit mostly arrived on foot and that the second major source of passengers was ground public transportation. Ground transportation therefore affects the flow of passengers interchanging to rail transit. Elevated transportation affects rail transit passenger flow as well: whether travelers choose to park and then interchange to rail transit or use another travel mode depends on the conditions of ground transportation and elevated transportation near rail transit stations. Wang et al. [4] studied a site selection model for interchange parking facilities between elevated roads and urban rail transit and concluded that the potential demand for parking and interchanging at an elevated road exit was proportional to the daily mean vehicle flow of that exit. Elevated transportation conditions therefore affect the flow of passengers who park and interchange to rail transit. Other minor dynamic factors, such as humidity and the air pollution index, are certainly also present. However, a relevance analysis in the R language showed that these factors have a very low degree of relevance and a limited effect.
Many results have also been obtained in quantitative research on rail transit passenger flow prediction. Li et al. [5] applied the MSRBF model to Beijing IC card data and predicted rail transit passenger flow under sudden events. Anvari et al. [6] proposed a Box-Jenkins method based on the timing characteristics of passenger flow and used it to predict the passenger flow of Istanbul Rail Transit in individual time frames. The mixed EMD-BPN prediction model proposed by Wei and Chen [7] predicted short-term rail transit passenger flow in three phases: the EMD phase, the element identification phase, and the BPN phase. Sun et al. [8] proposed the mixed Wavelet-SVM method and predicted the short-term passenger flow of Beijing Rail Transit in three phases: the breakdown phase, the prediction phase, and the rebuilding phase. These studies proposed prediction methods and models for rail transit passenger flow, but they gave little discussion of the rationality of the data set partition and did not incorporate the multiple factors that affect rail transit passenger flow. Some of them predicted only from the change of IC card data over time, and the data volume of their tests was limited, so the resulting models had difficulty handling mass data and the error range of their predictions was excessively large.
In conclusion, research on passenger flow characteristics remains at the stage of studying simple passenger flow rules, and research on the relation between passenger flow characteristics and their influence factors is relatively scarce. For this reason, this paper proposes predicting the entrance and exit passenger flow of rail transit stations with a deep learning method, based on big data from rail transit IC cards combined with historical data on the major dynamic factors that affect entrance and exit passenger flow: weather data, atmospheric temperature data, holiday and festival data, ground index data, and elevated road data. Experiments verify that the prediction error is less than 4.1%.

Relevant Data Having Effect on Rail Transit Passenger Flow and Data Reduction
The IC card swipe data was recorded from April 1st to April 30th, 2015, and is saved in the CSV file format, as shown in Table 1.
In Table 1, the station name refers to the Chinese name of bus line or rail transit station, the sector name refers to the bus, rail transit, taxi, ferry, or P + R parking lot, and the transaction property may be nonpreferential, preferential, or none.
The ground index data is provided at an interval of 10 minutes, as shown in Table 2.
The ground index in Table 2 refers to the ground road transportation index. According to the definition on the Shanghai transportation and travel website, the road transportation index quantifies the degree of road congestion and is a digital expression of road transportation status.
The elevated road data is provided in two time frames, morning and afternoon, as shown in Table 3.
In Table 3, the region corresponding to the area number indicates the specific elevated road name, which is matched with the name of the elevated road nearest to the rail transit station. The elevated road index in the table is the road transportation index of the elevated road, and its value ranges from 0 to 100. It is similar to the ground road transportation index and reflects the degree of elevated road congestion.
The meteorological data is provided in 3-hour time frames, as shown in Table 4.
Whether a day is rainy is judged from the rainfall in Table 4. If the rainfall is >0, the day is a rainy day; otherwise, it is a nonrainy day.

Data Reduction.
Data reduction means finding the useful characteristics of the data for the discovery target, based on an understanding of the mining task and of the data content itself, so that the data volume is minimized while the original condition of the data is kept as far as possible.
According to the IC card transaction time and station name, statistics can be prepared at a chosen time granularity. This paper predicts the entrance and exit passenger flow of a rail transit station over 1 day. After data reduction, the resultant data is shown in Table 5 (with the Dabaishu station as an example).
In Table 5, the date ranges from April 1st to April 20th, 2015. The atmospheric temperature is taken from Table 4 and is the daily maximum atmospheric temperature in the area of the rail transit station. The weather is judged from the rainfall: 1 indicates a nonrainy day and 2 indicates a rainy day. The "working day" value is calculated from the date: 1 indicates a working day and 2 indicates a nonworking day. The ground index is taken from the ground index data in Table 2, and the elevated road index is taken from the elevated road data in Table 3. The numbers of passengers entering and exiting the station are counted by station name from the IC card swipe data in Table 1: an amount of 0 indicates an entrance into the station; otherwise, it indicates an exit from the station. The reduced data in Table 5 is used as the input data, labeled data, and test data for the prediction by the deep learning method.
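The reduction step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the column names (`station`, `date`, `amount`, and so on), the sample rows, and the weekday-only rule for the working day code are hypothetical and are not taken from Table 1; public holidays in particular are ignored here.

```python
import csv
import io
from collections import defaultdict
from datetime import datetime


def daily_flows(rows, station):
    """Count entrances and exits per day for one station.

    Rule from the text: an amount of 0 marks an entrance, anything else an exit.
    """
    counts = defaultdict(lambda: {"entrances": 0, "exits": 0})
    for row in rows:
        if row["station"] != station:
            continue
        day = datetime.strptime(row["date"], "%Y-%m-%d").date()
        key = "entrances" if float(row["amount"]) == 0 else "exits"
        counts[day][key] += 1
    return dict(counts)


def weather_code(rainfall):
    """1 = nonrainy day, 2 = rainy day (rainfall > 0)."""
    return 2 if rainfall > 0 else 1


def working_day_code(day):
    """1 = working day, 2 = nonworking day (assumption: weekends only)."""
    return 1 if day.weekday() < 5 else 2


# Hypothetical swipe records; the real file layout is the one in Table 1.
sample = io.StringIO(
    "card_id,date,time,station,sector,amount,property\n"
    "1001,2015-04-01,07:58:10,Dabaishu,rail transit,0.0,nonpreferential\n"
    "1002,2015-04-01,08:03:41,Dabaishu,rail transit,3.0,nonpreferential\n"
    "1003,2015-04-01,08:10:02,Dabaishu,rail transit,0.0,preferential\n"
    "1004,2015-04-01,08:11:30,Wuwei Road,rail transit,0.0,nonpreferential\n"
    "1001,2015-04-04,18:20:15,Dabaishu,rail transit,4.0,nonpreferential\n"
)
rows = list(csv.DictReader(sample))
flows = daily_flows(rows, "Dabaishu")

april_1 = datetime(2015, 4, 1).date()
april_4 = datetime(2015, 4, 4).date()
for day in sorted(flows):
    print(day, flows[day], "working day code:", working_day_code(day))
```

Each resulting daily record would then be joined with the temperature, weather code, ground index, and elevated road index of the same day to form one row of the reduced data set.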

Brief Introduction to Deep Learning.
Since 2013, big data has gradually prevailed in research on intelligent transportation. In particular, the application of deep learning [9] makes artificial intelligence based on big data possible. The concept of deep learning originates from research on artificial neural networks. A multilayer perceptron neural network containing multiple hidden layers is a deep learning structure. A multilayer neural network has three or more network layers; in other words, it includes at least an input layer, one or more hidden layers, and an output layer [10,11]. Deep learning addresses two problems of conventional multilayer neural networks, namely, local optimal solutions and overfitting, so that the training of the neural network can achieve global convergence to obtain the optimal solution. From the viewpoint of statistics and computation, deep learning is especially suitable for processing big data, and it improves the accuracy of statistical estimation by means of big data.

Deep Learning Method Selection and Parameter Optimization.
The deep learning neural network used in this paper is divided into 3 layers, wherein the first layer is the input layer which includes 5 nodes, respectively, corresponding to the data of atmospheric temperature, weather, working day or not, ground index, and elevated road index; the second layer is the hidden layer and the cut-and-try method is applied to select the optimal number of nodes for this layer; the third layer is the output layer which includes 1 node corresponding to the passenger flow (the entrance passenger flow and exit passenger flow are, respectively, predicted).
For the number of nodes in the hidden layer, the usual calculation formula is as follows [12]:

$$\text{Number of hidden layer nodes} = a + \sqrt{\text{Number of input layer nodes} + \text{Number of output layer nodes}}, \tag{1}$$

where $a$ represents an integer in the range of 1-10.
For the passenger flow prediction in this paper, the cut-and-try method is applied and the entrance and exit passenger flow data of Dabaishu station is used to determine the optimal number of nodes in the hidden layer.
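As a small worked example, the empirical rule for the hidden layer size can be evaluated for the network in this paper (5 input nodes, 1 output node). The function name and the choice to round up to the next integer are assumptions made for this sketch; with that rounding, the rule yields exactly the 4-13 node range that the cut-and-try test in this paper covers.

```python
import math


def hidden_node_candidates(n_inputs, n_outputs, offsets=range(1, 11)):
    """Candidate hidden layer sizes: offset + sqrt(inputs + outputs), rounded up."""
    base = math.sqrt(n_inputs + n_outputs)
    return [math.ceil(base + a) for a in offsets]


candidates = hidden_node_candidates(5, 1)
print(candidates)  # [4, 5, 6, 7, 8, 9, 10, 11, 12, 13]
```

The rule only narrows the search range; the final choice among the candidates still comes from comparing training and test errors.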
With reference to the Mean Squared Error (MSE), the Mean Squared Relative Error (MSRE) is used as the training error in this paper to evaluate the fitting degree of training. The smaller the MSRE of training is, the higher the fitting degree. The calculation formula is as follows:

$$\text{MSRE} = \frac{\sum_{i=1}^{N} \left( \dfrac{\text{Predicted value}_i - \text{Actual value of training sample No. } i}{\text{Actual value of training sample No. } i} \right)^{2}}{N}, \tag{2}$$

where $N$ is the training sample size.
For the test error, the relative error rate of prediction is used to evaluate the accuracy of prediction. The smaller the test error is, the higher the accuracy. The calculation formula is as follows:

$$\text{Test error} = \frac{|\text{Predicted value} - \text{Actual value for test}|}{\text{Actual value for test}}. \tag{3}$$

The number of hidden layer nodes tested by the cut-and-try method is 4-13, and the number of training epochs is 5000. The result is shown in Table 6.
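The two error measures can be written as short Python functions. This is a sketch with hypothetical function names; the training values passed to `msre` are made up for illustration, while the test values are the Dabaishu entrance prediction and actual count for April 20th quoted in this paper.

```python
def msre(predicted, actual):
    """Mean Squared Relative Error over the training samples."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("predicted and actual must be non-empty and equally long")
    return sum(((p - a) / a) ** 2 for p, a in zip(predicted, actual)) / len(actual)


def relative_test_error(predicted, actual):
    """Relative error rate of a single prediction."""
    return abs(predicted - actual) / actual


# Made-up training values, only to exercise the function.
training_error = msre([101.0, 198.0], [100.0, 200.0])

# Reported entrance flow of Dabaishu station on April 20th.
entrance_error = relative_test_error(18023, 18158)
print(f"test error: {100 * entrance_error:.3f}%")  # test error: 0.743%
```

Because both measures divide by the actual value, they stay comparable between a large station and a small one, which the plain MSE does not.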
The same test is conducted with the entrance passenger flow as the training data. The comparison result of the selection test for the number of hidden layer nodes is shown in Table 7.
Comparing Table 6 and Table 7, when the selected number of hidden layer nodes is 7, the training error of exit passenger flow prediction is $9.46109 \times 10^{-5}$ and its test error is 0.004669; the training error of entrance passenger flow prediction is $6.57961 \times 10^{-5}$ and its test error is 0.003632. This is relatively optimal in comparison with any other number of nodes.
In the above cut-and-try test, the number of training epochs is 5000. The number of training epochs is closely related to the training error, as shown in Figure 1. Generally, the higher the number of training epochs is, the smaller the error becomes. In Figure 1, the error already approaches 0 when the number of training epochs is 900. The number of training epochs used in this paper is 5000; in consideration of machine performance and training time, an excessively high number of training epochs is not suitable.

Passenger Flow Prediction Process and Predictions.
With the above settings, the deep learning method is used for training and the predicted value of passenger flow is calculated. The prediction process is shown in Figure 2.
Figure 2 describes the process of predicting passenger flow by the deep learning method proposed in this paper. In the module outlined by the dashed line on the left, the relevant historical passenger flow data of the rail transit station listed in Table 5 is read from the CSV file, the data of the 5 dynamic factors affecting passenger flow is set up as the input data, and the historical passenger flow data is set up as the labeled data. The module outlined by the dashed line on the right establishes the deep learning method as follows: set the number of nodes in each layer of the model (5 nodes for the input layer, 7 nodes for the hidden layer, and 1 node for the output layer); define the Mean Squared Error as the loss function; use the Adam Gradient Descent Optimization Algorithm to achieve iterative global convergence; train with the number of training epochs set to 5000; and, upon completion of training, save the weights and predict the passenger flow. The prediction of the daily entrance passenger flow of rail transit stations by this process is shown in Figure 3 (with the Dabaishu station as an example).
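The training setup in Figure 2 (a 5-7-1 network, MSE loss, Adam updates) can be sketched in plain Python as follows. This is not the implementation used in this paper: the tanh hidden activation, the linear output, the learning rate, the inputs scaled to [0, 1], and the synthetic data are all assumptions, and only 300 epochs are run here so that the example finishes quickly.

```python
import math
import random


def unpack(theta, n_in, n_hidden):
    """Split the flat parameter list into W1, b1, W2, b2."""
    k = n_hidden * n_in
    W1 = [theta[j * n_in:(j + 1) * n_in] for j in range(n_hidden)]
    return W1, theta[k:k + n_hidden], theta[k + n_hidden:k + 2 * n_hidden], theta[k + 2 * n_hidden]


def forward(theta, x, n_hidden):
    """Return the hidden activations and the single output for one input row."""
    W1, b1, W2, b2 = unpack(theta, len(x), n_hidden)
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b) for row, b in zip(W1, b1)]
    return h, sum(w * hj for w, hj in zip(W2, h)) + b2


def mse_and_grad(theta, X, T, n_hidden):
    """Mean Squared Error and its gradient, in the same order as theta."""
    n, n_in = len(X), len(X[0])
    W2 = unpack(theta, n_in, n_hidden)[2]
    gW1 = [[0.0] * n_in for _ in range(n_hidden)]
    gb1, gW2, gb2, loss = [0.0] * n_hidden, [0.0] * n_hidden, 0.0, 0.0
    for x, t in zip(X, T):
        h, y = forward(theta, x, n_hidden)
        err = y - t
        loss += err * err / n
        dy = 2.0 * err / n
        gb2 += dy
        for j in range(n_hidden):
            gW2[j] += dy * h[j]
            dh = dy * W2[j] * (1.0 - h[j] ** 2)  # derivative of tanh
            gb1[j] += dh
            for k in range(n_in):
                gW1[j][k] += dh * x[k]
    return loss, [g for row in gW1 for g in row] + gb1 + gW2 + [gb2]


def train(X, T, n_hidden=7, epochs=5000, lr=0.01, seed=0):
    """Full-batch training with Adam; returns parameters and the loss per epoch."""
    rng = random.Random(seed)
    n_in = len(X[0])
    theta = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden * n_in + 2 * n_hidden + 1)]
    m, v = [0.0] * len(theta), [0.0] * len(theta)
    beta1, beta2, eps = 0.9, 0.999, 1e-8
    history = []
    for step in range(1, epochs + 1):
        loss, grad = mse_and_grad(theta, X, T, n_hidden)
        history.append(loss)
        for i, g in enumerate(grad):
            m[i] = beta1 * m[i] + (1 - beta1) * g
            v[i] = beta2 * v[i] + (1 - beta2) * g * g
            m_hat = m[i] / (1 - beta1 ** step)
            v_hat = v[i] / (1 - beta2 ** step)
            theta[i] -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return theta, history


# Synthetic stand-in for 19 days of scaled factors and scaled passenger flow.
rng = random.Random(1)
X = [[rng.random() for _ in range(5)] for _ in range(19)]
T = [0.4 + 0.3 * x[0] - 0.2 * x[2] + 0.1 * x[3] * x[4] for x in X]

theta, history = train(X, T, n_hidden=7, epochs=300)
print(f"MSE: first epoch {history[0]:.5f}, last epoch {history[-1]:.5f}")
```

The per-epoch loss list plays the role of the curve in Figure 1: plotting it shows how quickly the training error flattens and whether more epochs are worth the extra training time.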
In Figure 3, the predicted entrance passenger flow of Dabaishu rail transit station on April 20th is 18023 persons. The actual entrance passenger flow was 18158 persons. Thus, the prediction error is 0.743%.
Similarly, the prediction of daily exit passenger flow of rail transit station is shown in Figure 4 (with the Dabaishu station as an example).
The curve in Figure 4(a) represents the history record of exit passenger flow of Dabaishu rail transit station in April.
The above predictions are based on one hidden layer. With two hidden layers, the predicted entrance passenger flow of Dabaishu rail transit station on April 20th is 18794 persons, as shown in Figure 5. The actual entrance passenger flow was 18158 persons. Thus, the prediction error is 3.5%.
With two hidden layers, the predicted exit passenger flow of Dabaishu rail transit station on April 20th is 17530 persons, as shown in Figure 6. The actual exit passenger flow was 18108 persons. Thus, the prediction error is 3.19%.
With three hidden layers, the predicted entrance passenger flow of Dabaishu rail transit station on April 20th is 17532 persons, as shown in Figure 7. The actual entrance passenger flow was 18158 persons. Thus, the prediction error is 3.45%.
With three hidden layers, the predicted exit passenger flow of Dabaishu rail transit station on April 20th is 16595 persons, as shown in Figure 8. The actual exit passenger flow was 18108 persons. Thus, the prediction error is 8.36%.
The comparison of the MSRE of training (formula (2)) and the test error (formula (3)) among one, two, and three hidden layers is shown in Table 8. As shown in Table 8, as the number of hidden layers increases, the MSRE of training becomes relatively lower, which indicates that the model fits the training data closely. At the same time, the test error becomes relatively higher, which indicates reduced prediction accuracy, that is, overfitting. There is no universal agreement on how many layers a deep learning network should have. Generally, the number of hidden layers is related to the size of the data set. Determining the optimal number of hidden layers is a key part of deep learning model exploration and is also a current research topic in the industry. At present, the cut-and-try method is usually applied in deep learning applications to determine the number of hidden layers. According to the data in Table 8, the deep learning model with one hidden layer achieves a higher accuracy for the data in this paper. These experimental results show that networks deeper than one hidden layer were unnecessary for our problem, which has only five input variables. Although the network structure used in our study, with only one hidden layer, cannot be regarded as a truly "deep" neural network, it fully exploits the main idea of deep learning methods and can be easily extended once more input variables are provided.
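The test errors quoted for the different network depths can be recomputed from the predictions and actual counts reported above for Dabaishu station on April 20th. The snippet below is only an arithmetic check; the data structure and function name are hypothetical, and the one-hidden-layer exit prediction is left out because its predicted value is not stated in the text.

```python
# (hidden layers, flow direction, predicted persons, actual persons)
reported = [
    (1, "entrance", 18023, 18158),
    (2, "entrance", 18794, 18158),
    (2, "exit", 17530, 18108),
    (3, "entrance", 17532, 18158),
    (3, "exit", 16595, 18108),
]


def error_percent(predicted, actual):
    """Relative test error, expressed in percent."""
    return 100.0 * abs(predicted - actual) / actual


errors = {(layers, flow): error_percent(p, a) for layers, flow, p, a in reported}
for (layers, flow), value in errors.items():
    print(f"{layers} hidden layer(s), {flow}: {value:.2f}%")
```

For the entrance flow, the single-hidden-layer error is several times smaller than either deeper variant, which is the pattern the overfitting argument relies on.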
According to the training principle of deep learning, the more training samples there are, the smaller the prediction error becomes. A comparison test is conducted with the sample data for daily exit passenger flow. In Figure 9, 13 days of training samples are used to predict the daily exit passenger flow of the rail transit station on April 20th. The prediction result is 17257 persons and the prediction error is 4.700%, which is higher than the 3.42% prediction error in Figure 4. This shows that the fewer training samples there are, the higher the prediction error becomes. When the training samples are too few, no prediction is possible: when 12 days of training samples are used for prediction, a failure prompt occurs.
This deep learning model can predict the daily entrance or exit passenger flow of a rail transit station on any day. The passenger flow data of Wuwei rail transit station is used for prediction as follows. After data reduction, the resultant data is shown in Table 9.
In Table 9, the exit passenger flow on Day 14 was 2165 persons, which differed obviously from the exit passenger flow on Day 7, which was 1976 persons. The prediction of the exit passenger flow on Day 14 with the deep learning model in this paper is shown in Figure 10.
The prediction result is 2154 persons. The actual daily exit passenger flow was 2165 persons. Thus, the prediction error is 0.51%. As shown in Table 5, the rainy weather on Day 20 is classified as one abnormal mode for this city (for some cities, the sunny weather is classified as one abnormal mode).
With the passenger flow data of Wuwei rail transit station in Table 9, multiple linear regression is used to predict, respectively, the entrance and exit passenger flow of Wuwei rail transit station on Day 14. The result of the prediction of daily entrance passenger flow is shown in Figure 11.
The prediction result is 2303 persons. The actual daily entrance passenger flow was 2327 persons. Thus, the prediction error is 1.03%. The MSRE of training is 0.001782145. The result of the prediction of daily exit passenger flow is shown in Figure 12.
The prediction result is 2196 persons. The actual daily exit passenger flow was 2165 persons. Thus, the prediction error is 1.43%. The MSRE of training is 0.001532426.
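For reference, a multiple linear regression baseline of this kind can be fitted by ordinary least squares. The sketch below solves the normal equations in plain Python; it is an assumed, generic formulation rather than the procedure used for Figures 11 and 12, and the two-feature data set is made up so that the exact coefficients are known in advance.

```python
def solve_linear_system(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("singular system: the features are collinear")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x


def fit_linear_regression(X, y):
    """Least-squares coefficients [intercept, w1, w2, ...] via the normal equations."""
    rows = [[1.0] + list(x) for x in X]
    p = len(rows[0])
    XtX = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    Xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    return solve_linear_system(XtX, Xty)


def predict(coef, x):
    """Evaluate the fitted linear model for one feature row."""
    return coef[0] + sum(c * xi for c, xi in zip(coef[1:], x))


# Made-up data generated from y = 10 + 2*x1 - 3*x2.
X = [[1, 2], [2, 1], [3, 4], [4, 3], [5, 5]]
y = [10 + 2 * x1 - 3 * x2 for x1, x2 in X]

coef = fit_linear_regression(X, y)
forecast = predict(coef, [6, 2])
print([round(c, 6) for c in coef], round(forecast, 6))
```

With only about two weeks of daily records and coded factors such as weather and working day, the normal equations can become singular, which is why the solver raises an error instead of returning unstable coefficients.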
The comparison of MSRE of training (with the formula (2)) and test error (with the formula (3)) between deep learning model and multiple linear regression is shown in Table 10.
As shown in Table 10, the MSRE of training of multiple linear regression is higher than that of the deep learning model; in other words, its degree of data fit is lower. Thus, the fluctuation of its prediction error will be relatively large (additionally, in line 1, the number of training epochs is 6000).

Example Verification and Analysis
For the deep learning method proposed in this paper, the Wuwei Road Station on Shanghai Rail Transit Line 11 and the Yanchang Road rail transit station on Shanghai Line 1 are selected at random for daily entrance and exit passenger flow prediction. The results are shown in Table 11.
According to the data in Table 11, the MSRE of training is lower than 0.000492401, which indicates that the deep learning method in this paper achieves a very high fitting degree. The test errors are lower than 0.040324, which indicates that this model has strong generalization capability and is suitable for predicting the entrance and exit passenger flow of individual rail transit stations.

Conclusions
This paper proposes an approach for predicting the daily entrance and exit passenger flow of rail transit stations by a deep learning method, based on big data from IC card swipes at rail transit stations in Shanghai combined with the data of 5 major dynamic factors that affect entrance and exit passenger flow. Experiments verify that the training error and prediction error of the proposed model are lower, and its accuracy therefore higher, than those of currently known prediction methods, and that its prediction error is lower than 4.1%. The model is suitable for accurately predicting the daily entrance and exit passenger flow of rail transit stations. For the management of urban public transportation, on the one hand, quantitative analyses of the distribution and change of passenger flow can be conducted by establishing a series of statistical indices, directly providing a decision-making basis for operation adjustment; on the other hand, the major dynamic influence factors behind the change rules of passenger flow can be analyzed and used to build passenger flow prediction models for short-term prediction, providing passenger flow data support for operation units. With the continuous enrichment of relevant transportation data, the continuous improvement of data on the factors affecting rail transit passenger flow, and the continuous advancement of artificial intelligence for predicting rail transit passenger flow, the accuracy of rail transit passenger flow prediction will become increasingly higher in the future.


Figure 1: Diagram of relationship between training error and epochs.

Figure 2: Prediction process of daily passenger flow of rail transit station.

Figure 5: Prediction of daily entrance passenger flow of Dabaishu rail transit station (2 hidden layers).

Figure 6: Prediction of daily exit passenger flow of Dabaishu rail transit station (2 hidden layers).

Figure 7: Prediction of daily entrance passenger flow of Dabaishu rail transit station (3 hidden layers).

Figure 8: Prediction of daily exit passenger flow of Dabaishu rail transit station (3 hidden layers).

Figure 9: Prediction of daily exit passenger flow of Dabaishu rail transit station based on 13-day training samples.

Figure 10: Prediction of exit passenger flow of Wuwei rail transit station at Day 14.

Figure 11: Prediction of daily entrance passenger flow of Wuwei rail transit station (multiple linear regression).

Figure 12: Prediction of daily exit passenger flow of Wuwei rail transit station (multiple linear regression).

Table 1: IC card swipe data.

Table 2: Ground index data.

Table 5: Relevant data of passenger flow of Dabaishu rail transit station. Columns: Date, Atmospheric temperature, Weather, Working day, Ground index, Elevated road index, Number of entrances, Number of exits.

Table 6: Comparison 1 for selection test of number of hidden layer nodes.

Table 7: Comparison 2 for selection test of number of hidden layer nodes.

Table 8: Comparison among tests with one hidden layer, two hidden layers, and three hidden layers.

Table 9: Relevant data of passenger flow of Wuwei rail transit station. Columns: Date, Atmospheric temperature, Weather, Working day, Ground index, Elevated road index, Number of entrances, Number of exits.

Table 10: Comparison between deep learning model and multiple linear regression (Wuwei station).

Table 11: Training error and test error of prediction on other rail transit stations.