Forecasting Beijing Transportation Hub Areas' Pedestrian Flow Using Modular Neural Network

With the growing share of trips made by urban public transportation, pedestrian flow in transportation hub areas has increased. To improve the emergency handling ability of the relevant management departments and to prevent pedestrian congestion incidents, this paper studied a method of forecasting pedestrian flow in Beijing transportation hub areas. Firstly, 34 typical sidewalks in Beijing transportation hub areas were surveyed to obtain 2200 valid data points. Secondly, correlation analysis was used to analyze the relationship between pedestrian flow and its influential factors, and 11 significant influential factors were extracted. Thirdly, a forecasting model was established with a modular neural network. The surveyed pedestrian flow sample was fuzzy clustered according to the regional land use of the area where the transportation hub is located. Then, a membership function based on the distance measure was constructed. Through fuzzy discrimination, online selection of the subnetwork for incoming information can be achieved. Consequently, the self-adaptation of the neural network in information processing was improved. Finally, this paper tested the model on the pedestrian flow sample of a transportation hub in Beijing. It was concluded that the accuracy of the pedestrian flow forecasting model using the modular neural network was higher than that of other neural network models. There was also an improvement in adaptability to the environment.


Introduction
Recently, with economic development and urbanization in China, traffic congestion has tended to spread from individual sites to whole lines and from local areas to wide regions. Related research suggests that developing public transportation is one of the best means of solving urban traffic congestion. As a result of the rapid development of public transportation and people's growing environmental awareness, more and more people select public transportation as their first choice for traveling. Moreover, pedestrians in transportation hub areas play an important role in traffic safety, and forecasting pedestrian flow is the premise for management. Thus, research on forecasting pedestrian flow in transportation hub areas helps to improve the information awareness and emergency handling abilities of the relevant management departments. It is also important for the safe operation of transportation hubs and for alleviating urban traffic congestion.
Existing research has studied the characteristics and forecasting of pedestrian flow. Gipps and Marksjö [1] established a behavior model and an early route choice model for a single pedestrian, based on the shortest path of movement. Hänseler et al. [2] established a cellular automaton model of pedestrians' routes around railway stations based on the basic characteristics of pedestrian flow. Zhao et al. [3] surveyed pedestrian volume in a supermarket in Harbin and fitted a third-order polynomial regression formula for pedestrian flow density on weekdays and weekends. Li [4] established forecast models of passenger distribution and daily traffic using the passenger flow data of the World Expo held in Osaka, Japan, in 1970 as the main reference indicator, and also drew a curve of the passenger flow distribution. Cetiner [5] proposed an optimized dynamic neural network and built a traffic flow forecast model on this basis. Marfia and Roccetti [6] applied a new forecast model to the forecasting of traffic congestion and examined its effectiveness against actual results on different urban roads. Xie et al. [7] proposed a hybrid temporal-spatial forecasting approach, which combined temporal forecasting based on a radial basis function neural network with spatial forecasting based on the spatial correlation degree, to obtain the passenger flow status in a high-speed railway transport hub. Davidich and Köster [8] presented a methodology that identifies key parameters and interdependencies from the analysis of real-life data, so that the model can be properly calibrated against relevant data and can reproduce and predict real-life scenarios. As analyzed above, existing research on pedestrians has paid much attention to pedestrians' behavioral characteristics and to passenger forecasts in commercial regions and at large event venues. Research on forecasting pedestrian flow is rare. Moreover, pedestrians are highly intelligent agents. Unlike motor vehicles, which can only run in stipulated lanes, pedestrian flow is nonlinear and discrete, and pedestrians can change their movement according to the environment.
To address the problems above, this paper proposed a pedestrian flow forecast model based on a modular neural network. Owing to the learning ability and approximation capability of the neural network, the model has high precision and strong adaptive capacity to the environment. The effectiveness of the model was verified by a test on the pedestrian flow survey sample from Beijing transportation hub areas.

Extraction of Significant Influence Factors
Pedestrian flow in transportation hub areas is related to the land use of the area where the hub is located, the effective width of sidewalks, the proportion of reverse-direction pedestrians, nonmotor vehicle flow, and so on. Changes in climate, on-street parking, and so forth also have a certain influence.
In order to reflect the situation of pedestrians in Beijing transportation hub areas and to extract the factors that significantly influence pedestrian flow there, this paper selected 34 survey sidewalks covering different road facilities and environmental conditions around typical subway stations in Beijing. Data on 44 different dynamic traffic conditions were obtained through controlled surveys in different time periods. The relationship between pedestrian flow and each factor was examined by correlation analysis. The Pearson correlation coefficient is the most commonly used measure of the degree of association between two variables. It can be calculated as follows:

$$r = \frac{1}{n-1}\sum_{i=1}^{n}\left(\frac{x_i-\bar{x}}{s_x}\right)\left(\frac{y_i-\bar{y}}{s_y}\right), \tag{1}$$

where $n$ is the number of observations; $x_i$ is the $i$th pedestrian flow; $s_x$ is the standard deviation of the pedestrian flows; $\bar{x}$ is the average of the pedestrian flows; $y_i$ is the $i$th value of the influential factor; $s_y$ is the standard deviation of the influential factor; and $\bar{y}$ is the average of the influential factor.
The coefficient $r$ lies in the range from −1 to +1. If $r > 0$, the two variables are positively correlated. If $r < 0$, the two variables are negatively correlated. The higher the absolute value of $r$, the stronger the linear relationship, although this does not imply a causal relationship. If $r = 0$, there is no linear relationship between the two variables, although some other relationship may exist.
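As a minimal sketch of the correlation screening described above, the following Python function computes the sample Pearson correlation coefficient for one candidate factor. The function name and the numbers in the usage lines are illustrative assumptions, not survey data from this paper.

```python
import math


def pearson_r(x, y):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sample standard deviations (n - 1 in the denominator).
    s_x = math.sqrt(sum((v - mean_x) ** 2 for v in x) / (n - 1))
    s_y = math.sqrt(sum((v - mean_y) ** 2 for v in y) / (n - 1))
    if s_x == 0 or s_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    # Sample covariance divided by the product of the standard deviations.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)
    return cov / (s_x * s_y)


# Hypothetical example: flow (pedestrians per interval) against sidewalk width (m).
flow = [120, 150, 180, 210, 260]
width = [2.0, 2.5, 3.0, 3.5, 4.0]
r = pearson_r(flow, width)
```

A factor would then be retained only if its coefficient is significant at the chosen level; the significance test itself is not shown here.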
The correlation coefficients between pedestrian flow and each influential factor are given in Table 1, which shows the Pearson correlation coefficient of each factor that is significant at the 95% confidence level.
As analyzed above, 11 factors, such as regional land use, effective width of sidewalks, and proportion of reverse pedestrians, strongly affected pedestrian flow. Thus, these 11 factors were selected as analytical conditions to establish the pedestrian flow forecast model. The model can be expressed as follows:

$$Y = f(x_1, x_2, \ldots, x_{11}), \tag{2}$$

where $Y$ is pedestrian flow; $x_1$ is regional land usage; $x_2$ is effective width of sidewalks; $x_3$ is proportion of reverse pedestrians; $x_4$ is type of buffer; $x_5$ is on-street parking; $x_6$ is isolation between nonmotor vehicles and motor vehicles; $x_7$ is greening; $x_8$ is inside motor vehicle flow; $x_9$ is building facilities; $x_{10}$ is distance between pedestrians and nonmotor vehicles; and $x_{11}$ is distance between pedestrians and motor vehicles.
The pedestrian flow time series forecast model based on chaotic time series [9] can be written as follows:

$$Y_{t+h} = f(Y_t, Y_{t-\tau}, Y_{t-2\tau}, \ldots), \tag{3}$$

where $Y_t$ is the pedestrian flow at time $t$; $\tau$ is the delay time step; and $h$ ($h > 0$) is the prediction horizon. In essence, the formula forecasts the change in pedestrian flow from information at the present moment and at previous moments.
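The time series formulation can be illustrated by building training pairs from a flow series: each input vector holds delayed flow values and the target is the flow a fixed horizon ahead. This is only a sketch; the number of delayed values per input (the parameter `m` below) is a hypothetical name and is not specified in the text above.

```python
def make_lagged_samples(series, tau, m, h):
    """Build (inputs, target) pairs from a flow series.

    Each input holds m values spaced tau steps apart, ending at time t
    (oldest first); the target is the value at time t + h.
    m is an assumed embedding dimension, not a value taken from the paper.
    """
    if tau < 1 or m < 1 or h < 1:
        raise ValueError("tau, m and h must be positive integers")
    samples = []
    first_t = (m - 1) * tau  # earliest t with a full history available
    for t in range(first_t, len(series) - h):
        inputs = [series[t - k * tau] for k in range(m - 1, -1, -1)]
        samples.append((inputs, series[t + h]))
    return samples


# Hypothetical series of 10 flow counts; delay 2, three lags, horizon 1.
pairs = make_lagged_samples(list(range(10)), tau=2, m=3, h=1)
```

Each pair can then be fed to whichever regression model plays the role of the function in the forecast formula.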

Integrated Method of Modular Neural Network
Many factors affect pedestrian flow in Beijing transportation hub areas. Conventional empirical models have a limited scope of application, and their adaptive ability is poor outside that scope. Neural networks, however, have powerful nonlinear approximation capability and self-learning ability, and they have been widely used in nonlinear system modeling. Thus, this paper used a neural network to establish the pedestrian flow model. The standard "Urban land classification and planning construction land standard GB50137-2011" classifies urban construction land into residential land, public management and service land, commercial and service facility land, industrial land, logistics storage land, traffic facility land, public facility land, and green land. Using a single neural network for all of these land use types would make the model highly complex and would reduce the learning ability and generalization ability of the network. The modular neural network (MNN) uses a "divide and conquer" strategy to divide a complex problem into several subproblems and builds a local network for each subproblem. As a result, each local network is simpler, and the learning ability and generalization ability of the neural network are improved. The structure of the modular neural network is shown in Figure 1.
The main output of the modular neural network is

$$Y = \sum_{i=1}^{c} w_i y_i, \tag{4}$$

where $Y$ is the output of the neural network; $c$ is the number of classes in the sample set; $w_i$ is the weight of the output of local network $i$; and $y_i$ is the output of local network $i$ (NET$_i$).
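The weighted combination of local network outputs can be sketched in a few lines of Python. The function name and the numbers are illustrative assumptions; in the model, networks that are not selected for a sample would simply receive a weight of zero.

```python
def combine_outputs(outputs, weights):
    """Weighted sum of local network outputs (one weight per local network)."""
    if len(outputs) != len(weights):
        raise ValueError("outputs and weights must have the same length")
    return sum(w * y for w, y in zip(weights, outputs))


# Hypothetical forecasts from two local networks and their weights.
combined = combine_outputs([100.0, 200.0], [0.25, 0.75])
```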
Based on the principle that pedestrian flows in areas with similar land use have similar characteristics, this paper classified pedestrian flow by regional land use to achieve task decomposition. Firstly, the feature vector of land usage was established as $Z = [\mathrm{RL}, \mathrm{PMSL}, \mathrm{CSFL}, \mathrm{IL}, \mathrm{LSL}, \mathrm{TFL}, \mathrm{PFL}, \mathrm{GL}]^{T}$, where RL is residential land; PMSL is public management and service land; CSFL is commercial and service facility land; IL is industrial land; LSL is logistics storage land; TFL is traffic facility land; PFL is public facility land; and GL is green land. If $S = \{s_1, s_2, \ldots, s_N\}$ is the observed sample set of pedestrian flow and $N$ is the number of samples, the set $S$ can be fuzzy clustered according to the feature vector $Z$, taking land usage into account, based on formulae (5) and (6). In these formulae, $S = \{S_1, S_2, \ldots, S_c\}$; $V_i$ is the clustering center of $S_i$; and $v = (v_{ik})$ is the membership matrix of the samples to the fuzzy subsets. In [4], $m = 1$, and in [10], $m = 2$. Some data may simultaneously belong to sample subsets $S_i$ and $S_j$. After that, the samples in each class were used to train the local network to which they belong. The BP (backpropagation) learning algorithm [11] was used to train the networks.
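For readers unfamiliar with fuzzy clustering, the sketch below shows one iteration of the standard fuzzy c-means update (memberships, then centers). It is an assumption that the clustering used here follows this standard form. The sketch requires an exponent `m` greater than 1, and all function names and sample values are hypothetical.

```python
import math


def fcm_memberships(samples, centers, m=2.0):
    """Standard fuzzy c-means membership update.

    Returns u where u[i][k] is the membership of sample k in cluster i.
    """
    if m <= 1:
        raise ValueError("this sketch assumes an exponent m > 1")
    exponent = 2.0 / (m - 1.0)
    u = [[0.0] * len(samples) for _ in centers]
    for k, s in enumerate(samples):
        d = [math.dist(s, ctr) for ctr in centers]
        on_center = [i for i, di in enumerate(d) if di == 0.0]
        if on_center:
            # The sample coincides with a center: share full membership there.
            for i in on_center:
                u[i][k] = 1.0 / len(on_center)
            continue
        for i in range(len(centers)):
            u[i][k] = 1.0 / sum((d[i] / d[j]) ** exponent for j in range(len(centers)))
    return u


def fcm_centers(samples, u, m=2.0):
    """Standard fuzzy c-means center update (membership-weighted means)."""
    dim = len(samples[0])
    centers = []
    for row in u:
        wts = [x ** m for x in row]
        total = sum(wts)
        centers.append([sum(w * s[j] for w, s in zip(wts, samples)) / total
                        for j in range(dim)])
    return centers


# Hypothetical one-dimensional samples and two starting centers.
samples = [[0.0], [1.0], [10.0]]
u = fcm_memberships(samples, [[0.0], [10.0]])
new_centers = fcm_centers(samples, u)
```

Alternating the two updates until the centers stop moving yields soft clusters in which a boundary sample can belong to more than one subset, which matches the remark that some data may belong to two subsets at once.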
For new test data $s_t \notin S$, the local network to which it belongs should be determined first. A distance measure can represent the similarity of two feature vectors. Thus, the distance measure between $s_t$ and $V = [V_1, V_2, \ldots, V_c]$ is chosen as the criterion for deciding which local network the data belong to. The calculation of the distance measure is given in formulae (7) and (8), in which $d_i$ is the relative distance measure between $s_t$ and NET$_i$; $\bar{d}_i$ is the average distance measure of the $i$th sample subset; and $N_i$ is the number of data in the $i$th sample subset. The traditional modular neural network selects only the local network with the smallest $d_i$ to handle $s_t$. This method has low precision when dealing with boundary samples. Considering the fuzziness of urban land use classification, letting several appropriate local networks jointly handle $s_t$ achieves a "brainstorming" effect and reduces the error. Thus, this paper proposed a method of selecting local networks based on fuzzy decision, which integrates the information processed by multiple local networks in order to improve the accuracy of the neural network.
First, the relative distance measures between $s_t$ and all local networks are normalized using

$$\tilde{d}_i = \frac{d_i}{\sum_{j=1}^{c} d_j}. \tag{9}$$

Obviously, $\tilde{d}_i \in [0, 1]$ and $\sum_{i=1}^{c} \tilde{d}_i = 1$.
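The distance computation and its normalization can be sketched as follows. The relative distance used here (Euclidean distance to a cluster center divided by that cluster's average distance) is an assumption made for illustration, since the exact distance formulae are not reproduced above; the function names are hypothetical as well.

```python
import math


def relative_distances(x, centers, mean_dists):
    """Assumed relative distance: distance to each center over the cluster's mean distance."""
    return [math.dist(x, ctr) / d_bar for ctr, d_bar in zip(centers, mean_dists)]


def normalize(distances):
    """Scale the distances so that they lie in [0, 1] and sum to 1."""
    total = sum(distances)
    if total == 0:
        raise ValueError("all distances are zero; normalization is undefined")
    return [d / total for d in distances]


# Hypothetical two-dimensional sample, two cluster centers, and mean distances.
rel = relative_distances([3.0, 4.0], [[0.0, 0.0], [3.0, 0.0]], [2.5, 2.0])
norm = normalize(rel)
```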
Define the fuzzy set of the distance measure as A = {very small (denoted by VS), small (denoted by S), middling (denoted by M), large (denoted by L)}. The value of $\tilde{d}_i$ reflects the distance measure between $s_t$ and NET$_i$. If $\tilde{d}_i \in$ {VS}, local network NET$_i$ should be used to handle $s_t$, because the distance between $s_t$ and NET$_i$ is short. If $\tilde{d}_i \in$ {L}, local network NET$_i$ cannot be used to handle $s_t$, because the distance between $s_t$ and NET$_i$ is long.
The selection of local networks involves the following steps.
Step 2. Calculate the membership degree of $\tilde{d}_i$ to each fuzzy subset of A using the membership degree curves shown in Figure 2.
Step 3. Assign each $\tilde{d}_i$ to the fuzzy subset for which its membership degree is highest. Local networks that belong to the same subset are integrated, and the subsets are considered in order from VS to M. First, the local networks with $\tilde{d}_i \in$ {VS} are selected and the others are discarded. If no local network with $\tilde{d}_i \in$ {VS} exists, the local networks with $\tilde{d}_i \in$ {S} are used. If none of those exists either, the local networks with $\tilde{d}_i \in$ {M} are selected.
The centers of the fuzzy subsets {VS, S, M, L} are $0.25/c$, $0.5/c$, $1/c$, and $2/c$, respectively. The membership degree is 1 for VS when $\tilde{d}_i < 0.25/c$ and for L when $\tilde{d}_i > 2/c$. Because $\sum \tilde{d}_i = 1$, the values $\tilde{d}_i$ cannot all exceed $1/c$ at the same time. Hence, the set of local networks with $\tilde{d}_i \in$ {VS, S, M} is never empty, and local networks with $\tilde{d}_i \in$ {L} never need to be chosen when the subsets are searched in order from VS to M in Step 3.
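The selection procedure can be sketched in Python as follows. The membership curves of Figure 2 are not reproduced in the text, so this sketch assumes piecewise-linear (triangular) curves between the stated centers with flat shoulders at both ends; the function names are hypothetical.

```python
LABELS = ("VS", "S", "M", "L")


def memberships(d, c):
    """Membership of a normalized distance d in VS, S, M, L for c local networks.

    Assumes triangular curves between the centers 0.25/c, 0.5/c, 1/c, 2/c.
    """
    centers = [0.25 / c, 0.5 / c, 1.0 / c, 2.0 / c]
    mu = dict.fromkeys(LABELS, 0.0)
    if d <= centers[0]:
        mu["VS"] = 1.0
    elif d >= centers[-1]:
        mu["L"] = 1.0
    else:
        for i in range(3):
            lo, hi = centers[i], centers[i + 1]
            if lo <= d <= hi:
                frac = (d - lo) / (hi - lo)
                mu[LABELS[i]] = 1.0 - frac
                mu[LABELS[i + 1]] = frac
                break
    return mu


def classify(d, c):
    """Label with the highest membership; ties go to the smaller-distance label."""
    mu = memberships(d, c)
    return max(LABELS, key=lambda label: mu[label])


def select_networks(norm_dists):
    """Return (label, indices) of the first non-empty class in the order VS, S, M."""
    c = len(norm_dists)
    labels = [classify(d, c) for d in norm_dists]
    for wanted in ("VS", "S", "M"):
        chosen = [i for i, label in enumerate(labels) if label == wanted]
        if chosen:
            return wanted, chosen
    raise ValueError("no network in VS, S or M; distances may not sum to 1")


# Hypothetical normalized distances for four local networks (they sum to 1).
label, chosen = select_networks([0.05, 0.15, 0.30, 0.50])
```

When every local network is equally distant, all of them fall into the middling class and are integrated together, which is the behavior the argument about the sum of normalized distances guarantees.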
The weight of each selected local network is given by formula (10), where $w_i$ is the weight of the selected local network; $d_i$ is the distance measure of the $i$th local network; and $p$ is the number of local networks that have been selected.
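One plausible way to turn distance measures into weights is normalized inverse-distance weighting, sketched below. This is an assumption for illustration only and is not confirmed as the paper's weight formula; the function name is hypothetical.

```python
def inverse_distance_weights(distances):
    """Assumed weighting: weights proportional to 1/distance, normalized to sum to 1."""
    if not distances:
        raise ValueError("at least one selected local network is required")
    if any(d < 0 for d in distances):
        raise ValueError("distances must be non-negative")
    if any(d == 0 for d in distances):
        # A zero distance dominates: split the weight among the zero-distance networks.
        zeros = sum(1 for d in distances if d == 0)
        return [1.0 / zeros if d == 0 else 0.0 for d in distances]
    inverse = [1.0 / d for d in distances]
    total = sum(inverse)
    return [v / total for v in inverse]


# Hypothetical distance measures of two selected local networks.
weights = inverse_distance_weights([1.0, 3.0])
```

Under this choice, a closer local network contributes more to the integrated output, and a single selected network receives a weight of 1.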

Model Experiment Test
In order to examine the effectiveness of the model, this paper selected 34 sidewalks around transportation hub areas to test pedestrian flow forecasting. The sample consisted of observation data (including effective width of sidewalks, proportion of reverse pedestrians, and the form of buffer) recorded every 15 minutes from 14:00 to 20:00 between April 25 and May 9, 2013. In total, 2400 data points were obtained. After quantization of the qualitative indexes, such as regional land use, form of buffer, and on-street parking, the statistical characteristics of the survey data for the 12 indexes were obtained, as shown in Tables 2 and 3. Table 2 shows the quantitative variables and Table 3 the qualitative ones. Pedestrian flow records were numbered serially in order of observation time, from earliest to latest. The data were classified into 8 groups based on land use during data preprocessing. Thus, the parameter $c$ of the modular neural network is 8. Each local network was a multilayer feed-forward neural network with 4 hidden layer nodes, chosen on the basis of actual tests. The delay time step was calculated as $\tau = 6$ using the autocorrelation method.
Both the modular neural network model and the BP neural network passenger flow forecast model [12] use the pedestrian flow time series method to establish a forecast model. Based on the methods above, the monitoring sample was used for training, and the accuracy of the modular neural network model was examined. The BP neural network pedestrian group flow forecast model was mainly aimed at regions of commercial and service facility land use. Six flow values from the previous hour were used to forecast the value in the next hour, 10 minutes later.
The model was tested with 300 pedestrian flow monitoring samples in the region of commercial and service facility land use.The relative error curves of modular neural network (MNN) model and BP model were shown in Figure 3.
As shown in Figure 3, when only commercial and service regions are considered, the mean square errors of the pedestrian group flow forecasts from the modular neural network model and the BP neural network model are at the same level. The average relative error and its variation tendency are also nearly the same. The reason is that both models use the BP algorithm for training, so with the same data the errors and tendencies of the forecast results are also the same.
Then all the pedestrian flow monitoring samples in the whole monitoring cycle were tested with the modular neural network model and the BP neural network model. The resulting forecast curves of pedestrian group flow are shown in Figure 4.
The relative error curves of the two models were shown in Figure 5.
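The error measures used in this comparison can be sketched as follows. The definitions are the conventional ones (absolute relative error per sample, its mean, and the mean of squared errors) and are an assumption, since the exact formulas and any data scaling are not stated here; the numbers are hypothetical.

```python
def relative_errors(actual, predicted):
    """Absolute relative error of each forecast; actual values must be nonzero."""
    return [abs(p - a) / abs(a) for a, p in zip(actual, predicted)]


def average_relative_error(actual, predicted):
    """Mean of the absolute relative errors."""
    errors = relative_errors(actual, predicted)
    return sum(errors) / len(errors)


def mean_square_error(actual, predicted):
    """Mean of the squared forecast errors."""
    return sum((p - a) ** 2 for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical observed and forecast flows.
observed = [100.0, 200.0]
forecast = [110.0, 190.0]
are = average_relative_error(observed, forecast)
mse = mean_square_error(observed, forecast)
```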
The mean square error and the average relative error of the pedestrian group flow forecasts from the modular neural network (MNN) model and the BP neural network model were calculated, and the results are shown in Table 4. From Table 4, we can obtain the following information.
(1) Neural networks performed well in forecasting pedestrian flow. The accuracy of the model is high whichever method is used.
(2) The accuracy of the modular neural network forecast model is higher than that of the BP neural network pedestrian group flow forecast model. For the forecast of total regional pedestrian flow, the mean square error and the average relative error of the modular neural network model are, respectively, 6.56% and 86.02% of those of the BP neural network model.
(3) As the sample size increased, the accuracies of the two models both improved. However, the mean square error accuracy of the modular neural network model improved by two levels, whereas that of the BP neural network model improved by one level. This means that the modular neural network model is more adaptable than the BP neural network model.
Based on the examination above, it can be seen that the accuracy of the pedestrian flow forecast model for Beijing transportation hub areas based on the modular neural network (MNN) is higher than that of the BP neural network pedestrian group flow forecast model. Moreover, the traditional neural network model fitted poorly when pedestrian flow exceeded a certain limit, whereas the fitting degree of the forecast model based on the modular neural network stayed at a high level, and the model adapted well.

Conclusions
(1) The modular neural network classified pedestrian flow according to similar characteristics. It can meet the requirements of pedestrian flow forecasting and reduce the complexity of the neural network, thereby enhancing its generalization ability. This paper established a pedestrian flow forecast model for Beijing transportation hub areas using the modular neural network method, based on 11 factors including regional land use, effective width of sidewalks, and proportion of reverse pedestrians. The accuracy of the model was higher than that of the BP neural network pedestrian group flow forecast model.
(2) Because of the learning ability and data-driven nature of the neural network, the model adapts well to different environments. As the number of samples increases, the adaptation of the modular neural network to a changing environment can be further improved through regular learning, and the application range of the model can then be expanded.

Figure 1: The structure diagram of a modular neural network (MNN).

Figure 3: The curves of modular neural network (MNN) model and BP model.

Figure 5: Test error curves of modular neural network (MNN) model and BP model.

Table 1: Correlation analysis between pedestrian flow and influential factors.

Table 2: The statistical characteristics of the quantitative survey data of the 12 indexes.

Table 3: The statistical characteristics of the qualitative survey data of the 12 indexes.

Table 4: The test error of modular neural network (MNN) model and BP model.