A Markov Chain Based Demand Prediction Model for Stations in Bike Sharing Systems

Accurate prediction of transfer demand at bike stations is key to developing rebalancing solutions for the overutilization or underutilization problems that often occur in bike sharing systems. Station transfer demand prediction also supports bike station layout and optimization of the number of public bikes within each station. Traditional traffic demand prediction methods, such as the gravity model, cannot be easily adapted to forecasting bike station transfer demand, because impedance is difficult to define and bike stations have distinct characteristics (Xu et al. 2013). Therefore, this paper proposes a prediction method based on a Markov chain model. The proposed model is evaluated on field data collected from the Zhongshan City bike sharing system, for which the daily production and attraction of stations are forecast. The experimental results show that the proposed model achieves higher forecasting accuracy and better generalization ability.


Introduction
Bike sharing systems are in place in many cities around the world and are an increasingly important support for multimodal transport systems [1,2]. The imbalance between production and attraction at stations is one of the greatest problems in practical system operation at present [3]: it leaves users temporarily unable to rent or return bikes, hinders normal operation, and limits further promotion. Currently, the main remedy for this imbalance is to dispatch bikes among stations, which is inefficient. Accurate demand prediction can guide managers in purposeful planning and design and thus help to resolve the imbalance between production and attraction at stations.
A bike sharing system consists of bikes, roads, and fixed stations, and its demand differs clearly from that of motor vehicles and private bikes. Demand forecasting models for motor vehicles and private bikes can hardly be adapted to bike sharing systems [4]. It is therefore increasingly important to explore a reasonable demand forecasting model that matches the features of shared bikes, yet few studies address this demand and its prediction.
A full understanding of demand is a crucial step toward improving prediction accuracy. Bike sharing systems may differ from one another; nevertheless, the significant influence factors are the same, such as lanes, population, economic and social conditions, festivals, workdays, weather, and land use [5][6][7][8]. The degree of influence of each factor varies across time periods. Generally, demand influence factors such as lanes, population, and economic level can be regarded as constant within a day or a smaller time unit. Compared with the overall usage of a bike sharing system, station demand has its own particularities. Apart from the factors influencing overall demand, station features also need to be considered. Public bikes move among stations, so the production and attraction of a station are closely related to those of other stations. This inevitable constraint among stations' traffic volumes has rarely appeared in existing research [9].
Regression models, which are easy to operate and effective, are currently the main method for forecasting the usage of bike sharing systems; they consider important demand influence factors such as population, weather, workdays, land use, and environment [10,11]. When a regression model is used to forecast demand, a comprehensive understanding of the influence factors is the key to improving prediction accuracy. A single regression model can hardly suit the demand prediction of bike sharing systems, so forecasting with several regression models separately can help to obtain optimal results [12]. Apart from regression models, other methods, including the Fuzzy Inference Mechanism [13] and hybrid models [14], have also been applied successfully to forecast the whole demand of a system. It is worth noting that hybrid models represent an important development direction for bike sharing demand forecasting, since they may eliminate the inherent defects of a single model while retaining the advantages of various models.
The above summarizes forecasting of the whole demand of a system, which provides valuable references for station-level demand prediction. Stations are the basic unit of a bike sharing system; station-level demand directly affects the system's planning, design, and operation. However, few studies focus on station-level demand forecasting. The whole demand of a system and station-level demand differ clearly, owing to the particularities of stations and the constraints among them. Methods for whole-demand prediction therefore do not suit station-level demand forecasting well; more effort is needed to find reasonable prediction methods for station-level demand.
Traditional station-level demand forecasting methods are still mainly based on regression models, which fully consider influence factors [15]. However, few regression-based station-level demand predictions consider the traffic usage constraints among stations. Some scholars have addressed this problem; for instance, Rixey added station significance to the set of variables and built a regression model to forecast station demand in three American bike sharing systems [16]. However, even when most factors are considered, regression models have an inherent flaw for station demand forecasting: the traditional method takes the traffic zone as the forecast unit, but one traffic zone may contain several stations, which makes it difficult to forecast every station's demand precisely. To solve this problem, some scholars have built ARIMA [17] and modified ARIMA [18] models to forecast station demand; their results show that these methods have good predictive accuracy. Bayesian networks have also been used to forecast station demand. These new methods contribute to station demand prediction, but their effectiveness needs further testing [19].
Note that the lack of a standardized evaluation procedure (data, duration, error metric, etc.) prevents a fair comparison between them. Table 1 summarizes the main studies in bike sharing prediction to help readers grasp the state of research.
From the above analysis, we can conclude that the regression model is an ideal method for forecasting the whole demand of a bike sharing system. However, the regression model is not very suitable for station-level demand forecasting. In addition, it is uncertain whether the other methods, such as ARIMA and Bayesian networks, work well for different bike sharing systems.
The main purpose of this paper is to build a reasonable and efficient station demand forecasting model that considers the most significant influence factors and the constraints among stations. We therefore put forward a hybrid model based on a Markov chain and test it.

Methodology
2.1. Discrete-Time Markov Chain Model. Markov processes are widely used in modeling the dynamics of stochastic systems and the state transitions of complex stochastic systems. A Markov process $\{X(t), t \in T\}$ is a stochastic process with the property that, given the system state $X(t)$ at time $t$, the system state $X(s)$ at a time $s > t$ is not influenced by the system states $X(u)$ for $u < t$, that is, prior to the time $t$. $P$ is the transfer matrix; its elements are nonnegative probabilities, and the elements of each row sum to 1. Each element is the probability that a bicycle will be retained, acquired, or lost. $P^{(n)}$ denotes the $n$-step transfer matrix.

In other words, the probability of any particular future behavior of the process, if its current state is known, will not be altered by any additional knowledge concerning its past behavior. A discrete-time Markov chain is a Markov process whose state space is a finite or countable set and whose time index set is $T = (0, 1, 2, \ldots)$. In formal terms, the Markov property is that

$$P\{X_{n+1} = j \mid X_0 = i_0, \ldots, X_{n-1} = i_{n-1}, X_n = i\} = P\{X_{n+1} = j \mid X_n = i\}. \tag{1}$$

The probability of $X_{n+1}$ being in state $j$ given that $X_n$ is in state $i$ is called the one-step transition probability and is denoted by

$$p_{ij} = P\{X_{n+1} = j \mid X_n = i\}. \tag{2}$$

Multistep transition probabilities can be calculated from the one-step transition probabilities and the Markov property as follows:

$$p_{ij}^{(n)} = \sum_{k} p_{ik}^{(n-1)} p_{kj}, \qquad \text{that is,} \quad P^{(n)} = P^{(n-1)} P = P^{n}. \tag{3}$$

If the transition matrix $P$ is irreducible and aperiodic, the $n$-step transition matrix converges to a limit in which every row equals the stationary distribution $\pi$, whose entries generally differ from column to column; that is,

$$\lim_{n \to \infty} p_{ij}^{(n)} = \pi_j \quad \text{for all } i. \tag{4}$$

According to the properties of Markov chains on a finite state space, the stationary distribution $\pi$ can be calculated by (5):

$$\pi = \pi P, \qquad \sum_{j} \pi_j = 1, \qquad \pi_j \ge 0. \tag{5}$$

A Markov chain is a special case of a Markov process, which is defined as follows.

If the conditional probability of a future state given the past states can be written as

$$P\{X(s) \le x \mid X(u), u \le t\} = P\{X(s) \le x \mid X(t)\}, \qquad s > t, \tag{6}$$

then $\{X(t), t \in T\}$ is a Markov process. The proposed demand forecasting model is based on the Markov chain process, taking the bike transfer matrix at bike stations as the system state.
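As a minimal sketch of the stationary distribution property described above, the following pure-Python example computes an n-step transition matrix and approximates the stationary distribution by repeatedly applying the transition matrix. The 3-state matrix `P`, the function names, and the use of power iteration are illustrative assumptions, not values or procedures taken from the paper.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def n_step(p, n):
    """n-step transition matrix, i.e. P multiplied by itself n times (n >= 1)."""
    result = p
    for _ in range(n - 1):
        result = mat_mul(result, p)
    return result


def stationary(p, tol=1e-12, max_iter=10_000):
    """Approximate pi satisfying pi = pi P by power iteration from a uniform start."""
    n = len(p)
    pi = [1.0 / n] * n
    for _ in range(max_iter):
        nxt = [sum(pi[i] * p[i][j] for i in range(n)) for j in range(n)]
        if max(abs(x - y) for x, y in zip(nxt, pi)) < tol:
            return nxt
        pi = nxt
    return pi


# Hypothetical 3-state chain (each row sums to 1); not data from the paper.
P = [[0.5, 0.3, 0.2],
     [0.2, 0.6, 0.2],
     [0.3, 0.3, 0.4]]

pi = stationary(P)
P50 = n_step(P, 50)  # every row of P50 should be close to pi
```

Because all entries of this toy matrix are positive, the chain is irreducible and aperiodic, so the rows of `P50` agree with `pi` to within floating-point error.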

2.2. The Application in Bike Demand Estimation. The proposed Markov chain model for bike demand estimation predicts the probability of rentals and returns at each station. These probabilities are then converted into actual bike transfer numbers based on the total number of trips made with rental bikes. The detailed algorithm is as follows.
Step 1 (predict the whole traffic flow of the bike sharing system). This paper applies several effective models to forecast the whole traffic flow of the system, such as regression models, neural networks, SVM, and hybrid models. The model with the best prediction accuracy supplies the final prediction.
Step 2 (construct the bike sharing transfer matrix between stations). An $N \times N$ matrix $A$ is first built to store the bike sharing transfer data, where $N$ represents the number of stations. The matrix element $a_{ij}$ represents the number of bikes rented from station $i$ and returned to station $j$.

Step 3 (construct the transition probability matrix $P$ for bike rentals (production)). Assume $p_{ij}$ is an element of the matrix $P$; then

$$p_{ij} = \frac{a_{ij}}{\sum_{k=1}^{N} a_{ik}}. \tag{7}$$

Step 4 (construct the transition probability matrix $Q$ for bike returns (attraction)). Assume $q_{ij}$ is an element of the matrix $Q$:

$$q_{ij} = \frac{a_{ji}}{\sum_{k=1}^{N} a_{ki}}. \tag{8}$$

Step 5 (calculate the steady-state probability vectors). The model uses the stationary distribution property of the Markov chain model; namely,

$$\pi = \pi P, \qquad \pi^{*} = \pi^{*} Q, \tag{9}$$

where $\pi$ represents the steady-state probability vector for bike rentals (production), $\pi^{*}$ represents the steady-state probability vector for bike returns (attraction), and $\sum_{i} \pi_i = \sum_{i} \pi^{*}_i = 1$. The steady-state probability vectors of bike rentals and returns can be calculated by (9) using $P$ and $Q$.

Step 6 (predict the production and attraction of stations).

$$g = D\,\pi, \qquad h = D\,\pi^{*}, \tag{10}$$

where $D$ represents the whole traffic flow of the system predicted in Step 1, $g$ represents the vector of the stations' predicted production, and $h$ represents the vector of the stations' predicted attraction.
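Steps 2 to 6 can be sketched in a few lines of pure Python. This is only an illustration under stated assumptions: the rental matrix is taken to be the row-normalized trip matrix, the return matrix the row-normalized transpose, and the steady-state vectors are approximated by power iteration. The trip counts, the total flow of 200, and all function names are hypothetical.

```python
def row_normalize(m):
    """Divide each row by its sum so that every row is a probability vector."""
    return [[x / sum(row) for x in row] for row in m]


def transpose(m):
    """Swap rows and columns of a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]


def stationary(p, tol=1e-12, max_iter=10_000):
    """Approximate pi satisfying pi = pi P by power iteration from a uniform start."""
    n = len(p)
    pi = [1.0 / n] * n
    for _ in range(max_iter):
        nxt = [sum(pi[i] * p[i][j] for i in range(n)) for j in range(n)]
        if max(abs(x - y) for x, y in zip(nxt, pi)) < tol:
            return nxt
        pi = nxt
    return pi


def predict_station_demand(trips, total_flow):
    """Return (production, attraction) per station for a forecast total flow."""
    p = row_normalize(trips)             # assumed form of the rental matrix
    q = row_normalize(transpose(trips))  # assumed form of the return matrix
    pi = stationary(p)                   # steady-state vector for rentals
    pi_star = stationary(q)              # steady-state vector for returns
    production = [total_flow * x for x in pi]
    attraction = [total_flow * x for x in pi_star]
    return production, attraction


# Hypothetical trip counts: trips[i][j] = bikes rented at i and returned at j.
trips = [[10, 30, 20],
         [25, 5, 10],
         [15, 20, 15]]
total_flow = 200  # hypothetical system-wide forecast from Step 1

production, attraction = predict_station_demand(trips, total_flow)
```

Since both steady-state vectors sum to 1, the predicted station productions and attractions each add up to the system-wide forecast, which is how the constraint among stations enters the prediction.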

Empirical Study: A Case Study from Zhongshan City
3.1. Zhongshan Bike Sharing System. Zhongshan is a medium-sized city in China, with a population of approximately 600,000 and an urban area of about 170 square kilometers. The city's bike sharing system was launched in 2011. It has since grown to an inventory of about 7000 bikes spread over 167 stations in the urban area. Renting a bike requires users to pay with a fare card linked to a user account; however, bike usage is free in the first hour of a day. The mode share of bikes in Zhongshan is currently 20%, of which shared bikes account for 5%. The utilization ratio of the Zhongshan bike sharing system has reached 97%, and the average daily usage is 11,500 trips. We selected the Zhongshan bike sharing system as the prediction object because it is popular with users and operates stably.
In general, a bike sharing system goes through an unstable period and then a stable period between construction planning and normal operation. The Zhongshan bike sharing system was put into operation in 2011. To ensure that the system is stable within the prediction period, this paper uses data from 2013 as the training data and forecasts the demand in 2014. After excluding invalid data and the usage records of administrators, we obtain 5,645,070 trip records, covering 365 days in 2013 and 224 days in 2014. Each record includes the station number, time, user card number, bike number, fee, and other information on renting and returning shared bikes. Mining these data yields useful information such as the trip matrix and the spatial and temporal patterns of users. This paper selects the 167 public bike stations in the Zhongshan urban area as the research object. To give readers an intuitive view of the stations' locations and traffic volumes, we surveyed the traffic volumes of the 167 stations in 2014, as shown in Figure 1. Each dot in the figure marks the actual location of a station, and the size of the circle reflects the station's traffic volume: the larger the circle, the greater the traffic volume.
Figure 1 shows that the distribution of traffic volume across shared bike stations is uneven; in particular, the traffic volume of stations located outside the city is generally low. This uneven distribution makes station demand forecasting considerably more difficult.

3.2. The Prediction of Bike Sharing System's Demand. Predicting the whole demand is the first step of station demand forecasting, and its precision has a major influence on the final result. The prediction period in this article is one day; one prediction is made per day. Accordingly, when choosing the demand influence factors of the bike sharing system, we focus on short-term factors. The preliminary influence factors are season, holidays, weekends, temperature, rainfall, weather, wind, special cases, and so on. To avoid the limitations of a single model, this paper uses an MLP (neural network), a support vector machine, and a regression model, each separately, to predict the daily traffic volume of the Zhongshan bike sharing system. The architecture of the MLP is as follows: the input signal $X_k$ passes through the intermediate (hidden) nodes to the output nodes and, after a nonlinear transformation, produces the output signal $Y_{k+1}$. Each training sample comprises an input vector $X_k$ and an expected output $t$; according to the deviation between the network output $Y_{k+1}$ and the expected output $t$, the connection weights between the input nodes and the hidden nodes and between the hidden nodes and the output nodes are adjusted, and, after repeated training, the network retains the weights that give the minimum error. This paper uses the relative error as the evaluation index of the whole demand prediction and chooses the best of the three methods' results.
The forecasting results of the three methods are given in Table 2, which shows that the regression model has the lowest relative error among the three methods. The prediction of the regression model is therefore taken as the final one.
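The selection rule described above (compute the relative error of each candidate model and keep the one with the lowest value) can be sketched as follows. The daily volumes and the three forecast series are made-up numbers chosen only to exercise the code; they are not the values behind Table 2, and the function names are hypothetical.

```python
def mean_relative_error(actual, predicted):
    """Average of |predicted - actual| / actual over all days."""
    return sum(abs(p - a) / a for a, p in zip(actual, predicted)) / len(actual)


def select_best(actual, forecasts):
    """Return the name of the model with the lowest mean relative error, plus all errors."""
    errors = {name: mean_relative_error(actual, pred)
              for name, pred in forecasts.items()}
    best = min(errors, key=errors.get)
    return best, errors


# Hypothetical daily system-wide trip counts and model forecasts.
actual = [11000, 12000, 9000, 11500]
forecasts = {
    "regression": [10800, 12300, 9400, 11400],
    "mlp": [10000, 13000, 8000, 12500],
    "svm": [11900, 11000, 9900, 10500],
}

best, errors = select_best(actual, forecasts)
```

With these toy inputs, `best` is whichever key has the smallest entry in `errors`; the same function applies unchanged to any number of candidate models.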
In the multilayer perceptron neural network, three layers (one hidden layer) are set up with as few hidden nodes as possible: the input layer has 7 nodes, the hidden layer has 5 nodes, and the output layer has 1 node. The initial weights are random numbers generated in the range −0.5 to +0.5. The minimum training rate is 0.9, and the dynamic parameter is 0.7. The allowed error generally lies between 0.00001 and 0.001. The number of iterations is 1000. The sigmoid parameter is 0.9.
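To make the 7-5-1 layout concrete, the following sketch builds such a network with weights drawn uniformly from −0.5 to +0.5 and runs a single forward pass. It is a minimal illustration, not the paper's implementation: the bias term per node, the sigmoid on the output node, the fixed random seed, and the input values are all assumptions, and training (weight adjustment) is omitted.

```python
import math
import random


def sigmoid(x):
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-x))


def init_layer(n_in, n_out, rng):
    """One weight per input plus a bias per node, each drawn from [-0.5, 0.5]."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
            for _ in range(n_out)]


def forward(layer, inputs):
    """Apply one fully connected sigmoid layer; the last weight of a node is its bias."""
    return [sigmoid(sum(w * x for w, x in zip(node[:-1], inputs)) + node[-1])
            for node in layer]


rng = random.Random(0)            # fixed seed so the sketch is reproducible
hidden = init_layer(7, 5, rng)    # 7 inputs -> 5 hidden nodes
output = init_layer(5, 1, rng)    # 5 hidden nodes -> 1 output node

# Hypothetical scaled values for the 7 input factors.
x = [0.2, 1.0, 0.0, 0.6, 0.1, 0.3, 0.0]
h = forward(hidden, x)
y = forward(output, h)            # single value in (0, 1), to be rescaled to trips
```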
For the SVM model with an RBF kernel, the initial options are the parameters $s$, $t$, $c$, and $g$, where $s = 0$ selects the SVC type of support vector machine and $t = 0$ selects the RBF kernel function. The parameter $c$ is the penalty coefficient (default 1). In general, the penalty coefficient must be moderate: an excessively low value cannot meet the requirements of the toolkit (e.g., when $c = 0$, the toolkit stops running), whereas an excessively high value causes overlearning. The parameter $g$ is the gamma value in the kernel function (the default is $1/k$, where $k$ is the number of feature attributes of the sample data; 7 features are selected here).

Initial parameters usually cannot guarantee an optimal model; therefore, this paper uses a grid search over the interval ($c \in [1,5]$, $g \in (0, 0.14)$) to find $c$ and $g$ for the support vector machine, and the final SVM model parameters are "$c = 1$, $g = 0.01$." For the linear regression model, parameter selection aims at the lowest possible residual with few variables, measured by the adjusted $R^2$. The daily usage of the bike sharing system was predicted by the linear regression model; $R^2$ is 0.821, which reflects a good fit.
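A grid search of the kind described above simply evaluates every combination of candidate values and keeps the best one. The sketch below assumes integer candidates for the penalty coefficient in [1, 5] and a step of 0.01 for gamma inside (0, 0.14); both grids are assumptions, since the text does not state the step sizes. The scoring function `toy_score` is a hypothetical stand-in for a validation error and has nothing to do with the paper's data.

```python
def grid_search(score, c_values, g_values):
    """Return the (c, g, score) triple with the lowest score over the grid."""
    best = None
    for c in c_values:
        for g in g_values:
            s = score(c, g)
            if best is None or s < best[2]:
                best = (c, g, s)
    return best


c_grid = [1, 2, 3, 4, 5]                              # penalty coefficient in [1, 5]
g_grid = [round(0.01 * k, 2) for k in range(1, 14)]   # 0.01 ... 0.13, inside (0, 0.14)


def toy_score(c, g):
    """Hypothetical validation error with its minimum at c = 3, g = 0.05."""
    return (c - 3) ** 2 + (g - 0.05) ** 2


best_c, best_g, best_score = grid_search(toy_score, c_grid, g_grid)
```

In practice `toy_score` would be replaced by a function that trains the SVM with the given pair and returns its error on held-out days.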
Table 3 lists the main influence factors and the results of the regression analysis. Many factors can influence the linear regression, but too many factors may reduce prediction accuracy; therefore, in the experiment we first selected 15 influencing factors, including weather, temperature, and date, and, after comparing their influence, finally chose the 7 main factors shown in Table 3. The selected variables include spring and autumn, weekday, temperature, weather, quality, and special events. Special events include bad weather and the Spring Festival. Figure 2 compares the predicted daily demand of the bike sharing system in 2014 (224 days of forecasts) with the real values. The curves of predicted and actual values fit well, reflecting a good prediction.

3.3. The Demand Prediction of Stations. After obtaining the forecast of the system's daily trip volume, we forecast the production and attraction of each station. To understand how the proposed model performs at different times, dates are divided into two categories: special dates and normal dates. A special date is one on which a particular event occurs, for instance, the Spring Festival or extreme weather; on such dates the public bike system behaves abnormally. Normal dates are further divided into working days and nonworking days to test how demand forecasting differs between them.
In this paper, we use the real 2013 data as the original input data for the model and predict the production and attraction of each station from January 1 to August 12, 2014, a period that includes 23 special dates and 201 normal dates (139 working days and 62 nonworking days). With the relative error as the evaluation index, we forecast the traffic volume of the 167 stations in the city of Zhongshan.
First, we analyze the overall prediction results. The evaluation index is the average relative error over all stations, as shown in Table 4. The average relative error on special dates, under abnormal circumstances, is up to 50%. The average relative error on normal dates is less than 30% for all stations. Prediction accuracy on working days is higher than on nonworking days, which is directly associated with the larger share of commuting on working days. Interestingly, the average relative error of all stations' attraction is slightly lower than that of production, which may be attributable to users' travel habits.
As an auxiliary illustration, we use the 2013 data as the original data to predict the 2014 data over a one-year timespan, which shows the error between the prediction results and the original results in the same period. Figure 3 shows the average relative error of daily demand prediction for stations; most error values are about 20%, but some abnormal errors exceed 50%.
The above is the overall analysis of traffic volume prediction. Next, this paper analyzes the prediction for individual stations to give readers a clearer understanding of the prediction performance. We average the prediction error of each station over the 224 days and classify the results, as shown in Table 5. The prediction error distribution of each station's production and attraction is shown in Figure 4. Table 5 shows that the relative error is negatively correlated with the actual traffic volume. The relative error of stations with a large traffic volume is generally less than 20%, which indicates good prediction performance. Taken together, Table 5 and Figure 4 show that the relative error of stations located on the city periphery with a small traffic volume is generally higher than 30%; however, the absolute error of the prediction at these stations is smaller, so the prediction results still meet practical requirements. In Table 5, PP denotes the proportion of these stations' production in the system, and PA denotes the proportion of these stations' attraction in the system. At present, there is no standardized evaluation system for public bike systems, which directly affects how forecasting results are judged. A reasonable and effective evaluation system needs more research. In this paper, the relative error is the main evaluation index and the absolute error is added as a supplementary measure, which allows a more objective judgment of the prediction results.
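The interplay between relative and absolute error discussed above can be illustrated with a small sketch that averages both measures per station and assigns each station to an error level. The station names, daily volumes, and the 20% and 30% cut-offs used for the levels are assumptions for illustration; the actual bins of Table 5 are not reproduced here.

```python
def station_errors(actual, predicted):
    """Per-station (mean relative error, mean absolute error) over all days.

    actual and predicted map a station name to a list of daily volumes.
    Days with zero actual volume are skipped in the relative error.
    """
    out = {}
    for station, a_days in actual.items():
        p_days = predicted[station]
        pairs = [(a, p) for a, p in zip(a_days, p_days) if a > 0]
        rel = sum(abs(p - a) / a for a, p in pairs) / len(pairs)
        abs_err = sum(abs(p - a) for a, p in zip(a_days, p_days)) / len(a_days)
        out[station] = (rel, abs_err)
    return out


def error_level(rel):
    """Assumed three-level classification of the mean relative error."""
    if rel < 0.20:
        return "<20%"
    if rel <= 0.30:
        return "20-30%"
    return ">30%"


# Hypothetical stations: one busy central station, one quiet peripheral station.
actual = {"busy": [200, 220, 180], "quiet": [5, 4, 6]}
predicted = {"busy": [190, 230, 200], "quiet": [7, 6, 4]}

errors = station_errors(actual, predicted)
levels = {name: error_level(rel) for name, (rel, _) in errors.items()}
```

In this toy case the quiet station has the larger relative error but the smaller absolute error (2 trips per day against roughly 13 for the busy station), which is the pattern the text describes for peripheral stations.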

Conclusions and Future Work
Effective demand forecasting is very important for bike sharing system planning and daily operation management. Based on an analysis of recent public bike demand forecasting research, we built a station demand prediction system based on a Markov chain model, which outperforms previously developed solutions. From the model's performance in the study of the Zhongshan bike sharing system, we draw the following conclusions.
First, the whole demand and the station-level demand of a bike sharing system differ. For whole-demand forecasting, the regression model works well, but it cannot satisfy the requirements of station-level demand forecasting. To ensure good predictions, station-level demand forecasting needs to consider not only the most significant influence factors but also the constraints among stations, which few papers have noted. This paper considers both and obtains good prediction results.
Second, traffic flows among stations are very uneven, so the single evaluation index can hardly reflect the complete

Figure 1: The distribution of bike sharing stations' cumulative usage in 2013 in Zhongshan.

Figure 2: The predicted results of the system's daily demand in 2014.

Figure 3: The distribution of the average relative error of all stations over all days.

Figure 4: The distribution of prediction error of stations' production and attraction. (a) Prediction error of stations' production. (b) Prediction error of stations' attraction.

Table 1: The summary of public bike demand forecasting studies.

Table 2: Error rate comparison of these models.

Table 3: The result of the regression model.

Table 4: The prediction error for different types of date.

Table 5: Statistical analysis of stations at different levels of relative error.