Holiday Destination Choice Behavior Analysis Based on AFC Data of Urban Rail Transit

For urban rail transit, the spatial distribution of passenger flow on holidays usually differs from that on weekdays. Holiday destination choice behavior analysis is the key to understanding passengers' destination choice preferences and then obtaining the OD (origin-destination) distribution of passenger flow. This paper proposes a holiday destination choice model based on AFC (automatic fare collection) data of the urban rail transit system, which is expected to provide theoretical support for holiday travel demand analysis for urban rail transit. First, based on Guangzhou Metro AFC data collected on New Year's days, the characteristics of holiday destination choice behavior of urban rail transit passengers are analyzed. Second, holiday destination choice models based on the MNL (multinomial logit) structure are established for each of the New Year's days, taking into account some novel explanatory variables (such as attractiveness of destination). Then, the proposed models are calibrated with AFC data from Guangzhou Metro using WESML (weighted exogenous sample maximum likelihood) estimation and compared with base models in which attractiveness of destination is not considered. The results show that the ρ² values are improved by 0.060, 0.045, and 0.040 for January 1, January 2, and January 3, respectively, when destination attractiveness is considered.


Introduction
Holiday travel demand has distinct characteristics and regularities that differ from those of weekdays. As passenger flow increases greatly and peak hours are extended during holidays, traffic congestion in many cities has become more and more serious. Urban rail transit, as an important component of the urban integrated transport system, carries more and more person trips. To organize transportation and adjust the operation plan effectively, it is necessary first to know the origin-destination (OD) flow distribution, especially when the subway network changes and the operation plan is adjusted. Since the OD flow distribution is the aggregate expression of individuals' destination choices, the study of holiday destination choice behavior for urban rail transit helps capture the characteristics of holiday trips and provides theoretical support for holiday travel demand analysis.
Holiday-related decision making and behavior are important study topics in the fields of transportation and tourism.
The literature focusing on holiday destination choice decisions generally employs the multinomial logit (MNL) or nested logit (NL) model based on random utility theory. Hong et al. [1] classified similar destinations into consideration sets and then proposed a two-stage nested multinomial model in which affective images of the destinations and the individual's constraints were taken into consideration. The results supported the effectiveness of the categorization concept and the sequential process in destination choice. Asakura and Iryo [2] studied a simple index of tourist behavior using tracking data collected with a mobile instrument and then analyzed the topological characteristics of tourist behavior based on the proposed index and actual data collected in Kobe. Lamondia et al. [3] used the unique Eurobarometer vacation travel survey to jointly model travelers' choice of holiday destination and travel mode, while also considering an extensive array of stated motivation-based preference and value factors. van Cranenburgh et al. [4] estimated discrete portfolio vacation choice models based on data obtained in a novel free-format stated preference off revealed preference (SP-off-RP) choice experiment, in which SP alternatives are constructed by pivoting off late consideration set alternatives.
Apart from holiday destination choice decisions, destination choice on weekdays is also widely studied [5]. Bhat et al. [6] analyzed destination choice behavior on the basis of a household survey conducted in Dallas-Fort Worth. The research results were applied in the analysis of land use planning. Sivakumar and Bhat [7] proposed a noncommuter travel destination choice model in which regional size variables (including population, employment, and regional size), trip characteristics, level of transportation service, and psychological feedback were considered. However, these existing destination choice models rarely take account of the actual traffic state. Ye and Wen [8] proposed a new destination choice model based on link flows, together with a searching algorithm for an observed link set. In addition, many emerging technologies, such as GIS [9] and GPS, are also used to study destination choice behavior. Shao and He [10] built a set of reachable destinations according to a space-time prism acquired from GIS data and then selected the destination from this choice set.
At the theoretical level, explaining destination choice behavior with a disaggregate model is better than using the traditional four-stage method, because destinations are discrete alternatives. However, some problems remain in the studies above. First, most studies focus on the whole network of the integrated transport system or on the road network, while holiday destination choice decisions for the urban rail transit network are rarely studied. Second, current destination choice methods are mostly based on the MNL or NL model, which takes travel destination and mode choice into consideration. However, each OD pair has several feasible routes, and route choice is usually neglected. Besides, estimating these models requires plenty of disaggregate data, which in turn requires a special questionnaire survey. Third, most studies focus on factors such as personal attributes [11], trip characteristics [12], and destination characteristics (destination area size, the presence of different kinds of activities, exchange rate, etc.) [13,14], while some destination characteristics, such as attractiveness of destination, are still not taken into consideration. Therefore, this study analyzes holiday destination choice behavior based on a disaggregate model, using the data collected by the AFC system of urban rail transit, and comprehensively takes route choice and more influencing factors into account in the proposed model.
The paper is organized as follows. In the first section, the background and significance of holiday destination choice behavior analysis are provided, along with a review of previous work on destination choice decisions. The second section analyzes the characteristics of holiday trips and destination choice preference that differ from those of weekdays. In the third section, holiday destination choice models are established for each of the New Year's days, and some novel explanatory variables (such as attractiveness of destination) are introduced in the proposed models. In the fourth section, based on the data collected from Guangzhou Metro, the models are calibrated and compared with base models (in which attractiveness of the destination is not considered). The conclusions are then given in the final section.

Characteristics of Holiday Destination Choice Behavior
To analyze the characteristics of holiday trips and establish the holiday destination choice model, OD flow around New Year's days was collected by the AFC system at Guangzhou Metro stations. The Guangzhou Metro system has seven lines and 123 stations, 12 of which are transfer stations.

Characteristics of Holiday Trips.
Travel demand during holidays presents different characteristics from that on weekdays. First, in terms of temporal distribution, the OD flow of the urban rail transit network changes greatly around the holiday, as presented in Figure 1.
As shown in Figure 1, network OD flow increases significantly on December 31 and January 1, while it gradually decreases on January 2 and January 3. This is because most people use the earlier stage of the holiday for entertainment and the last day for rest. In addition, peak hours are extended because there are few space-time constraints during the holiday.
Second, in terms of spatial distribution, OD flow is distributed more widely during holidays than on weekdays. To express the dispersion of the spatial distribution, the TDD (trip distribution dispersion) is defined as follows:

$$\mathrm{TDD} = \frac{\bar{q}}{\bar{q}_{nz}},$$

where TDD is the trip distribution dispersion, between 0 and 1 (the larger the TDD value is, the more widely OD flow is distributed); $\bar{q}$ is the average OD flow of all OD pairs, person times/day; and $\bar{q}_{nz}$ is the average OD flow of nonzero OD pairs, person times/day. Because both averages are taken over the same total flow, TDD equals the share of OD pairs that carry nonzero flow. The TDD values for New Year's days are listed in Table 1.
As shown in Table 1, TDD values on New Year's days are higher than that on the workday, which illustrates that people travel more widely on holidays. Moreover, the TDD value on January 1 is the highest, which suggests that more people tend to take a trip on the first day of the holiday.
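The TDD definition can be illustrated with a minimal Python sketch. It assumes that TDD is the ratio of the average flow over all OD pairs to the average flow over nonzero OD pairs, and that same-station pairs (the diagonal of the OD matrix) are excluded; the function name and the toy matrix are hypothetical and are not taken from the Guangzhou data.

```python
import numpy as np

def trip_distribution_dispersion(od):
    """TDD = mean flow over all OD pairs / mean flow over nonzero OD pairs."""
    od = np.asarray(od, dtype=float)
    # Assumption: a station paired with itself is not counted as an OD pair.
    off_diagonal = ~np.eye(od.shape[0], dtype=bool)
    flows = od[off_diagonal]
    nonzero = flows[flows > 0]
    if nonzero.size == 0:
        return 0.0
    return float(flows.mean() / nonzero.mean())

# Toy 3-station OD matrix (rows: origins, columns: destinations), person times/day.
od_toy = np.array([[0, 120, 0],
                   [80, 0, 40],
                   [0, 0, 0]])
tdd = trip_distribution_dispersion(od_toy)
```

In this toy matrix three of the six OD pairs carry flow, so the value is 0.5; a network in which every OD pair is used would give a value of 1.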

Holiday Destination Choice Preference.
Apart from the characteristics of holiday trips mentioned above, holiday destination choice preference also changes greatly. Most people prefer to go to tourist attractions and shopping centers for entertainment, and fewer people take office areas as their destination. To illustrate the destination choice preference, Kengkou Station, which is in a residential area, is taken as the origin station, and some typical OD pairs are selected, as shown in Figure 2.
As presented in Figure 2, passenger flow to destinations in office areas (such as Nongjiangsuo and Haizhu Park Station) decreases significantly on New Year's days. However, destinations near tourist attractions and shopping centers (such as Changshoulu, Gongyuanqian, Tiyuxilu, and Yuexiu Park Station) face higher passenger flow. Besides, Fangcun Station, in a residential area, also faces higher passenger flow, which shows that the holiday is used to visit friends and relatives as well. In addition, passenger flow to Guangzhou Railway Station increases too, because some people prefer to take a long journey during the holiday. Moreover, passenger flow to Changshoulu and Gongyuanqian Station is much higher than that to Fangcun and Guangzhou Railway Station, which indicates that the holiday is mainly spent on shopping and sightseeing.
In addition, the temporal pattern of passenger flow varies with the nature of the destination, as shown in Figure 3. Shopping centers and tourist attractions (such as Changshoulu and Yuexiu Park Station) reach their peak between 10:30 and 16:30, while the peak of railway stations (Guangzhou Railway Station, for example) appears earlier, usually between 8:00 and 17:00, with a slight decrease in passenger flow between 12:30 and 13:00. Residential areas (Fangcun Station, for example) have a morning peak and an evening peak, but the morning peak is postponed.

Model Development
Holiday destination choice decisions are generally analyzed with a disaggregate model, such as MNL or NL. However, the disaggregate data needed for disaggregate modeling usually require a special questionnaire survey, which calls for a lot of time and manpower. On the other hand, a large amount of aggregate data, such as OD flow and operation information, is usually collected automatically by the AFC system of urban rail transit. Therefore, this paper adopts the concept of the representative individual, in which passengers with the same origin and destination are regarded as making the same destination choice and are treated as a group of persons with homogeneous personal attributes. On this basis, the aggregate OD flow data are transformed into disaggregate data and applied to the estimation of the disaggregate model.
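The transformation from aggregate OD flow to disaggregate records can be sketched as follows. This is a minimal illustration under the representative-individual assumption: every OD pair with nonzero flow becomes one record whose chosen alternative is the destination and whose trip count is kept for later weighting. The function name, the record fields, and the station labels are hypothetical.

```python
def od_to_records(od, station_names):
    """Turn an OD matrix into one representative-individual record per used OD pair."""
    records = []
    for o, origin in enumerate(station_names):
        for d, destination in enumerate(station_names):
            trips = int(od[o][d])
            # Skip same-station pairs and OD pairs nobody travelled on.
            if o != d and trips > 0:
                records.append({"origin": origin, "chosen": destination, "trips": trips})
    return records

# Toy OD matrix, person times/day (rows: origins, columns: destinations).
od_toy = [[0, 120, 0],
          [80, 0, 40],
          [0, 0, 0]]
records = od_to_records(od_toy, ["A", "B", "C"])
```

Each record stands for all passengers of that OD pair, so the `trips` field carries the information needed to weight the record during estimation.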

Model Methodology.
Holiday destination choice models are generally derived from random utility theory. According to random utility theory, passenger $n$ tries to maximize the utility $U_{in}$ of choosing destination $i$ from the destination choice set $C_n$. The utility $U_{in}$ is expressed as a random variable consisting of a deterministic component $V_{in}$ and an additive random error term $\varepsilon_{in}$ as follows:

$$U_{in} = \theta V_{in} + \varepsilon_{in}, \qquad V_{in} = \sum_{k} \beta_k X_{ink},$$

where $V_{in}$ is the generalized cost of destination $i$; $\theta$ is a positive parameter related to the perception residual, usually equal to 1; $X_{ink}$ is the value of destination attribute $k$; and $\beta_k$ is the corresponding parameter of attribute $k$.
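A destination choice decision generally includes destination choice and route choice, and the route choice is made conditional on the selection of a specific destination. According to stated preference (SP) survey data, more than 50% of passengers would usually choose the shortest-time route. Therefore, for simplicity, the level of service (LOS) of the shortest-time route for a specified OD pair is used as the LOS variables of that OD pair, and the destination choice model based on the MNL structure is established as follows:

$$P_{in} = \frac{\exp(\theta V_{in})}{\sum_{j \in C_n} \exp(\theta V_{jn})},$$

where $P_{in}$ is the probability that passenger $n$ chooses destination $i$ from the choice set $C_n$, $V_{in}$ is the generalized cost of destination $i$, and $\theta$ is the positive scale parameter (usually equal to 1).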
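The MNL choice probability can be sketched in a few lines of Python. The example assumes the standard logit form, in which the probability of a destination is the exponential of its scaled deterministic utility divided by the sum over the choice set; the function name and the utility values are hypothetical.

```python
import numpy as np

def mnl_probabilities(v, theta=1.0):
    """Standard MNL probabilities for one choice set of deterministic utilities v."""
    scaled = theta * np.asarray(v, dtype=float)
    # Subtracting the maximum leaves the probabilities unchanged and avoids overflow.
    scaled = scaled - scaled.max()
    exp_v = np.exp(scaled)
    return exp_v / exp_v.sum()

# Three hypothetical destinations; the second has the highest utility.
probs = mnl_probabilities([-1.2, -0.4, -2.0])
```

The probabilities sum to one, and the destination with the highest deterministic utility receives the largest share.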
To estimate the destination choice model, as mentioned above, the aggregate OD flow data need to be transformed into a disaggregate form, because aggregate data are used inefficiently in the Berkson-Theil method [15]. Moreover, there are too many destination alternatives (123 alternatives) to be dealt with in a discrete choice model. For a large choice set, it has been shown that the model parameters can be estimated consistently with a subset of alternatives (Ben-Akiva and Lerman [16]). Therefore, using the choice-based sampling method, the destination alternative set includes the actually chosen destination and 9 additional destinations randomly selected from the set of nonchosen destinations [17], which reduces the complexity of the model while preserving the accuracy of parameter calibration.
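The construction of the reduced alternative set can be sketched as follows. The example assumes simple random sampling without replacement from the nonchosen destinations; the function name and the integer station identifiers are hypothetical, and whether the origin station itself should be excluded from the candidates is left open here.

```python
import random

def sample_choice_set(chosen, all_destinations, n_extra=9, rng=None):
    """Return the chosen destination plus n_extra randomly drawn nonchosen ones."""
    rng = rng or random.Random()
    non_chosen = [d for d in all_destinations if d != chosen]
    # random.Random.sample draws without replacement, so no destination repeats.
    return [chosen] + rng.sample(non_chosen, n_extra)

stations = list(range(123))  # 123 stations, labelled 0..122 for illustration
choice_set = sample_choice_set(chosen=17, all_destinations=stations,
                               rng=random.Random(0))
```

Each record then has 10 alternatives instead of 123, with the chosen destination always placed first.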
Moreover, weighted exogenous sample maximum likelihood (WESML) [18] estimation is generally applied to calibrate model parameters under choice-based sampling. A weight factor $w(g)$ is introduced into the likelihood function for the aggregate data set to remove the bias between the sample and the population. It can be expressed as follows:

$$L(\beta) = \sum_{n} \sum_{i \in C_n} \delta_{in}\, w(g) \ln P_{in}, \qquad w(g) = \frac{Q(g)}{H(g)}, \qquad Q(g) = \frac{\mathrm{trip}(g)}{\mathrm{trips}}, \qquad H(g) = \frac{1}{\mathrm{rows}},$$

where $L(\beta)$ is the value of the log-likelihood function; $\delta_{in} = 1$ if the chosen destination is $i$ and 0 otherwise; $Q(g)$ is the population proportion for group $g$; $H(g)$ is the sampling proportion for group $g$; trip($g$) is the OD volume of group $g$; trips is the total OD volume of the network; and rows is the number of OD pairs.
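A minimal sketch of the weighted log-likelihood is given below. It assumes that each OD pair forms one group, that the population share of a group is its OD volume divided by the total volume, and that the sampling share is one over the number of OD pairs, so that the weight is the ratio of the two. The function names and the toy inputs are hypothetical.

```python
import numpy as np

def wesml_weights(trips_per_od):
    """Weight of each OD-pair record: population share divided by sampling share."""
    trips_per_od = np.asarray(trips_per_od, dtype=float)
    population_share = trips_per_od / trips_per_od.sum()  # trip(g) / trips
    sampling_share = 1.0 / trips_per_od.size              # 1 / rows
    return population_share / sampling_share

def wesml_loglik(utilities, chosen_idx, weights):
    """Weighted MNL log-likelihood; utilities has shape (records, alternatives)."""
    utilities = np.asarray(utilities, dtype=float)
    shifted = utilities - utilities.max(axis=1, keepdims=True)
    log_p = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    rows = np.arange(utilities.shape[0])
    # Only the chosen alternative of each record contributes to the sum.
    return float(np.sum(weights * log_p[rows, chosen_idx]))

weights = wesml_weights([120, 80, 40])
ll_zero = wesml_loglik(np.zeros((3, 10)), np.array([0, 0, 0]), weights)
```

With all utilities equal to zero, every alternative has probability 1/10, so `ll_zero` is the weighted log-likelihood of the equal-shares model. An optimizer would maximize `wesml_loglik` over the coefficients that generate the utilities.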

Explanatory Variables.
Personal characteristics also influence the destination choice decision, but they cannot be obtained from the AFC system. Therefore, three types of explanatory variables are considered in the proposed model: accessibility of the destination, matching degree of each OD pair, and attractiveness of the destination. Accessibility of the destination is represented by the attributes of the feasible route between the OD pair, namely, in-vehicle travel time, transfer time, and the number of transfers. It can be expressed as follows:

$$AL_i = \alpha_1 T_i + \alpha_2 TT_i + \alpha_3 NT_i,$$

where $AL_i$ is the accessibility of destination $i$; $T_i$ is the in-vehicle travel time between the OD pair, h; $TT_i$ is the transfer time, h; $NT_i$ is the number of transfers, times; and $\alpha_1$, $\alpha_2$, and $\alpha_3$ are coefficients.
The matching degree of an OD pair expresses the joint influence of the origin and destination attributes on destination choice behavior, including the land use type, the land use intensity, and the same-line indicator (which indicates whether the origin and destination stations are located on the same line). For the land use type variable, all stations are first classified into different types using a fuzzy clustering method. The three factors are all treated as dummy variables, and the matching degree is their weighted sum:

$$MD_{oi} = \gamma_1 UT_{oi} + \gamma_2 UI_{oi} + \gamma_3 CL_{oi},$$

where $MD_{oi}$ is the matching degree of origin station $o$ and destination station $i$; $UT_{oi}$, $UI_{oi}$, and $CL_{oi}$ are the dummy variables for the land use type, the land use intensity, and the same-line indicator, respectively; and $\gamma_1$, $\gamma_2$, and $\gamma_3$ are coefficients.

As regards attractiveness of the destination, the correlation degree is defined first. The correlation degree of a destination is the number of origin stations whose passenger flow to that destination is larger than 50 person times a day. The greater the correlation degree is, the more attractive the destination is. To obtain the correlation degree of new subway stations, the relationship among the correlation degree, trip generation, and trip attraction of the stations is analyzed. There is a significant correlation between correlation degree and trip generation, as presented in Table 2. Therefore, the correlation degree of new stations can be obtained from the trip generation of these stations. In addition, the trip attraction of a destination also represents its attractiveness to some extent, since passengers tend to choose destinations that are heavily chosen by other passengers. The attractiveness of a destination is closely related to the land use type and intensity around the station. The attractiveness of a destination can therefore be expressed as follows:

$$AT_i = \lambda_1 CD_i + \lambda_2 TA_i,$$

where $AT_i$ is the attractiveness of destination $i$; $CD_i$ is the correlation degree of destination $i$; $TA_i$ is the trip attraction of destination $i$, person times/day; and $\lambda_1$ and $\lambda_2$ are coefficients.
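The correlation degree and trip attraction of each destination can be computed directly from the OD matrix, as the following minimal sketch shows. It assumes that rows are origins and columns are destinations, that same-station flows are ignored, and that the threshold of 50 person times a day is applied strictly; the function names and the toy matrix are hypothetical.

```python
import numpy as np

def correlation_degree(od, threshold=50):
    """Number of origin stations sending more than `threshold` trips/day to each destination."""
    od = np.asarray(od, dtype=float).copy()
    np.fill_diagonal(od, 0)  # same-station flows are ignored (assumption)
    return (od > threshold).sum(axis=0)

def trip_attraction(od):
    """Total daily trips arriving at each destination (column sums)."""
    od = np.asarray(od, dtype=float).copy()
    np.fill_diagonal(od, 0)
    return od.sum(axis=0)

od_toy = np.array([[0, 120, 30],
                   [80, 0, 60],
                   [51, 50, 0]])
cd = correlation_degree(od_toy)
ta = trip_attraction(od_toy)
```

In the toy matrix the flow of exactly 50 from the third station to the second does not count, because the definition requires a flow larger than 50.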
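To sum up, considering the above factors, the generalized cost function can be expressed as follows:

$$V_{in} = AL_i + MD_{oi} + AT_i,$$

where $V_{in}$ is the generalized cost of destination $i$ for passenger $n$; $o$ is the origin station of passenger $n$; and $AL_i$, $MD_{oi}$, and $AT_i$ are the accessibility, matching degree, and attractiveness terms, respectively.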
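As a worked illustration, the generalized cost of one destination can be evaluated as a linear combination of the eight explanatory variables. The sketch assumes the linear form suggested by the text; the coefficient values below are hypothetical placeholders chosen only for the arithmetic and are not the estimates reported in Table 4.

```python
def generalized_cost(attributes, coefficients):
    """Linear-in-parameters generalized cost: sum of coefficient * attribute value."""
    return sum(coefficients[name] * attributes[name] for name in coefficients)

# Hypothetical coefficients (NOT estimated values): negative for travel impedance,
# positive for matching degree and attractiveness variables.
coefficients = {
    "in_vehicle_time": -1.0,     # per hour
    "transfer_time": -2.0,       # per hour
    "n_transfers": -0.5,
    "land_use_type": 0.3,        # dummy
    "land_use_intensity": 0.2,   # dummy
    "same_line": 0.4,            # dummy
    "correlation_degree": 0.01,
    "trip_attraction": 0.00001,  # per person time/day
}
destination = {
    "in_vehicle_time": 0.5, "transfer_time": 0.1, "n_transfers": 1,
    "land_use_type": 1, "land_use_intensity": 0, "same_line": 0,
    "correlation_degree": 60, "trip_attraction": 20000,
}
v = generalized_cost(destination, coefficients)
```

With these placeholder numbers the impedance terms sum to -1.2 and the matching and attractiveness terms to 1.1, giving a value of about -0.1.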
Finally, the results of the multicollinearity test for these variables are summarized in Table 3.
As shown in Table 3, there is a slight linear relationship between transfer time and the number of transfers. However, since transfer time is affected by the service level of the transfer station's facilities and by the metro operation plan, it usually varies among transfer stations. In other words, transfer time and the number of transfers each play an important role in the destination choice process. To evaluate the impact of both variables on destination choice behavior, both are retained in the proposed models.

Results and Analysis
Based on the data collected by the AFC system of Guangzhou Metro, the destination choice models for the New Year's days (i.e., January 1, January 2, and January 3) are estimated separately using the WESML method. The estimation results are summarized in Table 4.
According to Table 4, all of the estimated coefficients of the proposed models are consistent with their hypothesized effects on utility. The coefficients of in-vehicle travel time, transfer time, and number of transfers are all negative, which indicates that larger values of these variables make a destination less likely to be chosen. Besides, the coefficients of land use type, land use intensity, the same-line indicator, correlation degree, and trip attraction are all positive, which is consistent with common sense. In addition, the calibration results show that transfer time and correlation degree significantly affect holiday destination choice. A destination with shorter transfer time and higher correlation degree is more likely to be selected as a travel destination.
Table 4 also shows that all of the parameter estimates are statistically significant at the 95% level, since the absolute t-values of the variables exceed 1.96. Besides, the ρ² values are all greater than 0.2, which implies a good fit of the estimated models. Therefore, the estimations are reasonable.
To show the improvement in goodness of fit brought by the attractiveness of destination factors, base destination choice models, in which the correlation degree and trip attraction variables are not considered, are established for comparison. The calibration results for the base destination choice models are shown in Table 5.
As shown in Table 5, although all of the parameter estimates have the right sign and are statistically significant at the 95% level, the ρ² values of the three base models are all lower than 0.2 because attractiveness of destination is absent. This indicates that the proposed models with attractiveness of destination variables are superior to the base models in terms of goodness of fit. Compared with the base destination choice models, the ρ² values of the proposed models are improved by 0.060, 0.045, and 0.040 for January 1, January 2, and January 3, respectively, with the introduction of the attractiveness of destination variables.
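The goodness-of-fit comparison can be sketched as follows, assuming that ρ² denotes the likelihood ratio index, that is, one minus the ratio of the log-likelihood at the estimated parameters to the log-likelihood of the model with all parameters set to zero. This section does not state the formula, so this definition is an assumption; the log-likelihood values below are toy numbers and not results from Tables 4 and 5.

```python
def rho_squared(ll_model, ll_null):
    """Likelihood ratio index: 1 - L(beta_hat) / L(0)."""
    return 1.0 - ll_model / ll_null

# Toy log-likelihood values for illustration only.
ll_null = -200.0   # all coefficients zero
ll_base = -170.0   # base model without attractiveness variables
ll_full = -150.0   # proposed model with attractiveness variables

rho2_base = rho_squared(ll_base, ll_null)
rho2_full = rho_squared(ll_full, ll_null)
improvement = rho2_full - rho2_base
```

Because both models are compared against the same null log-likelihood, the difference in ρ² reflects only the gain in log-likelihood from the added attractiveness variables.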

Conclusions
The analysis of destination choice behavior of urban rail transit passengers on holidays shows that people prefer to go to tourist attractions and shopping centers for entertainment rather than to office areas, which differs clearly from the situation on weekdays. Holiday destination choice models for urban rail transit are proposed, in which route choice, accessibility, matching degree of each OD pair, and attractiveness of the destination are comprehensively taken into consideration. Based on the assumption of the representative individual, the aggregate OD flow data collected by the AFC system of Guangzhou Metro on New Year's days are transformed into disaggregate data and applied successfully to the estimation of the disaggregate model. Furthermore, the choice-based sampling method is used to establish the destination alternative set, and WESML estimation is then applied to calibrate the model parameters. The results demonstrate that the attractiveness of the destination has a clear influence on individual destination decision making: with the introduction of the attractiveness of destination variables, the ρ² values are improved by 0.060, 0.045, and 0.040 for January 1, January 2, and January 3, respectively, compared with the base destination choice models. In other words, the attractiveness of destination helps explain the destination choice process more precisely. Passengers tend to choose destinations with greater attractiveness. As mentioned before, the attractiveness of a destination is closely related to the land use type and intensity around the station. People prefer to go to tourist attractions and recreational places for entertainment on holidays. Therefore, with regard to metro operation organization, operational departments should pay close attention to the stations near tourist attractions or recreational places and take necessary measures to evacuate passenger flow on holidays. Besides, as for urban planning, the capacity of metro stations should be taken into account when planning the layout of recreational places, to avoid overcrowding at some stations.

Figure 1: OD flow of Guangzhou Metro around New Year's days.

Figure 3: Passenger flow's temporal distribution of typical destinations on January 1.

Table 1: TDD values for New Year's days.

Figure 2: Change rate of OD passenger flow compared with workdays.
In the matching degree definition, $MD_{oi}$ is the matching degree of origin station $o$ and destination station $i$; $UT_{oi}$, $UI_{oi}$, and $CL_{oi}$ represent the land use type, the land use intensity, and the same-line indicator of $o$ and $i$, respectively; $q_{oi}$ is the passenger flow between $o$ and $i$, person times/day; $TA_i$ is the trip attraction of $i$, person times/day; $N_o$ is the number of stations whose land use type is the same as that of $o$; $N_i$ is the number of stations whose land use type is the same as that of $i$; $TG_o$ is the trip generation of $o$, person times/day; and $\gamma_1$, $\gamma_2$, and $\gamma_3$ are coefficients.

Table 2: The correlation between correlation degree and trip generation.

Table 4: Estimation results of holiday destination choice models.

Table 5: Estimation results of the base destination choice models.