A Joint Modeling Analysis of Passengers' Intercity Travel Destination and Mode Choices in Yangtze River Delta Megaregion of China

Joint destination-mode travel choice models are developed for intercity long-distance travel among sixteen cities in the Yangtze River Delta Megaregion of China. The models are developed for all the trips in the sample and also separately for two trip purposes, work-related business and personal business, to accommodate different time values and attraction factors. A nested logit modeling framework is applied to model trip destination and mode choices at two levels, where the lower level is a mode choice model and the upper level is a destination choice model. The utility values of the various travel modes in the lower level are summarized into a composite utility, which is then specified in the destination choice model as an intercity impedance factor. The model is then applied to predict the change in the number of passengers from Shanghai to Yangzhou between scenarios with and without high-speed rail service, to demonstrate its applicability. The study is helpful for understanding and modeling megaregional travel destination and mode choice behaviors in the context of a developing country.


Introduction
A megaregion (also known as a megalopolis or megapolitan city) can be defined as a chain of roughly adjacent metropolitan areas with strong social and economic linkages [1]. Megaregions are rising around the world. America 2050 [2], a program of the Regional Plan Association, has identified eleven megaregions in North America. In Europe, besides some well-known megaregions within a single country (e.g., the London and Paris metropolitan areas), there are even transnational megaregions, such as the Blue Banana, being formed across countries [3][4][5]. Several megaregions are also being developed on other continents, such as the Sydney Region in Australia, the Tokyo urban agglomeration in Asia, the megacity of Cairo in Africa, and the megacity of Sao Paulo in South America [6].
With rapid urbanization in China, ten major megaregions (also called urban agglomerations) are being formed according to research supported by the National Development and Reform Commission [7,8]. Those regions cover less than 10% of the land but accommodate more than one-third of the population and create more than half of the GDP (Gross Domestic Product) of the entire country. Several other megaregions are also being formed and developed, which will allow more Chinese people to live in urban areas and enjoy urban life.
The formation and development of a megaregion largely depend on its intercity transportation system, which needs to be efficient and convenient enough to support highly frequent and intense movements of passengers and freight between cities in the region. Among the megaregions in China, Yangtze River Delta Megaregion is the largest in terms of population (about 110 million) and economic size (annual GDP of nearly 10 trillion RMB Yuan). As per "the National Urban System Plan for 2005-2020" compiled by the Ministry of Construction in 2006, this megaregion consists of 16 cities, including its central city Shanghai, the highly developed city Suzhou, Jiangsu Province's capital Nanjing, and Zhejiang Province's capital Hangzhou (see Section 4, Figure 1 and Table 1 for detailed information). Many of those cities are connected not only by expressways but also by high-speed rail lines on which trains can operate at a speed of 180 mph. For example, it takes only 1 hour by high-speed train to travel between Shanghai and Hangzhou, which are more than 100 miles apart. Such a multimodal intercity transportation system can greatly strengthen connections among cities and thereby improve their mutual collaboration and enhance the competitiveness of the entire megaregion.
This megaregion is therefore a suitable example for investigating how a multimodal intercity transportation system aids in forming and developing a megaregion by strengthening connections among its cities. To do so, it is critical to understand the joint destination and mode choice behaviors of intercity travelers within the entire megaregion. This kind of study can provide valuable references for future transportation planning and development in China and the rest of the world, where new megaregions are gradually being formed and developed.
The remainder of the paper is organized as follows. Relevant literature is first reviewed in Section 2, while the modeling methodology is detailed in Section 3. Section 4 introduces the procedure used to collect the data for model development and describes the collected data. Model estimation and simulation results are discussed in Sections 5 and 6, respectively. Finally, conclusions and discussions are presented in Section 7.

Literature Review
Many megaregional areas appear with the development of transportation systems that closely connect adjacent cities. In this context, intercity passenger travel is of growing concern to researchers. Research on passengers' intercity travel is mainly focused on travel mode choice and destination choice models.
Some researchers conducted travel surveys to observe the characteristics of passengers' intercity travel and their preferences in travel choices [9,10]. Some studies show that trip purpose affects mode choice [11] and that market segmentation is important for model development [12,13]. Thus, intercity mode choice models are usually developed by trip purpose. In the literature, two trip purposes, business and tourism, are usually differentiated for mode choice model development. Dong et al. developed a fractional multinomial logit model to analyze mode choice of intercity business trips in Yangtze River Megaregion [14,15]. Manssour et al. developed a binary logit model for mode choice of intercity business trips in Libya [16]. It is also found that socioeconomic attributes and land use are significant factors influencing mode choice [11,17,18]. Thrane examined tourists' long-distance travel mode choices by developing a multinomial logit model [18]. Cohen and Harris analyzed mode choice of trips visiting friends and relatives [19].
Meanwhile, research on intercity trip destination choice is mainly conducted for tourists' intercity travel, where mode choice is also considered in the destination choice model [20]. Destination choice models are also developed for intracity trips. When a destination choice model is built for intracity trips, mode choice is usually incorporated as an important explanatory variable as well [21,22].
As Koppelman indicated in 1989, intercity travel decisions are interrelated and cannot be dealt with separately; a joint model is therefore needed to better address the interrelations among multiple travel decisions [23]. A joint model for travel destination and mode choice is not a new modeling technique. As early as 1981, Southworth developed such a joint model for interrelated mode and destination choices [24]. Afterwards, many joint models were developed for travel within an urban area [25,26]. However, few studies have developed a joint mode-destination choice model for a megaregional area, possibly due to the lack of survey data on megaregional intercity long-distance travel. Yao and Morikawa developed a nested structure of integrated intercity travel demand model for induced demand and applied the model to evaluate intercity transport projects in Japan [27]. The intercity trips were classified into business and nonbusiness trips. Their study is focused on an intercity corridor along six megaregions, rather than inside a megaregion (the mean distance of trips in the Revealed Preference and Stated Preference surveys is larger than 300 km, including air trips). However, most intercity trips within a megaregion are shorter than 300 km, and therefore the travel modes generally do not include airplane. Thus, there is a lack of papers in the literature on joint destination-mode choice models for intercity trips within a megaregion.
Besides, most existing studies are conducted in the context of developed countries, and it is rare to see similar work in the context of a developing country like China. Due to economic and cultural differences between developing and developed countries, megaregional travel mode choice behaviors are quite different. In the US, for example, the auto mode dominates the market, whereas in this megaregion of China the rail and bus modes take more than 40%. In particular, the development of high-speed rail plays an important role in forming megaregions in a developing country like China by facilitating intercity transportation and collaboration. Thus, it is necessary to develop an empirical model specifically for a megaregion in a developing country.
To fill this gap, the authors attempt to develop such a joint destination and mode choice model for intercity trips based on an intercept travel survey in Yangtze River Delta Megaregion. In this paper, the joint destination-mode choice model is developed for two purposes, work-related business and personal business, in consideration of the available sample size and experience from the literature. Given the economic development and the diversity of travel demand in the developing country, some new attraction variables are used to measure the attractiveness of a city for intercity trips within a megaregion. It is also informative to apply the joint model in practice to evaluate how investment in a multimodal transportation system helps to build a strong connection between cities in a megaregion. For this purpose, the authors also estimate a practical model without personal characteristics for all the trips in the sample. In a scenario analysis, the practical model is applied to predict the change in the daily number of passengers from Shanghai to Yangzhou between the base scenario and a future scenario with high-speed rail service that can substantially shorten the intercity rail travel time. The aim of the paper is to contribute to the literature on modeling megaregional travel destination and mode choice behaviors in the context of a developing country.

Modeling Methodology
A nested logit discrete choice modeling method is employed to quantify the joint destination and mode choice probabilities for intercity trips within the megaregion. The benefit of developing a joint model is that it better handles a multimodal transportation system consisting of alternative travel modes such as auto, rail, and bus. Figure 2 shows the nesting structure of the joint destination-mode choice model, where the mode choice is placed at the lower level and the destination city choice at the upper level. A travel mode may not be available between a pair of cities, like the pair between the origin city and city 15 in Figure 2. In that case, the exponential term of the corresponding utility, as in (3) below, is not added into the composite utility for model estimation.
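The handling of unavailable modes described above can be sketched in a few lines. This is only an illustrative sketch: the function name `composite_utility`, the use of `None` to mark an unavailable mode, and all utility values are assumptions made for the example, not part of the estimated model.

```python
import math

def composite_utility(mode_utilities):
    """Log-sum over the modes available between one pair of cities.

    mode_utilities maps a mode name to its systematic utility, or to None
    when the mode is not available; unavailable modes contribute no
    exponential term to the sum.
    """
    terms = [math.exp(v) for v in mode_utilities.values() if v is not None]
    if not terms:
        raise ValueError("no travel mode is available for this city pair")
    return math.log(sum(terms))

# Hypothetical utilities for one traveler and one city pair (illustration only).
with_hsr = {"auto": -1.2, "hsr": -0.8, "rail": -2.5, "bus": -2.0}
without_hsr = dict(with_hsr, hsr=None)  # same pair, high-speed rail unavailable

cu_with = composite_utility(with_hsr)
cu_without = composite_utility(without_hsr)
```

Dropping a mode can only lower the composite utility, which is how the absence of a service makes a destination less accessible in the upper-level model.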
The two-level nested logit model is estimated in a standard two-step process [28]. The first step is to estimate the lower-level multinomial logit model for mode choice. The level-of-service attributes of alternative travel modes are specified into utility functions as generic variables, while travelers' demographic and socioeconomic attributes are specified as alternative-specific variables. The systematic component of the utility function is formulated as

$$V_{nm} = \boldsymbol{\beta}^{\top}\mathbf{X}_{m} + \boldsymbol{\gamma}_{m}^{\top}\mathbf{S}_{n}. \tag{1}$$

In (1), $n$ is an index for person and $m$ is an index for travel mode. $\mathbf{X}_{m}$ represents a vector of travel cost variables for mode $m$, associated with a vector of coefficients $\boldsymbol{\beta}$. $\mathbf{S}_{n}$ represents a vector of attributes for person $n$, associated with a vector of coefficients $\boldsymbol{\gamma}_{m}$. In a multinomial logit model, the probability of person $n$ to choose mode $m$ is given as

$$P_{n}(m) = \frac{\exp(V_{nm})}{\sum_{m'} \exp(V_{nm'})}. \tag{2}$$

Then, the maximum likelihood estimation method can be applied to estimate the coefficients in vectors $\boldsymbol{\beta}$ and $\boldsymbol{\gamma}_{m}$. After the lower-level mode choice model is estimated, composite utilities (CU, also called inclusive values) between each pair of cities for each traveler $n$ are calculated by a "log-sum" formula:

$$CU_{nij} = \ln \sum_{m} \exp(V_{nijm}), \tag{3}$$

where $i$ and $j$ represent indices of origin and destination cities, $V_{nijm}$ is the utility in (1) evaluated with the travel costs of mode $m$ between cities $i$ and $j$, and the sum is taken over the modes available between the two cities. Then, the systematic component of the utility function for the upper-level destination choice model can be formulated as

$$V_{nij} = \boldsymbol{\alpha}^{\top}\mathbf{A}_{j} + \theta\, CU_{nij}. \tag{4}$$

In (4), $\mathbf{A}_{j}$ is a vector of attraction variables for the destination city $j$, associated with a vector of coefficients $\boldsymbol{\alpha}$. $\theta$ is a nesting coefficient that links the lower-level mode choice model with the upper-level destination choice model. A reasonable value for $\theta$ should fall into the interval between 0 and 1. Then, the probability of person $n$ in origin city $i$ to choose destination city $j$ can be formulated as another multinomial logit model:

$$P_{ni}(j) = \frac{\exp(V_{nij})}{\sum_{j'} \exp(V_{nij'})}. \tag{5}$$

The maximum likelihood estimation method needs to be applied one more time to estimate the coefficients in the vector $\boldsymbol{\alpha}$ as well as the nesting coefficient $\theta$. Note that since the data are collected from long-distance transportation terminals (e.g., rail stations, intercity bus stations and service centers, rest areas of highways, etc.), a choice-based sampling method is essentially used to collect travel data. For consistent model estimation results, a weighted exogenous sample maximum likelihood (WESML) method is applied to estimate model coefficients [29]. The weights are calculated based on the ratio between the sample market shares and the market shares from an official report based on a large-scale travel survey conducted in the region [30], as shown in Table 2.

The survey questionnaire was designed to collect information on each traveler's most recent megaregional intercity trip, including origin and destination, purpose, and mode, as well as travelers' attitudes to intercity travel and their demographic and socioeconomic characteristics. A total of 1632 completed surveys were collected. In the data cleaning process, surveys that were highly incomplete, apparently frivolous, or had major inconsistencies were excluded. A final dataset of 1247 valid cases was obtained for analysis, for an effective rate of about 76%. Of these respondents, 434 chose the trip purpose of work-related business and 813 chose the trip purpose of personal business. Unfortunately, many respondents did not provide complete addresses for their trip origin and destination, possibly due to privacy or security concerns, so those locations cannot be identified.
The models are first developed for two market segments defined by trip purpose, work-related business and personal business, to reflect the different values in choosing intercity travel destination and mode. Then, based on the entire trip sample, a model with no traveler's attributes is also estimated for simulation purposes. Tables 3 and 4 provide household and person characteristics of the samples by market segment (i.e., work-related business and personal business). Personal business includes sightseeing, visiting friends or relatives, seeing doctors, and shopping. The household size distribution implies a sample mean of about 3.2. More than 40% of households own a private vehicle, which is a reasonable number for this megaregion. Although the ratio in the central city Shanghai is less than 40% due to its strict license plate policy, many other big cities in the region, like Hangzhou and Nanjing, have a ratio much higher than 40%.
As for person characteristics of the samples, more males show up in the sample of work-related business trips, which is to be expected. Most intercity travelers for personal business fall into the age group of 18-29 years, possibly because those persons have just become independent but are less constrained by household obligations. Most intercity travelers for work-related business are between 18 and 39 years old, which coincides with the fact that people in this age group are energetic and therefore undertake more work-related business trips. Meanwhile, 69.9% of work-related travelers have a driver license, which allows for renting a car to drive during a business trip. Most work-related travelers fall into the personal annual income category of 50,000 to 100,000 Yuan, but most travelers for personal business fall into the low income category of "less than 30,000 Yuan." The two samples show similar distributions in education level.

Intercity Skim Matrices.
For joint destination-mode choice model development, intercity skim matrices need to be generated to provide travel costs between each pair of cities. It is challenging to collect the entire multimodal transportation network in such a big megaregion and then create skim matrices using shortest-path algorithms. Thanks to the Internet Age, there are multiple sources from which such skim matrices can be created without collecting the entire network data.
One source is the website of Baidu (http://www.baidu.com/), which offers web-based services such as a search engine for websites, audio, and images as well as map search and navigation (http://maps.baidu.com/). Its map service provides an Application Programming Interface (API, see http://lbsyun.baidu.com/) that allows users to write JavaScript code to automatically geocode a batch of addresses and calculate the best driving route and time between each pair of points on the map. For this study, highway skim matrices were automatically generated by using the Baidu Map API.
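The batch generation of a highway skim matrix can be sketched as a loop over ordered city pairs. The sketch below does not call the Baidu Map API: `travel_time` is a hypothetical callable standing in for whichever routing service is queried, and the city names and minutes in the stub are made-up illustration values.

```python
from itertools import permutations

def build_skim_matrix(cities, travel_time):
    """Return {(origin, destination): minutes} for every ordered city pair.

    travel_time is any callable taking (origin, destination); in practice it
    would wrap a routing service, which is not shown here.
    """
    return {(o, d): travel_time(o, d) for o, d in permutations(cities, 2)}

# Stub lookup with invented driving times in minutes (illustration only).
_stub_minutes = {frozenset(("A", "B")): 120, frozenset(("A", "C")): 90,
                 frozenset(("B", "C")): 150}

def stub_travel_time(origin, destination):
    return _stub_minutes[frozenset((origin, destination))]

skim = build_skim_matrix(["A", "B", "C"], stub_travel_time)
```

With 16 cities the same loop produces 16 × 15 = 240 ordered pairs, which is small enough to query one pair at a time.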
Unfortunately, a function for generating intercity transit costs is not incorporated in the API, but these costs can be obtained by searching other official websites (e.g., http://www.12306.cn/mormhweb/, http://chezhan.12308.com/). A researcher in the team searched those websites and manually collected the skim matrices for high-speed rail, regular-speed rail, and intercity bus within the megaregion. According to an official report based on a large-scale travel survey (Shanghai Urban/Rural Development and Transportation Committees, 2010), the air mode carries less than 1% of intercity passenger trips in this region and is therefore ignored in the mode choice models.

Table 5 provides all the model estimation results for work-related business trips, personal business trips, and all trips. In the mode choice models, four travel modes (i.e., auto, high-speed rail, regular-speed rail, and intercity bus) are modeled for intercity trips within the megaregion. The bus is considered the base alternative, and its alternative-specific constant is therefore fixed at 0.
The alternative-specific constant of the auto mode is less negative than those of high-speed and regular rail, indicating that auto is generally preferred over rail for work-related business trips. The travel times by the various intercity travel modes are specified as a generic variable, which takes the same coefficient of −0.0114 for all modes. This coefficient is somewhat less negative than that of intracity work or work-related trips [31]. It is possible that intercity travelers in developing countries tend to budget more time on long-distance travel.

Table 6 shows that the log-likelihood ratio index at constant is 0.123, which is a reasonable number for intercity business trips. The lower part of the table provides the estimation results of the upper-level destination choice model. The nesting coefficient for the composite utility from the lower-level mode choice model is 0.4286, falling into the reasonable range between 0 and 1. This value indicates a high correlation among travel modes between each pair of cities. Four attraction variables at the city level, including GDP per capita, city area, service employment ratio, and city population density, are specified in the utility function. All of them take significantly positive coefficients and therefore serve as attraction factors. GDP per capita is a measure of economic development in a city. A well-developed city with higher GDP per capita is expected to attract more work-related business trips. City population density is a measure of the concentration of residents, and a city with higher population density probably generates more business opportunities. Service employment ratio is a good indicator of the development of the service industry, which should play a positive role in attracting work-related business trips. City area serves as a control variable for the size of a city. As per Table 6, the log-likelihood ratio index, an overall goodness-of-fit measure of the model, is 0.237, which is quite good for a destination choice model.

The Model for Personal Business Trips.
The middle part of Table 5 provides the model estimation results for personal business trips. Similar to the model for work-related trips, the alternative-specific constant of auto is less negative than those of the rail modes, indicating that auto is a preferred alternative. The generic variable of travel time by the different modes takes a negative coefficient. Only bus fare appears significant in the bus utility function, while rail fares do not appear significant and are therefore excluded from the final model. The value of time can be estimated at about 39 Yuan/Hour (=0.0097/0.0150 × 60), which is a bit higher than that of work-related trips. It is reasonable that travelers value their time for personal trips more highly than for work-related trips. Two alternative-specific dummy variables, private car ownership and being licensed, take similarly positive coefficients in the auto utility function. Household annual income takes a negative coefficient in the bus utility function, indicating that high-income travelers are less likely to ride an intercity bus for personal business trips. The log-likelihood ratio index is 0.136, which is comparable to that of the model for work-related trips.
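The value-of-time figures quoted for the two trip purposes follow from the ratio of the travel time coefficient to the bus fare coefficient. A minimal sketch of that arithmetic is given below; the function name is an assumption, the coefficient magnitudes are those quoted in the text, and the factor of 60 assumes that travel time enters the utility in minutes and fare in Yuan, as the quoted formulas imply.

```python
def value_of_time(time_coef, fare_coef, minutes_per_hour=60):
    """Value of travel time in Yuan per hour.

    time_coef: magnitude of the travel time coefficient (per minute).
    fare_coef: magnitude of the fare coefficient (per Yuan).
    """
    return time_coef / fare_coef * minutes_per_hour

# Coefficient magnitudes as quoted in the text for each trip purpose.
vot_work = value_of_time(0.0114, 0.0201)      # work-related business trips
vot_personal = value_of_time(0.0097, 0.0150)  # personal business trips
```

Rounded to whole Yuan, the two results match the 34 and 39 Yuan/Hour reported in the text.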
As for the upper-level destination choice model for personal trips, the nesting coefficient for the composite utility takes the value of 0.8969, falling into the reasonable interval between 0 and 1. However, unlike in the work-related trip model, this value is close to 1, indicating a low correlation among travel modes between each pair of cities. It means that mode and destination choices are almost independent for personal business trips, which is not the case for work-related trips. A plausible explanation is that the mode and destination of a work-related business trip are usually chosen by the traveler's employer, who may be more concerned about the available travel modes while choosing the destination of the trip.
Similar to the model for work-related trips, service employment ratio, city area, and city population density are specified as city attraction variables. The only difference is the use of the number of "5A" tourist attractions instead of GDP per capita, since personal business trips include sightseeing trips. This variable takes a positive coefficient, indicating that the top-level scenic spots of a city attract more intercity personal trips. Since personal business trips include trips for seeing doctors and shopping, service employment plays a positive role in attracting personal trips for those purposes. Visiting trips should be positively associated with population, which explains why population density takes a positive coefficient and becomes an attraction variable. Again, city area serves as a control variable for city size. The log-likelihood ratio index is 0.073, lower than that of the work-related trip model. This is probably because more heterogeneity exists among travelers making personal trips, which lowers the overall goodness-of-fit of the destination choice model.

Model Simulation Results
In this megaregion, Yangzhou is a famous city with a long history, located at the crossing of the Yangtze River and the Grand Canal of China (the longest canal in the world). Since there is no high-speed rail service between Yangzhou and Shanghai, it takes about 6 hours to travel between these two cities by regular train. The backward intercity rail system will hinder the further economic development of Yangzhou. The Yangzhou government has realized the importance of an efficient multimodal intercity transportation system and has proposed a plan to build a high-speed rail line to connect the city with Shanghai, the central city of the megaregion. It is expected that the in-vehicle rail travel time will be shortened to 2 hours once the high-speed rail system is in operation. It is informative to apply the developed model to predict the change in intercity passenger travel demand from Shanghai to Yangzhou, both to evaluate the infrastructure investment and to demonstrate the applicability of the model.

For model simulation, two scenarios, called "Before" and "After," are built, as shown in Table 7. In the "Before" scenario, high-speed rail service is not available and it takes 360 minutes to travel between Shanghai and Yangzhou by regular rail. In the "After" scenario, a high-speed rail system is built and the in-vehicle travel time is reduced to 120 minutes, while the level-of-service attributes of the other modes remain the same as in the "Before" scenario. The advantage of a joint destination-mode choice model is that it allows for jointly predicting the travel mode shift and the destination change. Since the addition of the new high-speed rail service increases the composite utility between these two cities in the destination choice model, the total number of trips between them is expected to increase. The total number of passenger trips originating from Shanghai to other cities in the megaregion is 745 thousand per day, which can be found in the official report based on a large-scale travel survey in 2010 [27]. The joint destination-mode choice model is applied to distribute this total number of trips to each city in the region and then to each available travel mode.
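The distribution of the daily total first over destinations and then over modes can be sketched as follows. This is a minimal sketch under stated assumptions: the helper names, the two-destination setup ("Elsewhere" stands in for all other cities), the nesting coefficient of 0.6, and every utility and attraction value are hypothetical and are not the estimates of Table 5; only the total of 745 thousand trips comes from the text.

```python
import math

def logsum(utilities):
    """Composite utility: log of the summed exponentiated mode utilities."""
    top = max(utilities.values())
    return top + math.log(sum(math.exp(v - top) for v in utilities.values()))

def logit_shares(utilities):
    """Multinomial logit shares for a dict of systematic utilities."""
    denom = logsum(utilities)
    return {k: math.exp(v - denom) for k, v in utilities.items()}

def distribute_trips(total_trips, mode_utils, attraction, theta):
    """Split total_trips over destinations, then over each destination's modes."""
    dest_utils = {d: attraction[d] + theta * logsum(mode_utils[d])
                  for d in mode_utils}
    dest_shares = logit_shares(dest_utils)
    return {d: {m: total_trips * dest_shares[d] * share
                for m, share in logit_shares(mode_utils[d]).items()}
            for d in mode_utils}

TOTAL = 745_000   # daily trips from Shanghai, as quoted in the text
THETA = 0.6       # hypothetical nesting coefficient
attraction = {"Yangzhou": 0.0, "Elsewhere": 3.0}   # hypothetical
before = {   # hypothetical mode utilities; unavailable modes are simply omitted
    "Yangzhou": {"auto": -2.0, "rail": -5.0, "bus": -3.0},
    "Elsewhere": {"auto": -1.0, "hsr": -1.0, "rail": -2.5, "bus": -2.0},
}
after = {**before, "Yangzhou": {**before["Yangzhou"], "hsr": -2.2}}

trips_before = distribute_trips(TOTAL, before, attraction, THETA)
trips_after = distribute_trips(TOTAL, after, attraction, THETA)
```

With a nesting coefficient below 1, adding a mode raises the destination's total trips while lowering the trips on each of its existing modes, which is the pattern of destination change plus mode shift that the joint model is meant to capture.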
The simulation results for the two scenarios and their differences are listed in the lower part of Table 7. As shown, the total number of passengers will increase by more than 18% after the high-speed rail is in service. Further simulation results for mode shift show that 2,817 passengers per day can be expected to ride high-speed trains from Shanghai to Yangzhou. Among those, 2,236 passengers (79.4%) change their trip destination from other cities to Yangzhou, while 513 passengers (18.2%) switch to high-speed rail from the auto mode, 9 (0.3%) from the regular rail mode, and 59 (2%) from the bus mode. Reductions of nearly 5% are observed in the auto, regular rail, and bus passenger numbers from Shanghai to Yangzhou between the two scenarios. The model simulation results strongly support the high-speed rail plan proposed by the Yangzhou government for greatly improving its connection to the central city Shanghai and enhancing its competitive power within the megaregion. The results also demonstrate that the developed model can be applied to evaluate a real-world project based on reasonable simulation results.
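The breakdown of high-speed rail riders by source can be checked with a short calculation. The passenger counts below are those quoted in the text; the dictionary keys are labels chosen for this sketch.

```python
# Daily high-speed rail passengers from Shanghai to Yangzhou by source (from the text).
hsr_sources = {
    "destination change": 2236,
    "from auto": 513,
    "from regular rail": 9,
    "from bus": 59,
}

total_hsr = sum(hsr_sources.values())
share_percent = {source: round(100 * count / total_hsr, 1)
                 for source, count in hsr_sources.items()}
```

The four sources add up to the 2,817 daily riders, and about four in five of them are new trips to Yangzhou rather than shifts from other modes.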

Conclusions and Discussions
In this paper, joint destination-mode choice models are developed for intercity passenger trips within Yangtze River Delta Megaregion, the largest megaregion of China. The models are developed for two different trip purposes (i.e., work-related business and personal business) as well as for all trip purposes. A nested logit modeling framework is applied to model trip destination and mode choices at two levels, where the lower level is a mode choice model and the upper level is a destination choice model. The utility values of the travel modes in the lower level are summarized by a "log-sum" formula to form a composite utility. The composite utility is then specified in the destination choice model as an intercity impedance factor, so that a destination choice is sensitive to a level-of-service change in an existing travel mode or to the addition of a new travel mode. Travel times by auto, high-speed rail, regular rail, and intercity bus, the bus fare, and travelers' demographic and socioeconomic characteristics are found to be significant variables in the mode choice models for the two trip purposes. In the destination choice models, significant city attraction variables include population density, GDP per capita, the ratio of employment in the service sector, the number of "5A" tourist attractions, and city area.
A practical model for all trip purposes is estimated and applied to predict changes in passenger numbers by travel mode from Shanghai to Yangzhou between the scenarios with and without high-speed rail service. It is found that the total number of passengers from Shanghai to Yangzhou will increase by 18.7% after the high-speed rail is in service. This simulation result aids in evaluating how the huge investment in the high-speed rail project improves Yangzhou's connection to the central city of Shanghai and enhances its competitive power against other cities in the megaregion.
Finally, two limitations of this study need to be discussed. First, the exact origin and destination locations of intercity trips are not identified in the survey due to a significant amount of missing data. As a result, travel time or distance for access to or egress from terminal stations cannot be specified in the model. Second, some level-of-service attributes, like rail fare, do not appear significant in the model. This may be caused by the low variation of rail fare in the sample. To better estimate the coefficient of rail fare, some stated preference (SP) questions may be added to the survey to increase the variance of those variables in the sample. These are useful lessons learned from this study, and they motivate future research to improve the survey design for gaining more knowledge on megaregional intercity travel.

Figure 2: The nesting structure of the joint destination-mode choice model.

Table 1: Characteristics of 16 cities in Yangtze River Delta Megaregion. Taizhou 1 and Taizhou 2 are actually two different cities with almost the same pronunciation but different names written in Chinese.

Table 2: Weights based on market shares from sample and large survey report by trip purpose.

4.1. City Attraction Variables. Table 1 provides characteristics of the 16 cities in Yangtze River Delta Megaregion, including area, population, GDP, the number of employees in three different sectors (denoted as "Emp1," "Emp2," and "Emp3"), and the number of "5A" tourist attractions ("5A-TA"). As shown, the central city Shanghai has the greatest population and economic size in this region. In China, employment is usually classified into three sectors. The first sector includes agriculture, forestry, fishery, and animal husbandry; the second sector includes manufacturing, mining, and construction; and the third sector is the service sector, including commerce, finance, insurance, transportation, communication, and education. In this region,

4.2. Travel Survey. The data for model development were collected from intercept travel surveys at multiple long-distance transportation terminals in those cities, including intercity rail stations, intercity bus stations, service centers, and rest areas of expressways. About fifty students at Tongji University were recruited to distribute questionnaires and conduct on-site surveys during April and June of 2012.

Table 3: Household characteristics by trip purpose.

Table 4: Person characteristics by trip purpose.

Table 5: Model estimation results by trip purpose.

Table 6: Model statistics by trip purpose.

less sensitive to intercity travel time. Transit fares, including both rail and bus fares, are specified into the model as alternative-specific variables. It is found that bus fare appears significant in the bus utility function, but rail fares do not appear significant and are excluded from the final model. The coefficients of fare and travel time in the bus utility function allow for estimating the value of travel time at about 34 Yuan/Hour (=0.0114/0.0201 × 60). In addition, private auto ownership and being licensed are specified as two alternative-specific dummy variables in the auto utility function. Both of them take highly positive coefficients in the model, as expected. The female dummy variable takes a negative coefficient in the auto utility function, indicating that females are less likely to use auto for work-related business trips. In China, females usually do not favor driving as much as males. Household annual income is originally a categorical variable but is converted into a continuous variable and specified into the model. It takes a negative coefficient in the bus utility function, indicating that high-income travelers are less likely to use bus.

Table 5 provides the model estimation results for all the trips without traveler's attributes. The reason for estimating this model is to facilitate the model simulation procedure. Otherwise, one needs to know city-level trip frequency by trip purpose and city-level traveler's attributes, but those data are not available to researchers. In the mode choice model, the values of the alternative-specific constants change greatly due to the exclusion of some explanatory variables, but those constants still indicate that auto is preferred over the rail modes. The other coefficients in the mode choice model are close to those in the models by trip purpose. In the destination choice model, all the coefficient values fall between those in the models by trip purpose, probably because the model is estimated on a pooled sample consisting of trips for work-related business, personal business, and other purposes. This practical model is then applied for scenario analysis in the next section.

Table 7: Simulation results of passenger trips between Shanghai and Yangzhou.