The stated preference experimental design affects the reliability of parameter estimation in discrete choice models. Scholars have proposed new experimental designs, such as the D-efficient and Bayesian D-efficient designs, but there has been insufficient empirical research on their effectiveness and little comparative analysis against traditional designs. In this paper, a new metro line connecting Chengdu and one of its satellite cities is taken as the research subject to demonstrate the validity of the D-efficient and Bayesian D-efficient designs. These new designs are compared with the orthogonal design in terms of model fit and the standard deviations of the parameter estimates; the best model result is then used to analyze travel choice behavior. The results indicate that the Bayesian D-efficient design works better than the D-efficient design. Some variables, including cost, in-vehicle time, and arrival time, significantly affect choice behavior. The D-efficient and Bayesian D-efficient designs generated for the multinomial logit (MNL) model can yield reliable results in the mixed logit (ML) model, but their theoretical advantages are not fully realized there. Finally, the metro is predicted to carry over 40% of passenger flow once it begins operation.
Stated preference surveys have become the primary way of acquiring data on travelers’ preferences regarding different transport services. In the survey, respondents choose their preferred travel mode in a set of hypothetical scenarios. The construction of these scenarios depends on the experimental design. Thus, the stated preference questionnaire is based on the experimental design method [
Existing experimental investigations of stated preference experimental designs have some limitations. In these works, the predominantly used model is the multinomial logit (MNL) model [
Therefore, in this paper, the new metro line between Chengdu and one of its satellite cities is taken as the research subject. From an empirical point of view, the new designs are compared with the orthogonal design in terms of model parameter estimation, and the validity of the D-efficient and Bayesian D-efficient designs is verified under different sample sizes. Using the survey data, we also examine whether these designs, which are generated specifically for the MNL model, work well in the ML model.
As a traditional stated preference experimental design, the orthogonal design has been widely used in discrete choice models. It reduces the number of combinations that must be presented. By selecting mutually orthogonal combinations, it avoids the parameter estimation errors caused by correlation among the attributes [
The theoretical basis of the D-efficient design is to minimize the determinant of the asymptotic variance-covariance (AVC) matrix of the model. In essence, the design seeks the smallest possible standard deviations of the estimated model parameters, and hence more reliable parameter estimates. It explicitly accounts for the importance of the alternative attributes so that the combinations present more trade-off information in the choice process, with the goal of maximizing the preference information obtained from respondents [
According to this theoretical basis, the D-efficient design first requires a prior AVC matrix or prior parameter estimates; the experimental combinations are then determined iteratively by minimizing the D-error. Usually, a previously estimated model AVC matrix or set of parameter estimates serves as the prior information. How to obtain this prior information before the stated preference questionnaire is designed is an issue that must be resolved when adopting this method. A common solution is a pilot survey: the pilot data are used to estimate the model parameters, and the estimates are used as the prior information for the questionnaire design. The mathematical derivation of the D-efficient design, based on related theory, is as follows:
The AVC matrix is the negative inverse of the Hessian matrix of the log-likelihood function of the discrete choice model, as shown below:
The specific form of the Hessian matrix is given as follows:
The log-likelihood function of the discrete choice model is shown below:
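For reference, the three expressions described above take the following standard form for the MNL model with generic parameters. The notation is an assumption introduced here and may differ from the authors' own: respondents n = 1, ..., N, choice situations s = 1, ..., S, alternatives j = 1, ..., J, attributes k = 1, ..., K, attribute levels x, choice indicators y, and MNL choice probabilities P. The sign is chosen so that the AVC matrix is positive definite.

```latex
\begin{align}
\Omega(\beta) &= -\left[\frac{\partial^{2} LL(\beta)}{\partial \beta \, \partial \beta^{\prime}}\right]^{-1}, \\
\frac{\partial^{2} LL(\beta)}{\partial \beta_{k_1} \partial \beta_{k_2}}
  &= -\sum_{n=1}^{N}\sum_{s=1}^{S}\sum_{j=1}^{J}
     x_{j k_1 s n}\, P_{jsn}\left(x_{j k_2 s n} - \sum_{i=1}^{J} x_{i k_2 s n}\, P_{isn}\right), \\
LL(\beta) &= \sum_{n=1}^{N}\sum_{s=1}^{S}\sum_{j=1}^{J} y_{jsn} \ln P_{jsn},
  \qquad
  P_{jsn} = \frac{\exp\left(\sum_{k} \beta_{k} x_{jksn}\right)}
                 {\sum_{i=1}^{J}\exp\left(\sum_{k} \beta_{k} x_{iksn}\right)}, \\
D\text{-error} &= \det\left(\Omega(\beta)\right)^{1/K}.
\end{align}
```

The Hessian does not depend on the observed choices y, which is why it can be evaluated from the attribute levels and prior parameters alone, before any data are collected.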
It can be observed that the D-efficient design is effectively the inverse of the parameter estimation process in the discrete choice model: estimation takes the attribute levels as given and solves for the parameters, whereas the design takes prior parameters as given and solves for the attribute levels. The theoretical bases of the Bayesian D-efficient and D-efficient designs are identical. The only difference is that, in the Bayesian D-efficient design process, the prior values of the attribute parameters obey a certain distribution [
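The difference between the fixed-prior and Bayesian criteria can be illustrated with a minimal sketch. It assumes an MNL model with generic parameters, a single respondent, and a tiny hypothetical two-alternative design (bus versus metro, in-vehicle time and cost only) whose levels are borrowed from the attribute-level table. The prior means are the pilot estimates for in-vehicle time and cost; the prior standard deviations and all function names are illustrative assumptions, not values or code from the paper.

```python
import math
import random

def mnl_probs(situation, beta):
    """MNL probabilities for one choice situation (one attribute row per alternative)."""
    utils = [sum(b * x for b, x in zip(beta, row)) for row in situation]
    top = max(utils)  # subtract the maximum for numerical stability
    expu = [math.exp(u - top) for u in utils]
    total = sum(expu)
    return [e / total for e in expu]

def information_matrix(design, beta):
    """Negative Hessian of the MNL log-likelihood for one respondent."""
    k = len(beta)
    info = [[0.0] * k for _ in range(k)]
    for situation in design:
        probs = mnl_probs(situation, beta)
        mean = [sum(p * row[a] for p, row in zip(probs, situation)) for a in range(k)]
        for p, row in zip(probs, situation):
            dev = [row[a] - mean[a] for a in range(k)]
            for a in range(k):
                for b in range(k):
                    info[a][b] += p * dev[a] * dev[b]
    return info

def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    m = [row[:] for row in matrix]
    det = 1.0
    for c in range(len(m)):
        pivot = max(range(c, len(m)), key=lambda r: abs(m[r][c]))
        if m[pivot][c] == 0.0:
            return 0.0
        if pivot != c:
            m[c], m[pivot] = m[pivot], m[c]
            det = -det
        det *= m[c][c]
        for r in range(c + 1, len(m)):
            factor = m[r][c] / m[c][c]
            for j in range(c, len(m)):
                m[r][j] -= factor * m[c][j]
    return det

def d_error(design, beta):
    """D-error = det(AVC)^(1/K); det(AVC) is 1 / det(information matrix)."""
    det_info = determinant(information_matrix(design, beta))
    if det_info <= 0.0:
        raise ValueError("design does not identify all parameters")
    return det_info ** (-1.0 / len(beta))

def bayesian_d_error(design, means, sds, draws=200, seed=1):
    """Average D-error over normal draws of the prior parameters."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(draws):
        beta = [rng.gauss(m, s) for m, s in zip(means, sds)]
        total += d_error(design, beta)
    return total / draws

# Hypothetical design: each situation lists [in-vehicle time, cost] for bus, then metro.
DESIGN = [
    [[60, 2], [30, 4]],
    [[90, 4], [40, 6]],
    [[120, 2], [30, 6]],
    [[60, 4], [40, 4]],
]
PRIOR = [-0.036, -0.038]   # pilot estimates: in-vehicle time, cost
PRIOR_SD = [0.01, 0.01]    # assumed for illustration only

print("D-error (fixed prior):   ", d_error(DESIGN, PRIOR))
print("Bayesian D-error (draws):", bayesian_d_error(DESIGN, PRIOR, PRIOR_SD))
```

A design search would evaluate one of these two criteria for many candidate designs and keep the one with the smallest value; with zero prior standard deviations the Bayesian criterion reduces to the fixed-prior D-error.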
In this paper, the new metro line between Chengdu and Longquan (a satellite city) is used as the research subject. Based on the MNL model, the influence of different experimental designs on model parameter estimation is examined. We also examine whether a design generated specifically for the MNL model can yield reliable parameter estimates in the more complex ML model. At present, the modes of transportation between the two cities are bus, taxi, private car, and coach. With the continued development of the two cities, Chengdu plans to introduce the metro to meet the diverse travel needs of its residents.
The purpose of the stated preference questionnaire is to investigate travelers’ choice preferences once the metro is introduced. Therefore, the choice set in the questionnaire contains five alternatives: metro, bus, taxi, private car, and coach. The alternative attributes are “arrival time” (the time needed to reach the station), “waiting time,” “cost,” “in-vehicle time,” and “off-vehicle time” (the travel time from leaving the vehicle to the destination). The resulting alternative attributes and attribute levels are presented in Table
The definition table of the alternative attribute levels.
Alternative  Attribute  Level 1  Level 2  Level 3
Bus  In-vehicle time (min)  60  90  120
Bus  Cost (¥)  2  4
Bus  Waiting time (min)  5  10
Bus  Arrival time (min)  5  10
Bus  Off-vehicle time (min)  5  10
Coach  In-vehicle time (min)  40  60
Coach  Cost (¥)  6  8  10
Coach  Waiting time (min)  15
Coach  Arrival time (min)  20
Coach  Off-vehicle time (min)  15
Metro  In-vehicle time (min)  30  40
Metro  Cost (¥)  4  6
Metro  Waiting time (min)  3
Metro  Arrival time (min)  10  30
Metro  Off-vehicle time (min)  10
Taxi  In-vehicle time (min)  40  60
Taxi  Cost (¥)  50  70
Taxi  Waiting time (min)  5
Car  In-vehicle time (min)  40  60
Car  Cost (¥)  20  40
Orthogonal design was used to design the stated preference questionnaire. According to the alternative attributes and the number of attributes shown in Table
The theoretical bases of the D-efficient and Bayesian D-efficient designs are the same; the only difference lies in the form of the prior parameters. In the D-efficient design the prior values are fixed, whereas in the Bayesian D-efficient design they follow a normal distribution. As stated earlier, prior information must be obtained before the D-efficient design can be applied. Here, the MNL parameter estimates obtained from the pilot survey data are used as the prior information for both designs. The questionnaire for the pilot survey was itself generated by the orthogonal design, and the pilot survey yielded 192 observations. The MNL model parameter values obtained from the pilot survey data are shown in Table
The prior values of the attribute parameters.
Attributes  Prior parameter (D-efficient)  Prior parameter (Bayesian D-efficient)
In-vehicle time  −0.036 (−4.458)  −0.036
Cost  −0.038 (−2.058)  −0.038
Waiting time  −0.027 (−0.361)  −0.027
Arrival time  −0.117 (−6.874)  −0.117
Off-vehicle (egress) time  −0.048 (−0.604)  −0.048
According to the prior values for the parameters shown in Table
The prior values of the attribute parameters for the Bayesian D-efficient design follow normal distributions whose means and variances differ across parameters. According to the prior values provided in Table
In this paper, three types of questionnaires were used in a face-to-face survey, which collected travelers’ preference data in different scenarios. Respondents were also asked to provide personal information such as age and income. The overall sample size was 960. The male-to-female ratio was 1.03, the average age was 36, and the average monthly income was ¥3556. Age was recorded in four intervals, of which the 25–50 range accounted for the highest proportion. Similarly, income was recorded in five ranges. The sample attributes for each questionnaire are shown in Table
Statistical results of the surveyed sample population.
Characteristic  Orthogonal (count, share)  D-efficient (count, share)  Bayesian D-efficient (count, share)
Sex
Male: 1  162  50.6%  164  51.2%  163  50.9%
Female: 0  158  49.4%  156  48.8%  157  49.1%
Car ownership
Yes: 1  100  31.3%  104  32.5%  102  31.8%
No: 0  220  68.7%  216  67.5%  218  68.2%
Age
18~24  55  17.2%  56  17.5%  55  17.2%
25~50  251  78.4%  250  78.1%  252  78.7%
51~60  8  2.5%  8  2.5%  8  2.5%
>60  6  1.9%  6  1.9%  5  1.6%
Income (¥ per month)
Below 2000  54  16.9%  55  17.2%  54  16.9%
2000~4000  102  31.9%  103  32.2%  103  32.2%
4001~6000  95  29.7%  97  30.3%  96  30.0%
6001~8000  42  13.1%  40  12.5%  41  12.8%
Above 8000  27  8.4%  25  7.8%  26  8.1%
Total  320  100%  320  100%  320  100%
First, we analyze whether a difference exists in the variance of the error term for the three types of experimental designs. The model result is shown in Table
The NL model results based on the three experimental design methods.
Variables  Parameter  Standard deviation (σ)  t-value
Cost  −0.024  0.0036  −6.685
Waiting time  −0.123  0.0164  −0.752
Arrival time  −0.095  0.0034  −27.922
In-vehicle time  −0.022  0.0014  −15.550
Off-vehicle time  −0.013  0.0165  −0.764
Scale parameters of the three design nests  1.0  1.0  1.0
Number of observations  3840
Log-likelihood (LL)  −4010.982
ρ²  0.351
Table
The MNL model results based on the different experimental design methods.
Variables  Orthogonal: parameter (t-value)  Orthogonal: standard deviation (σ)  D-efficient: parameter (t-value)  D-efficient: standard deviation (σ)  Bayesian D-efficient: parameter (t-value)  Bayesian D-efficient: standard deviation (σ)
Cost  −0.033 (−4.870)  0.0067  −0.018 (−2.882)  0.0063  −0.023 (−3.788)  0.0062
Arrival time  −0.102 (−16.012)  0.0063  −0.096 (−16.07)  0.0060  −0.096 (−16.376)  0.0058
Waiting time  −0.044 (−1.480)  0.0297  −0.003 (−0.120)  0.0297  −0.014 (−0.498)  0.0281
In-vehicle time  −0.032 (−10.580)  0.0031  −0.031 (−11.004)  0.0028  −0.013 (−5.884)  0.0022
Off-vehicle time  −0.001 (−0.031)  0.0306  −0.004 (−0.168)  0.0296  −0.018 (−0.673)  0.0280
Number of observations  1280  1280  1280
Log-likelihood (LL)  −1354.658  −1302.555  −1294.455
ρ²  0.140  0.145  0.189
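The standard deviations reported in the MNL table above can be compared directly. The short sketch below computes, for each attribute, the percentage reduction in the standard deviation relative to the orthogonal design; the numbers are copied from the table, while the variable and function names are illustrative assumptions.

```python
# Standard deviations of the MNL estimates: (orthogonal, D-efficient, Bayesian D-efficient)
STD_DEV = {
    "cost": (0.0067, 0.0063, 0.0062),
    "arrival time": (0.0063, 0.0060, 0.0058),
    "waiting time": (0.0297, 0.0297, 0.0281),
    "in-vehicle time": (0.0031, 0.0028, 0.0022),
    "off-vehicle time": (0.0306, 0.0296, 0.0280),
}

def reduction(base, new):
    """Percentage reduction of `new` relative to `base`."""
    return 100.0 * (base - new) / base

for name, (orth, d_eff, bayes) in STD_DEV.items():
    print(f"{name:17s} D-efficient: {reduction(orth, d_eff):5.1f}%   "
          f"Bayesian D-efficient: {reduction(orth, bayes):5.1f}%")
```

For every attribute the Bayesian D-efficient value is no larger than the D-efficient value, which in turn is no larger than the orthogonal value.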
The D-efficient and Bayesian D-efficient designs can thus achieve their theoretical goal in a real situation, namely, minimizing the standard deviations of the estimated model parameters. To further validate these two experimental designs in practice, the standard deviations of the MNL parameter estimates based on the three experimental designs were compared under different sample sizes, as shown in Figure
The standard deviation of the parameter estimates for different sample sizes. (a) Cost, (b) in-vehicle time, (c) waiting time, (d) arrival time, and (e) off-vehicle time.
From Figure
In this paper, the data were collected with the three stated preference questionnaires, which were generated specifically for the MNL model. The same survey data were used to estimate the ML model parameters. The ML model results are shown in Table
The ML model results based on the different experimental design methods.
Variables  Orthogonal: parameter (t-value)  Orthogonal: standard deviation (σ)  D-efficient: parameter (t-value)  D-efficient: standard deviation (σ)  Bayesian D-efficient: parameter (t-value)  Bayesian D-efficient: standard deviation (σ)
Cost (mean)  −0.053 (−4.265)  0.0125  −0.042 (−3.111)  0.0136  −0.033 (−2.947)  0.0112
Cost (standard deviation)  0.042 (3.440)  0.0121  0.042 (3.505)  0.012  0.025 (1.794)  0.0141
Arrival time (mean)  −0.114 (−13.607)  0.0083  −0.104 (−13.577)  0.0076  −0.229 (−6.767)  0.0339
Arrival time (standard deviation)  0.005 (0.350)  0.1465  0.0011 (0.034)  0.0318  0.169 (4.845)  0.0349
Waiting time  −0.054 (−1.675)  0.0322  −0.005 (−0.166)  0.0321  −0.021 (−0.598)  0.0357
In-vehicle time (mean)  −0.038 (−8.136)  0.0047  −0.033 (−8.536)  0.0038  −0.013 (−4.925)  0.0027
In-vehicle time (standard deviation)  0.013 (2.337)  0.0055  0.007 (1.035)  0.007  0.004 (0.826)  0.0047
Off-vehicle time  0.008 (0.250)  0.0328  −0.008 (−0.249)  0.0321  −0.015 (−0.459)  0.0345
Number of observations  1280  1280  1280
Log-likelihood (LL)  −1350.245  −1297.157  −1279.329
ρ²  0.143  0.147  0.134
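An ML model with random coefficients has no closed-form choice probability; it is usually evaluated by averaging MNL probabilities over draws of the random coefficients. The sketch below illustrates that simulation under several assumptions that are not taken from the paper: the random coefficients are drawn from independent normal distributions, the two rows reported for each random attribute in the table above are read as the mean and the standard deviation of its coefficient, alternative-specific constants are omitted, and the bus and metro attribute levels are one hypothetical scenario built from the attribute-level table. The coefficient values come from the orthogonal-design column.

```python
import math
import random

def mnl_probs(utilities):
    """Logit probabilities for a list of utilities."""
    top = max(utilities)  # subtract the maximum for numerical stability
    expu = [math.exp(u - top) for u in utilities]
    total = sum(expu)
    return [e / total for e in expu]

def simulated_ml_probs(alternatives, means, sds, draws=1000, seed=7):
    """Average MNL probabilities over normal draws of the random coefficients."""
    rng = random.Random(seed)
    sums = [0.0] * len(alternatives)
    for _ in range(draws):
        # Attributes missing from `sds` keep a fixed coefficient (standard deviation 0).
        beta = {name: rng.gauss(means[name], sds.get(name, 0.0)) for name in means}
        utils = [sum(beta[name] * level for name, level in alt.items())
                 for alt in alternatives]
        for i, p in enumerate(mnl_probs(utils)):
            sums[i] += p
    return [s / draws for s in sums]

# Orthogonal-design column of the ML table.
MEANS = {"cost": -0.053, "arrival": -0.114, "waiting": -0.054,
         "in_vehicle": -0.038, "off_vehicle": 0.008}
SDS = {"cost": 0.042, "arrival": 0.005, "in_vehicle": 0.013}

# Hypothetical scenario (minutes and yuan): bus first, metro second.
BUS = {"cost": 2, "arrival": 5, "waiting": 5, "in_vehicle": 60, "off_vehicle": 5}
METRO = {"cost": 4, "arrival": 10, "waiting": 3, "in_vehicle": 30, "off_vehicle": 10}

print(simulated_ml_probs([BUS, METRO], MEANS, SDS))
```

Setting every standard deviation to zero collapses the simulation to the ordinary MNL probabilities, which is one way to see that the MNL model is a special case of the ML model.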
In the ML model, “cost,” “in-vehicle time,” and “arrival time” are set as the random variables in the utility functions and are subject to the
Based on the above analysis, the D-efficient and Bayesian D-efficient designs can also achieve good parameter estimation results in the ML model. However, a comparison of the standard deviations obtained from the two designs in the ML model does not reproduce the conclusions drawn for the MNL model, and some parameters, for example waiting time, show a degree of bias. To further investigate whether the theoretical advantages of these designs hold in different models under different sample sizes, the standard deviations of the estimated parameters of the MNL and ML models were compared based on the D-efficient design. As shown in Figure
The standard deviation for the estimated parameter of the alternative attribute under different sample sizes. (a) Cost, (b) in-vehicle time, (c) waiting time, (d) arrival time, and (e) off-vehicle time.
The above analysis shows that the Bayesian D-efficient design gave the most reliable parameter estimates. Therefore, the survey data obtained from the Bayesian D-efficient design were chosen to calibrate the MNL model in order to analyze traveler behavior in the Chengdu-Longquan corridor. The model results are shown in Table
The MNL model results for analyzing the choice behavior from Chengdu to Longquan.
Variables  Parameter  t-value  Variables  Parameter  t-value
Passenger bus constant  5.163  2.734  Sex-taxi (male: 1)  −0.027  −0.047
Public bus constant  5.956  5.571  Sex-metro (male: 1)  −1.327  −5.255
Taxi constant  0.005  0.003  Age-passenger bus  −1.215  −1.182
Metro constant  7.866  7.628  Age-public bus  0.538  1.119
Cost  −0.028  −2.795  Age-taxi  −0.553  −0.751
Arrival time  −0.127  −17.106  Age-metro  −0.392  −0.856
Waiting time  −0.016  −0.486  Income-passenger bus  −0.225  −0.330
In-vehicle time  −0.024  −8.213  Income-public bus  −0.828  −4.197
Off-vehicle time  −0.018  −0.515  Income-taxi  1.138  4.371
Sex-passenger bus (male: 1)  −1.001  −0.515  Income-metro  −0.166  −1.082
Sex-public bus (male: 1)  −0.604  −2.130  Car ownership (yes: 1)  6.211  14.894
Log-likelihood (LL)  −737.560
Number of observations  1280
ρ²  0.481
The key factors affecting traveler choice behavior in the Chengdu-Longquan corridor are “cost,” “in-vehicle time,” and “arrival time.” Currently, the distance between Chengdu and Longquan is 23 kilometers, but the average in-vehicle time is relatively long for the bus (approximately 1 hour) and is 40 minutes for the coach. Neither of these travel modes is therefore time efficient, which makes “in-vehicle time” a significant factor. In the corridor at present, the number of bus lines is limited and their distribution is uneven. The coach is also inconvenient because a traveler must go to particular stations to board it. Likewise, the Chengdu metro has not yet formed a network, so riding the metro still carries significant limitations. As a result, “arrival time” is a key factor in traveler choice behavior. In the model results, most of the alternative-specific constants are highly significant, showing that factors that cannot be quantified also affect traveler choice behavior.
The personal attributes of travelers also significantly affect their choice behavior. Men show a weaker preference for public transport, such as the bus and metro, and prefer private cars instead. High-income groups tend to choose private cars and taxis. In particular, private car owners tend to travel by private car, while low-income groups are more inclined to use public transport, which has relatively low costs.
Finally, the model predicts the following mode shares for the five alternatives in the corridor after the metro opens: 0.48% for coach, 24.47% for bus, 1.64% for taxi, 28.69% for private car, and 44.72% for the metro. The metro is thus expected to carry a large part of the travel demand, demonstrating its strong appeal for travelers in this corridor.
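Mode shares of this kind follow from the MNL probability formula, in which each alternative's share is the exponential of its utility divided by the sum over all alternatives. The sketch below applies that formula to the generic coefficients and alternative-specific constants of the final MNL model for a single hypothetical scenario built from the attribute-level table. It rests on assumptions that are not stated in the paper: "passenger bus" is read as the coach and "public bus" as the bus, the private car is the reference alternative with a constant of zero, and all socio-demographic terms are ignored. It therefore illustrates the calculation only and is not expected to reproduce the shares reported above.

```python
import math

# Generic coefficients and alternative-specific constants from the final MNL model.
COEF = {"cost": -0.028, "arrival": -0.127, "waiting": -0.016,
        "in_vehicle": -0.024, "off_vehicle": -0.018}
CONSTANT = {"coach": 5.163, "bus": 5.956, "taxi": 0.005, "metro": 7.866,
            "car": 0.0}  # car as reference alternative: an assumption

# Hypothetical scenario (minutes and yuan); attributes not defined for a mode are omitted.
SCENARIO = {
    "coach": {"in_vehicle": 40, "cost": 6, "waiting": 15, "arrival": 20, "off_vehicle": 15},
    "bus": {"in_vehicle": 60, "cost": 2, "waiting": 5, "arrival": 5, "off_vehicle": 5},
    "taxi": {"in_vehicle": 40, "cost": 50, "waiting": 5},
    "metro": {"in_vehicle": 30, "cost": 4, "waiting": 3, "arrival": 10, "off_vehicle": 10},
    "car": {"in_vehicle": 40, "cost": 20},
}

def mnl_shares(scenario):
    """Return the MNL choice probability of each alternative in the scenario."""
    utils = {alt: CONSTANT[alt] + sum(COEF[k] * v for k, v in attrs.items())
             for alt, attrs in scenario.items()}
    top = max(utils.values())  # subtract the maximum for numerical stability
    expu = {alt: math.exp(u - top) for alt, u in utils.items()}
    total = sum(expu.values())
    return {alt: e / total for alt, e in expu.items()}

# Shares reported in the text (percent); they should add up to 100.
REPORTED = {"coach": 0.48, "bus": 24.47, "taxi": 1.64, "car": 28.69, "metro": 44.72}

for alt, share in mnl_shares(SCENARIO).items():
    print(f"{alt:6s} {100 * share:6.2f}%")
print("sum of reported shares:", round(sum(REPORTED.values()), 2))
```

A forecast for the whole corridor would average such probabilities over all respondents, each with their own attribute levels and personal characteristics.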
The experimental design is a key factor affecting the reliability of parameter estimation in discrete choice models. New experimental designs such as the D-efficient design have been proposed, but there has been insufficient empirical research on their effectiveness and little comparative analysis against the traditional design. In this paper, three stated preference questionnaires were designed, one for each of the three experimental designs, and used to collect travelers’ preference data. Based on these data, we analyzed whether the variance of the error term differs among the three experimental designs by means of the nesting structure of an NL model. We found no significant difference among the three experimental designs with respect to the variance of the error term. This finding supports the conclusion of Bliemer and Rose.
The MNL parameter estimates based on the preference data were then compared, and the results demonstrate the validity of the D-efficient and Bayesian D-efficient designs. In the Bayesian design, the prior parameters follow a distribution, which reflects the analyst’s uncertainty about the true parameters. In the D-efficient design, the prior parameters are fixed, which assumes that the analyst knows the true parameters exactly, although the true population parameters are never known accurately. Because its prior distribution may contain the true population parameters, the Bayesian design performs better than the D-efficient design.
According to the ML model results, designs generated specifically for the MNL model may not be fully applicable to the more advanced ML model, in which they did not consistently outperform the orthogonal design. This conclusion is inconsistent with the conclusions that Rose and Bliemer reached on the basis of model simulation. Comparing the parameter estimates of the MNL and ML models based on the D-efficient design under different sample sizes, we find that the theoretical advantages of the D-efficient design are not fully realized in the ML model. However, this paper considers only the ML model; other advanced models are not examined.
The preference data obtained from the Bayesian D-efficient design were used to estimate the MNL model and to analyze traveler choice behavior in the Chengdu-Longquan corridor. Attributes such as “cost,” “arrival time,” “in-vehicle time,” and “income” significantly affect traveler choice behavior. These results indicate that the bus falls short in convenience and timeliness. Finally, we predict that the metro will carry over 44% of passenger flow once it begins operation.
The authors declare that there is no conflict of interests regarding the publication of this paper.
This work was supported by National Science Foundation of China under Grant nos. 50908195 and 51178403, Specialized Research Fund for the Doctoral Program of Higher Education (no. 20130184110020) and the Fundamental Research Funds for the Central Universities (no. SWJTU11CX080 and no. 2682014CX130), Key Laboratory of Road and Traffic Engineering of the Ministry of Education, Tongji University (no. K201207), Program for New Century Excellent Talents in University (NCET130977), The National Basic Research Program of China (973 Program no. 2012CB725405), and the Science and Technology Innovation Practice Program for Graduate Student, Southwest Jiaotong University (no. YC201407119), and Science & Technology Department of Sichuan Province (no. 2014RZ0037).