Logistic regression models have been widely used in previous studies to analyze public transport utilization. These studies have shown travel time to be an indispensable variable for such analysis and usually treat it as deterministic. This formulation cannot capture travelers’ perception error regarding travel time, and recent studies have indicated that this error can have a significant effect on modal choice behavior. In this study, we propose a logistic regression model with a hierarchical random error term. The proposed model adds a new random error term for the travel time variable. This structure enables us to investigate travelers’ perception error regarding travel time from a given choice behavior dataset. We also propose an extended model that allows the sign of this error to be constrained. We develop two Gibbs samplers to estimate the basic hierarchical model and the extended model. The performance of the proposed models is examined using a well-known dataset.
Understanding the utilization of public transport is important for policy design and urban traffic planning. From a behavior analysis perspective, such utilization can be analyzed as a binary choice problem in which the traveler must choose between public transit and a private mode of transport. Previous studies usually employ logistic regression models to study this binary choice problem. These models can be used to predict choice probability and to evaluate the effect of various attributes on the utilization of public transport. The effects of demographics, travel cost, travel time, and accessibility on such utilization can be analyzed by estimating the parameters of these models.
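As a minimal illustration of how a binary logit model turns a linear combination of attributes into a choice probability, consider the sketch below. The coefficient values and the two example travelers are hypothetical and are not taken from any dataset; the function name `choice_probability` is also an assumption introduced only for this example.

```python
import numpy as np

def choice_probability(x, beta):
    """P(choice = 1 | x) under a binary logit model with linear utility x @ beta."""
    utility = np.asarray(x, dtype=float) @ np.asarray(beta, dtype=float)
    return 1.0 / (1.0 + np.exp(-utility))

# Hypothetical coefficients: intercept, cost difference, travel-time difference.
beta = np.array([0.5, 0.02, 0.05])

# Two hypothetical travelers; the first column is the constant 1 for the intercept.
x = np.array([[1.0, 0.0, 0.0],
              [1.0, 10.0, 20.0]])

p = choice_probability(x, beta)
print(p)
```

Because the utility is linear, each coefficient can be read as the change in log-odds of choosing alternative 1 per unit change in the corresponding attribute.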
McGillivray [
One benefit of using a logistic regression model to analyze the utilization of public transport is that the model considers the combined effects of attributes through a linear combination, so the contribution of each attribute to such utilization can be evaluated easily. Previous studies agree that travel time is an important variable when formulating a logistic regression model for this binary choice problem. Although previous studies usually treat travel time as a deterministic variable in the model, one cannot neglect that perception error can prevent travelers from accurately evaluating their actual travel time. Moreover, travelers cannot say exactly how long a given travel time, for instance, 10 minutes, actually is. Carrion [
As mentioned above, researchers have recognized the effect of perception errors regarding travel time on travelers’ choice behavior. However, the classic formulation of a random error term in logistic regression models cannot reflect this perception error appropriately (Chen et al. [
We first propose a basic hierarchical model and then develop an extended model. The extended model allows us to constrain the sign of the perception error regarding travel time. Correspondingly, we develop two Gibbs samplers to estimate the parameters of the proposed models. We evaluate the performance of the proposed models using a well-known dataset provided by Horowitz [
For simplicity, we describe the proposed model based on a binary choice problem that was provided by Horowitz [
DCOST is “public transport fare minus private car travel cost,”
CARS is “private cars owned by the traveler’s household,”
DOVTT is “public transport out-of-vehicle time minus private car out-of-vehicle time,”
DIVTT is “public transport in-vehicle time minus private car in-vehicle time.”
We formulate the choice problem through a logistic regression model. If we present this logistic regression model as a latent-variable model, then the model can be obtained as follows:
The logistic regression model shown by (
As shown by (
The DAG of the parameters of the hierarchical logistic regression model.
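To make the idea of a random error on the travel-time variable concrete, the sketch below simulates choices from a latent-variable logit model twice: once with the actual in-vehicle time difference and once with a perceived value. The additive Gaussian form of the perception error, the value of `sigma`, and the coefficients are assumptions chosen only for illustration; they are not the specification or the estimates of the proposed model.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000

# Hypothetical attribute: transit minus car in-vehicle time, in minutes.
divtt = rng.uniform(-59, 102, size=n)

# Assumption: each traveler perceives DIVTT with an additive Gaussian error.
sigma = 15.0  # hypothetical perception-error standard deviation
perceived_divtt = divtt + rng.normal(0.0, sigma, size=n)

# Hypothetical coefficients and a shared standard logistic error term.
beta0, beta_time = -1.0, 0.05
eps = rng.logistic(0.0, 1.0, size=n)

# Latent-variable form: choose the car (1) when the latent utility is positive.
y_actual = (beta0 + beta_time * divtt + eps > 0).astype(int)
y_perceived = (beta0 + beta_time * perceived_divtt + eps > 0).astype(int)

# Share of travelers whose choice changes once perception error is introduced.
share_changed = np.mean(y_actual != y_perceived)
print(share_changed)
```

Holding `eps` fixed across the two runs isolates the effect of the perception error from that of the ordinary logistic error.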
First, we discuss how to estimate parameters of the model shown by (
We develop a Gibbs sampler to draw the samples of
Set
The set of conditional distributions
for
for
end
end
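The nested-loop structure of a Gibbs sampler (an outer loop over sweeps, an inner loop over parameter blocks, each drawn from its full conditional distribution) can be illustrated with a minimal sketch. The target here is a standard bivariate normal with correlation `rho`, chosen only because its conditionals are known in closed form; it is a stand-in, not the set of conditional distributions of the hierarchical model.

```python
import numpy as np

def gibbs_bivariate_normal(rho, n_iter, rng):
    """Gibbs sampler for a standard bivariate normal with correlation rho."""
    theta = np.zeros(2)                   # initial values
    draws = np.empty((n_iter, 2))
    cond_sd = np.sqrt(1.0 - rho ** 2)
    for t in range(n_iter):               # outer loop: one sweep per iteration
        for j in range(2):                # inner loop: update one block at a time
            other = theta[1 - j]
            # Full conditional: theta_j | theta_other ~ N(rho * other, 1 - rho^2)
            theta[j] = rng.normal(rho * other, cond_sd)
        draws[t] = theta
    return draws

rng = np.random.default_rng(1)
draws = gibbs_bivariate_normal(rho=0.6, n_iter=20000, rng=rng)
kept = draws[5000:]                       # discard the initial sweeps as burn-in
sample_corr = np.corrcoef(kept.T)[0, 1]
print(sample_corr)
```

In a model with more blocks, the inner loop would iterate over each parameter group and call the corresponding conditional sampler in turn.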
The conditional distribution
The conditional distribution
The conditional distribution
The conditional distribution
Applying Bayes’ theorem, we obtain
Now, we discuss how to estimate the parameters of the extended model described by (
We derive the formulation of
Draw
If
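The draw-then-accept pattern of a Metropolis-Hastings (MH) step under a sign constraint can be sketched generically as follows. The target density (a normal restricted to negative values), the random-walk proposal, and the step size are assumptions made for illustration only; in a Gibbs sampler with an MH step, a function like `mh_step` would replace the direct draw for the one block whose conditional is not available in closed form.

```python
import numpy as np

def log_target(theta):
    """Unnormalized log density: N(-1, 0.5^2) restricted to theta < 0 (illustrative)."""
    if theta >= 0:
        return -np.inf                    # sign constraint: zero density
    return -0.5 * ((theta + 1.0) / 0.5) ** 2

def mh_step(theta, step, rng):
    """One random-walk MH update of a scalar parameter."""
    proposal = theta + rng.normal(0.0, step)           # draw a candidate
    log_ratio = log_target(proposal) - log_target(theta)
    if np.log(rng.uniform()) < log_ratio:              # accept with prob. min(1, ratio)
        return proposal
    return theta                                       # otherwise keep the current value

rng = np.random.default_rng(2)
theta = -1.0                              # start inside the admissible region
chain = np.empty(20000)
for t in range(chain.size):
    theta = mh_step(theta, step=0.5, rng=rng)
    chain[t] = theta
kept = chain[5000:]                       # discard burn-in
print(kept.mean(), kept.max())
```

Candidates that violate the sign constraint have zero target density, so they are always rejected and the chain never leaves the admissible region.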
The proposed model structure also allows us to investigate travelers’ perception error regarding both DIVTT and DOVTT. To do this, we only need to modify (
To estimate the parameters of the proposed model, we can apply the sampling algorithm to draw random samples for
In this section, we use a modal choice dataset provided by Horowitz [
Data structure of the dataset from Horowitz (1993) [
Attribute  Scale 
CARS  0~7 
DCOST  −111~89 
DOVTT  −6~48 
DIVTT  −59~102 
CHOICE  CAR = 1, TRANSIT = 0 
Number of samples  842 
We first estimate the hierarchical model defined by (
Estimates of the parameters of the basic model defined by (
Parameter  Value  Lower 95% CI  Upper 95% CI  SD 



−1.1731  −1.179  −1.1671  0.3023  <0.0001 

2.3149  2.3105  2.3193  0.2233  <0.0001 

0.0172  0.0171  0.0173  0.0038  <0.0001 

0.0620  0.0617  0.0624  0.0182  <0.0001 

0.0096  0.0095  0.0098  0.0096  <0.0001 

−3.5245  −3.5521  −3.4970  1.9869  <0.0001 


DIC  465.764  
Number of observations  894 
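For reference, the deviance information criterion (DIC) is commonly computed from posterior draws as the posterior mean deviance plus the effective number of parameters, where the latter is the mean deviance minus the deviance at the posterior mean. The sketch below applies that standard definition to a plain binary logit likelihood with synthetic data; the data, the stand-in "posterior draws", and the function names are assumptions, and the likelihood of the hierarchical model would differ.

```python
import numpy as np

def deviance(beta, X, y):
    """-2 * log-likelihood of a binary logit model."""
    eta = X @ beta
    # log-likelihood: sum of y * eta - log(1 + exp(eta)), computed stably
    loglik = np.sum(y * eta - np.logaddexp(0.0, eta))
    return -2.0 * loglik

def dic(beta_draws, X, y):
    """Return (DIC, p_D) from an array of posterior draws, one row per draw."""
    d_bar = np.mean([deviance(b, X, y) for b in beta_draws])
    d_at_mean = deviance(beta_draws.mean(axis=0), X, y)
    p_d = d_bar - d_at_mean               # effective number of parameters
    return d_bar + p_d, p_d

# Synthetic data and stand-in draws (not output of the samplers in this paper).
rng = np.random.default_rng(3)
n = 200
X = np.column_stack([np.ones(n), rng.uniform(-60, 100, size=n)])
beta_true = np.array([-1.0, 0.05])
y = (rng.uniform(size=n) < 1.0 / (1.0 + np.exp(-(X @ beta_true)))).astype(float)
beta_draws = beta_true + rng.normal(0.0, [0.2, 0.01], size=(1000, 2))

dic_value, p_d = dic(beta_draws, X, y)
print(dic_value, p_d)
```

When two models are fitted to the same data, the one with the smaller DIC is usually preferred.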
We estimate the parameters of the extended model through the sampling scheme with an MH (Metropolis-Hastings) step. We draw 20,000 samples of the parameters and discard the first 5,000 samples as burn-in. The DIC of the extended model is 461.687, lower than the 465.764 obtained for the basic model. This result indicates that the performance of the extended model is better than that of the basic model defined by (
The samples of the parameters are used to investigate the shapes of the distributions of the parameters. Table
Estimates of the parameters of the extended hierarchical model defined by (
Parameter  Value  Lower 95% CI  Upper 95% CI  SD 



−1.2490  −1.2531  −1.2448  0.3012  <0.0001 

2.3322  2.3292  2.3352  0.2161  <0.0001 

0.0174  0.0174  0.0175  0.0038  <0.0001 

0.0590  0.0587  0.0592  0.0190  <0.0001 

0.0156  0.0155  0.0158  0.0101  <0.0001 

−1.022  −1.0234  −1.0206  0.1026  <0.0001 


DIC  461.687  
Number of observations  894 
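Summaries of this kind can be computed from the retained draws of a chain after the burn-in samples are discarded. The sketch below is one possible way to do so, using a percentile-based 95% interval; the chain is a synthetic stand-in, the helper name `summarize` is an assumption, and the interval definition used for the tables in this paper may differ.

```python
import numpy as np

def summarize(draws, burn_in):
    """Posterior mean, SD, and 95% percentile interval after discarding burn-in."""
    kept = np.asarray(draws, dtype=float)[burn_in:]
    lower, upper = np.percentile(kept, [2.5, 97.5])
    return {
        "mean": kept.mean(),
        "sd": kept.std(ddof=1),
        "lower": lower,
        "upper": upper,
        "n": kept.size,
    }

# Synthetic stand-in for 20,000 draws of one parameter.
rng = np.random.default_rng(4)
fake_chain = rng.normal(-1.0, 0.1, size=20000)

summary = summarize(fake_chain, burn_in=5000)
print(summary)
```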
The histogram for
The histogram for
Beyond the improvement in performance, the main contribution of the proposed model is that it allows us to capture the perception error regarding travel time and to analyze its characteristics. Let us look at the extended model. As shown by Table
To further investigate the property of
Mean of the samples of
Variance of the samples of
One can find that the mean of the samples of
To investigate travelers’ perception error regarding both DIVTT and DOVTT, we also use Algorithm
Estimates of the parameters of the extended hierarchical model defined by (
Parameter  Value  Lower 95% CI  Upper 95% CI  SD 



−1.3607  −1.8495  −0.8590  0.3005  <0.0001 

2.2569  1.9034  2.6305  0.2131  <0.0001 

0.0149  0.0086  0.0213  0.0039  <0.0001 

0.0907  0.0569  0.1250  0.0209  <0.0001 

0.0082  −0.0072  0.0247  0.0097  0.1973 

−0.4310  −0.5229  −0.3363  0.0559  <0.0001 

−0.5801  −0.6804  −0.4785  0.0558  <0.0001 


DIC  436.240  
Number of observations  894 
This study proposes a logistic regression model with a hierarchical random error term to analyze the binary choice problem. The proposed model can account for travelers’ perception errors regarding attributes. Since a number of studies have shown perception error regarding travel time to have a significant impact on modal choice, this study focuses in particular on how to capture this error from behavior data.
We construct a hierarchical random error term structure in the logistic regression model by introducing a random error term for the travel time variable. In the proposed model, travel time is no longer a deterministic variable. To make the model more sensible, we also propose an extended model, where the sign of the DIVTT and DOVTT variables can be constrained. We develop a Gibbs sampler to estimate the basic hierarchical model and a Gibbs sampler with an MH step to estimate the parameters of the extended model. The binary choice dataset provided by Horowitz [
The authors declare that there is no conflict of interests regarding the publication of this paper.
This work is jointly supported by the National Basic Research Program of China (no. 2012CB725403), the Fundamental Research Funds for the Central Universities (no. 2014JBM056), and the National Natural Science Foundation of China (no. 51408035).