Crossing at a Red Light: Behavior of Cyclists at Urban Intersections

To investigate the relationship between cyclist violation and waiting duration, the red-light running behavior of nonmotorized vehicles is examined at signalized intersections. Waiting durations are collected by video cameras and are coded as censored or uncensored data to distinguish normal crossing from red-light running. A proportional hazard-based duration model is introduced, and variables describing personal characteristics and traffic conditions are used to capture the effects of internal and external factors. Empirical results show that the red-light running behavior of cyclists is time dependent. Cyclists' violating behavior exhibits positive duration dependence: the longer the waiting time that has elapsed, the more likely cyclists are to end the wait soon. About 32% of cyclists are at high risk of violation and wait only a short time before crossing the intersection. About 15% of all cyclists are generally non-risk takers who still obey the traffic rules after waiting for 95 seconds. Human factors and the external environment play an important role in cyclists' violation behavior. Minimizing the effects of unfavorable conditions in traffic planning and design may be an effective measure to enhance traffic safety.


Introduction
The urban traffic problem is widely recognized as one of the main maladies of life in large cities. Many scholars have paid much attention to the traffic problem [1, 2]. A mix of nonmotorized and motorized vehicles is an important traffic type in China. Some surveys show that the nonmotorized vehicle is one of the most widely used means of transport in Chinese daily travel [3]. At present, cycling still accounts for a large share of all travel modes in China, and in Tianjin the share exceeds 60% [4]. Even in developed countries, bicycle travel is recognized as consuming little energy, being healthy for its users, and not damaging the health of others. Meanwhile, the electric bike (e-bike) has emerged as a popular mode of transportation in many large cities in recent years. E-bike use has rapidly expanded in China, in the process changing the mode split of many cities. Currently, China produces over 20 million e-bikes yearly, up from a few thousand a decade ago [5]. E-bikes in China are defined as electric two-wheelers with relatively low speeds and weights compared to a motorcycle. Both bicycle-style e-bikes with functioning pedals and scooter-style e-bikes with many features of gasoline scooters are classified as bicycles and are given access to bicycle infrastructure.
However, the growing popularity of cycling also entails safety concerns, as observed in accident statistics. Accident analysis reveals that over 60% of fatal crashes involving cyclists result from violations of traffic rules [6]. One typical type of rule violation is crossing during the red period. Because of poor law enforcement and people's low safety awareness, violation during the red period is rather prevalent and represents a substantial safety problem at Chinese urban intersections. In particular, electric bicycles, with their relatively high speed, are likely to increase the risk of traffic incidents.
Previous research on drivers and pedestrians also points to several variables of interest regarding violation behaviors. Keegan and O'Mahony reported that pedestrians' street-crossing behavior is influenced by travel distance and waiting time [7]. Other researchers paid much attention to the influence of personal features on street-crossing behavior [8-10]. Useful reviews of the existing research on pedestrian street-crossing behavior in urban areas can be found in Ishaque and Noland [11] and Papadimitriou et al. [12].
Unfortunately, only a few studies have investigated the violation behavior of cyclists. Johnson et al. identified three distinct types of violating cyclists who are exposed to different levels of risk: racers, impatients, and runners [13]. Johnson et al. used video recordings to analyze urban commuter cyclists' violation behaviors in Melbourne, VIC, Australia [14]. The field observation approach was also used by researchers studying the influence of cycle paths on accident numbers [15]. In this paper, a hazard-based duration approach is adopted to describe cyclist violation behavior at signalized intersections. Hazard-based duration models have been used extensively in biometrics and reliability engineering for decades [16]. Duration models can be used to determine causality in duration data, and they are also useful tools in the field of transportation [17-21]. These models are a class of analytical methods that describe the duration of a certain state and how various factors affect that duration. More importantly, duration models can deal with not only uncensored data but also censored data. For example, the exact waiting duration reflecting a cyclist's endurance cannot be observed if the cyclist waits until the signal permits crossing. Most other statistical models are unable to analyze such censored data. Accordingly, cyclists' waiting times are modeled by a proportional hazard-based duration model. The covariates relevant to traffic conditions and personal features are investigated to capture the factors that influence cyclist behavior. The results identify the times at which cyclists are prone to violate traffic rules and the factors that significantly influence waiting behavior. The findings suggest some effective countermeasures for improving road safety at urban intersections.

Duration Model
The variable of interest in a duration model is the survival time that elapses from the beginning of an event until its end. The waiting time of a cyclist at a red light can be regarded as a waiting duration that starts when the cyclist arrives at the intersection during the red period and ends when the cyclist begins to cross the intersection.
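Let $T$ be a nonnegative random variable representing the waiting duration of a cyclist at a signalized intersection. Let $f(t)$ denote the probability density function of $T$, and let the cumulative distribution be

$$F(t) = P(T < t) = \int_0^t f(w)\,dw. \tag{2.1}$$

The survival function, that is, the probability that the waiting duration is at least $t$, is

$$S(t) = P(T \ge t) = 1 - F(t), \tag{2.2}$$

and the hazard function, the conditional rate at which a wait ends at time $t$ given that it has lasted until $t$, is

$$h(t) = \frac{f(t)}{S(t)} = -\frac{d \ln S(t)}{dt}. \tag{2.3}$$

Note that the waiting time of a cyclist is influenced by various factors. The influential factors can be defined as a vector of explanatory variables, $x = (x_1, x_2, \ldots, x_p)$. Accommodating the effects of these influential factors is a main objective of this paper. Thus the proportional hazard form is introduced, which specifies the effects of the explanatory variables to be multiplicative on the hazard function:

$$h(t \mid x) = h_0(t)\, g(x, \beta), \tag{2.4}$$

where $h_0(t)$ is called the baseline hazard function and can be interpreted as the hazard function when all covariates are ignored, $g(\cdot)$ is a known function representing the effects of the explanatory variables, and $\beta = (\beta_1, \beta_2, \ldots, \beta_p)$ is a vector of estimable coefficients for $x$. In this paper, the typical specification $g(x, \beta) = \exp(\beta x)$, with $\beta x = \beta_1 x_1 + \cdots + \beta_p x_p$, which was proposed by Cox [22], is used. The specification is convenient since it guarantees the positivity of the hazard function without placing constraints on the signs of the elements of $\beta$. The Cox proportional hazard model is

$$h(t \mid x) = h_0(t) \exp(\beta x). \tag{2.5}$$

Combining (2.3) and (2.5), the survival function can be written as

$$S(t \mid x) = \exp\left[-H_0(t) \exp(\beta x)\right], \tag{2.6}$$

where $H_0(t) = \int_0^t h_0(w)\,dw$ represents the baseline cumulative hazard function. Thus, the covariates can be incorporated into the survival function.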
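As a hedged illustration of how covariates enter the survival function under the proportional hazard form, the sketch below evaluates $S(t \mid x) = \exp[-H_0(t)\exp(\beta x)]$ for two hypothetical cyclists. The Weibull-type baseline cumulative hazard, the coefficient values, and the covariate vectors are assumptions made only for this example; they are not estimates from the paper.

```python
import numpy as np

def baseline_cum_hazard(t, scale=30.0, shape=1.2):
    """Hypothetical baseline cumulative hazard H0(t) = (t / scale) ** shape.

    The Weibull form and its parameters are illustrative assumptions.
    """
    return (np.asarray(t, dtype=float) / scale) ** shape

def survival(t, x, beta, H0=baseline_cum_hazard):
    """Survival probability S(t | x) = exp(-H0(t) * exp(beta . x))."""
    risk = np.exp(np.dot(beta, x))   # multiplicative covariate effect
    return np.exp(-H0(t) * risk)

# Made-up coefficients for two covariates (e.g. an indicator and an age).
beta = np.array([0.3, -0.02])
x_a = np.array([1.0, 25.0])          # cyclist with the indicator set
x_b = np.array([0.0, 25.0])          # same age, indicator not set

t = np.array([0.0, 10.0, 30.0, 60.0])
s_a = survival(t, x_a, beta)
s_b = survival(t, x_b, beta)
print(np.round(s_a, 3), np.round(s_b, 3))
```

Because the first coefficient is positive in this sketch, cyclist `a` has the higher hazard and therefore the lower survival curve at every positive waiting time; the two curves are related by `s_a = s_b ** exp(0.3)`.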

Model Estimation
The main interest of this paper is to identify, from the $p$ covariates, a subset of variables that affect the hazard more significantly and, consequently, the waiting duration at a signalized intersection. We are therefore concerned with the regression coefficients. If $\beta_i$ is zero, the corresponding covariate is not related to the waiting time. If $\beta_i$ is not zero, it represents the magnitude of the effect of $x_i$ on the hazard when the other covariates are considered simultaneously.
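To estimate the coefficients $\beta_1, \beta_2, \ldots, \beta_p$, a partial likelihood method is adopted. Suppose that $k$ of the duration times from $n$ cyclists are observed and distinct. Let $t_1 < t_2 < \cdots < t_k$ be the ordered $k$ distinct duration times with corresponding covariates $x_1, x_2, \ldots, x_k$. Let $R(t_i)$ be the risk set at time $t_i$; $R(t_i)$ consists of all cyclists whose duration times are at least $t_i$. For the particular duration time $t_i$, conditionally on the risk set $R(t_i)$, the probability is

$$\frac{\exp(\beta x_i)}{\sum_{j \in R(t_i)} \exp(\beta x_j)}. \tag{2.7}$$

Each distinct duration time contributes a factor, and hence the partial likelihood function is

$$L(\beta) = \prod_{i=1}^{k} \frac{\exp(\beta x_i)}{\sum_{j \in R(t_i)} \exp(\beta x_j)}, \tag{2.8}$$

and the log-partial likelihood is

$$l(\beta) = \sum_{i=1}^{k} \left[ \beta x_i - \ln \sum_{j \in R(t_i)} \exp(\beta x_j) \right]. \tag{2.9}$$

The overall goodness of fit of the model estimation is determined by the likelihood ratio (LR) statistic, which is specified as

$$\mathrm{LR} = -2\left[ l(\beta_0) - l(\beta) \right]. \tag{2.10}$$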
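The partial likelihood construction can be sketched in a few lines of NumPy. This is a minimal illustration, assuming distinct uncensored duration times (no tie handling) and taking the LR statistic as minus twice the difference between the null and fitted log-partial likelihoods; the function names and the toy data are hypothetical and do not come from the paper.

```python
import numpy as np

def log_partial_likelihood(beta, times, events, X):
    """Cox log-partial likelihood, assuming distinct event times.

    times  : waiting durations
    events : 1 if the wait ended in a violation (uncensored), 0 if censored
    X      : covariate matrix, one row per cyclist
    """
    beta = np.asarray(beta, dtype=float)
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=bool)
    X = np.asarray(X, dtype=float)
    eta = X @ beta                            # linear predictor beta . x_j
    total = 0.0
    for i in np.flatnonzero(events):          # only uncensored waits contribute
        at_risk = times >= times[i]           # risk set R(t_i)
        total += eta[i] - np.log(np.sum(np.exp(eta[at_risk])))
    return total

def lr_statistic(beta_hat, times, events, X):
    """LR = -2 * [l(0) - l(beta_hat)]."""
    null = log_partial_likelihood(np.zeros(len(beta_hat)), times, events, X)
    full = log_partial_likelihood(beta_hat, times, events, X)
    return -2.0 * (null - full)

# Toy data: four cyclists, one binary covariate, third wait censored.
times = [2.0, 5.0, 9.0, 14.0]
events = [1, 1, 0, 1]
X = [[1.0], [0.0], [1.0], [0.0]]

ll_null = log_partial_likelihood([0.0], times, events, X)
ll_half = log_partial_likelihood([0.5], times, events, X)
print(ll_null, ll_half)
```

With all coefficients at zero, each uncensored wait contributes minus the logarithm of the size of its risk set, which gives a quick sanity check on any implementation. In practice the coefficients would be obtained by maximizing this function numerically or with an established survival analysis package.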

Covariate Selection
The covariate selection takes into account previous research [13, 14] and arguments regarding the effects of exogenous variables on cyclist crossing behavior. The practical effects on waiting behavior and the feasibility of data acquisition are also considered. Two broad sets of variables are considered as covariates: personal characteristics and traffic conditions. Personal characteristics involve age and gender. The covariates for traffic conditions are chosen to capture effects related to waiting time and traffic volume. The covariates shown in Table 1 are adopted to construct the duration model.

Site Survey Design
To record cyclists' waiting durations, the whole red-light period of a signal cycle was observed as a data collection unit. Only cyclists who arrived during the red-light period were defined as valid samples. The waiting duration ran from the time a cyclist arrived at the crossing location to the time he/she began to cross. It can be classified into two kinds: uncensored data and censored data. Uncensored data were defined as waiting durations that ended within the red-light period (violating crossing). Otherwise, a waiting duration that ended within the green-light period (normal crossing) was treated as censored data. For censored data, the exact waiting duration that would reflect the cyclist's waiting endurance is unknown. The site survey was conducted at three selected signalized intersections near Jiaotong University in Beijing, China. Data were collected by placing video cameras at each location. The survey periods included peak hours (7:30 a.m.-9:30 a.m.) and off-peak hours (10:00 a.m.-4:00 p.m.). The survey area covered the zebra crossing and a part of the traffic lanes so that cyclist crossing behavior and the corresponding traffic conditions could be monitored clearly. Some additional explanations are needed for the site survey.
(a) The signals were of the old traditional type (person heads), so the influence of signal type could be neglected [7]. The selected sites had similar characteristics in terms of geometry, traffic conditions, and traffic control.
(b) The survey was conducted in good weather and in the absence of pointsmen. Cyclists were observed unobtrusively.
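The coding rule described in this section can be sketched as follows. This is only an illustration under stated assumptions: times are measured in seconds from the start of the red period, the function name `code_observation` and its arguments are hypothetical, and a crossing that starts exactly when the signal turns green is treated as a normal (censored) crossing.

```python
def code_observation(arrival, departure, red_length):
    """Code one cyclist as (waiting_duration, event) or None.

    arrival, departure : seconds since the start of the red period
    red_length         : length of the red period in seconds
    event = 1 -> uncensored (crossed during red, i.e. a violation)
    event = 0 -> censored   (waited until green, endurance not observed)
    """
    if departure < arrival:
        raise ValueError("departure must not precede arrival")
    if not 0 <= arrival < red_length:
        return None                       # arrived outside red: not a valid sample
    duration = departure - arrival        # observed waiting duration
    event = 1 if departure < red_length else 0
    return duration, event

# A 60-second red period (made-up value).
print(code_observation(10.0, 25.0, 60.0))   # violation after a 15 s wait
print(code_observation(10.0, 62.0, 60.0))   # normal crossing, censored at 52 s
print(code_observation(70.0, 80.0, 60.0))   # arrived on green: excluded
```

For a censored record, the stored duration is only a lower bound on how long the cyclist would have been willing to wait, which is why the duration model treats it differently from an observed violation.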

Descriptive Statistics
Of the 459 valid observations, 295 cyclists (64.27%) violated the traffic regulations. The average waiting time of all samples was 25.16 seconds, with a standard deviation of 27.13 seconds. The average waiting time of the violating crossings was 15.71 seconds, while the average waiting time of the normal crossings was 43.14 seconds. The maximum waiting duration was 116 seconds, while the minimum was 0 seconds; the latter means that the cyclist crossed without any wait. These descriptive statistics cannot reflect the exact waiting behavior because they ignore the censoring of the data. The estimation of the waiting duration with censored data will be discussed later.

Empirical Results
The results are discussed in two subsections. The overall results are presented in the first subsection, including model fit statistics and survival probability estimation. The second subsection presents the effects of covariates. The significance level of each covariate suggests that the importance of each covariate should be interpreted carefully.

Overall Results
(2) Survival probability: Figure 1 gives the survival probability calculated by the duration model, which represents the probability that a cyclist is still complying with the traffic rules while waiting at the signalized intersection. The estimated survival probability declines with elapsed waiting duration, and the curve can be divided into three parts according to its gradient. First, a sharp decline at short durations indicates that a number of cyclists cross in violation with almost no delay. In particular, about 32 percent of cyclists can be defined as risk takers since they show a high inclination to violate and very low waiting endurance (<3 seconds). Then, the probability decreases smoothly from 3 seconds to 95 seconds. This steady reduction reflects that the cumulative number of cyclist violations increases continuously. The declining trend of the survival probability indicates that the red-light running behavior of most cyclists is time dependent: cyclists become prone to end the wait and violate the traffic rules as the waiting duration elapses. Note that about half of the observed cyclists cannot endure waits of 29 seconds or longer. Finally, about 15 percent of cyclists keep waiting beyond 95 seconds; they are generally non-risk takers.
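The time points quoted above are quantiles of the survival curve. As a rough sketch, the helper below reads off the first listed time at which a survival curve has dropped to a given level. The four points are coarse values taken from the percentages in the text (about 32% of waits ending within 3 seconds, about half by 29 seconds, about 15% of cyclists still waiting at 95 seconds), not the full curve of Figure 1, and the function name is hypothetical.

```python
import numpy as np

def time_when_survival_reaches(times, surv, level):
    """Smallest listed time at which S(t) <= level, or None if never reached."""
    times = np.asarray(times, dtype=float)
    surv = np.asarray(surv, dtype=float)
    hit = np.flatnonzero(surv <= level)   # indices where the curve is at or below level
    return float(times[hit[0]]) if hit.size else None

# Coarse approximation of the estimated curve, read from the text only.
times = [0.0, 3.0, 29.0, 95.0]
surv = [1.00, 0.68, 0.50, 0.15]

median_wait = time_when_survival_reaches(times, surv, 0.50)
print(median_wait)
```

Applied to a finely sampled curve, the same lookup would give the median waiting endurance and any other quantile of interest.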
(3) Figure 1 also compares the estimated survival probability with the observed survival probability. Here, the estimated results are calculated by the Cox proportional hazard model, while the observed results are calculated by a nonparametric approach in which the covariate effects are not considered. A detailed discussion of the nonparametric approach can be found in Lee and Wang [16]. The results show that there are some differences between the two curves, although their general shape is the same. Specifically, compared with the estimated results, the observed survival probability is smaller until about 24.0 s and larger thereafter. This difference is expected to be due, at least partly, to the covariate effects. The observed results indicate the waiting time under the specific conditions of each individual sample, while the estimated results indicate the waiting time under the average condition of all the samples. The estimated survival probability reflects the characteristics of the waiting time for a sample that takes the average value of every variable. Any change in the variables could influence the estimated results. The effects of the variables are discussed in the next subsection.
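For readers who want to reproduce an observed curve of this kind, a common nonparametric choice is the Kaplan-Meier (product-limit) estimator. The paper does not name the estimator it used, so treating it as Kaplan-Meier is an assumption here, and the toy data are made up.

```python
import numpy as np

def kaplan_meier(times, events):
    """Product-limit estimate of S(t) at each distinct uncensored time.

    times  : waiting durations
    events : 1 if uncensored (violation), 0 if censored (normal crossing)
    Returns (event_times, survival_probabilities).
    """
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=int)
    event_times = np.unique(times[events == 1])
    surv = []
    s = 1.0
    for t in event_times:
        n_at_risk = np.sum(times >= t)                 # still waiting just before t
        d = np.sum((times == t) & (events == 1))       # violations exactly at t
        s *= 1.0 - d / n_at_risk
        surv.append(s)
    return event_times, np.array(surv)

# Five hypothetical cyclists; the waits of 5 s (second record) and 9 s are censored.
km_t, km_s = kaplan_meier([2, 5, 5, 9, 14], [1, 1, 0, 0, 1])
print(km_t, km_s)
```

Censored waits never trigger a drop in the curve, but they do count in the risk set until they end, which is how the estimator uses the partial information they carry.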

Analysis of Covariate Effects
According to (2.5), the effects of the explanatory variables can be interpreted from the signs of the coefficients in a rather straightforward fashion. If a coefficient is negative, an increase in the corresponding variable decreases the hazard rate or, equivalently, increases the waiting duration. With regard to the magnitude of the variable effects, when a variable changes by one unit, the hazard changes by $[\exp(\beta) - 1] \times 100\%$.
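The percentage interpretation of a coefficient can be checked with a short calculation. The coefficient values below are illustrative assumptions, not estimates from Table 2.

```python
import math

def hazard_pct_change(beta):
    """Percentage change in the hazard for a one-unit increase in a covariate."""
    return (math.exp(beta) - 1.0) * 100.0

# Illustrative coefficients only.
for b in (-0.05, 0.0, math.log(2.0)):
    print(f"beta = {b:+.3f} -> hazard changes by {hazard_pct_change(b):+.1f}%")
```

A negative coefficient gives a negative percentage (lower hazard, longer expected wait), a zero coefficient leaves the hazard unchanged, and a coefficient of ln 2 doubles it.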
To assess the effects of the explanatory variables on the duration time, a function of the hazard ratio (HR) can be obtained by dividing both sides of (2.5) by $h_0(t)$ and taking the logarithm, which yields

$$\ln \frac{h_i(t)}{h_0(t)} = \beta_1 x_{1i} + \beta_2 x_{2i} + \cdots + \beta_p x_{pi}, \tag{4.1}$$

where the $x$'s are the covariates for the $i$th cyclist and $\beta = (\beta_1, \beta_2, \ldots, \beta_p)$ is the vector of coefficients estimated with the Cox proportional hazard model. The left side of (4.1) is a function of the hazard ratio (HR), and the right side is a linear function of the covariates and their respective coefficients. The HR expresses the hazard under the covariate effects as a multiple of the hazard when all covariates are ignored.
If the covariates are standardized about the mean and the model used is

$$\ln \frac{h_i(t)}{h_0(t)} = \beta_1 (x_{1i} - \bar{x}_1) + \beta_2 (x_{2i} - \bar{x}_2) + \cdots + \beta_p (x_{pi} - \bar{x}_p), \tag{4.2}$$

where $\bar{x} = (\bar{x}_1, \bar{x}_2, \ldots, \bar{x}_p)$ and $\bar{x}_j$ is the average of the $j$th covariate over all cyclists, the left side of (4.2) is the logarithm of the relative hazard ratio (RHR). The RHR is the ratio of the hazard for a cyclist with a given set of covariate values to that for a cyclist who has the average value of every covariate. If the RHR is greater than one, the covariate values raise the hazard above that of the average cyclist, so the condition is unfavorable: the waiting time is shorter than the average level of the survey sample and a violation is more likely. Conversely, a favorable condition corresponds to an RHR below one, that is, a lower hazard. Therefore, a cyclist in a favorable condition would have a longer waiting time than one in an unfavorable condition.
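A minimal sketch of the two ratios follows, assuming the log-linear forms described above: HR compares a cyclist with the baseline (all covariates ignored), and RHR compares the same cyclist with a hypothetical cyclist who has the sample-average value of every covariate. The coefficients, sample means, and covariate values are made up for illustration.

```python
import numpy as np

def hazard_ratio(beta, x):
    """HR = exp(beta . x): hazard relative to the baseline hazard."""
    return float(np.exp(np.dot(beta, np.asarray(x, dtype=float))))

def relative_hazard_ratio(beta, x, x_mean):
    """RHR = exp(beta . (x - x_mean)): hazard relative to the average cyclist."""
    diff = np.asarray(x, dtype=float) - np.asarray(x_mean, dtype=float)
    return float(np.exp(np.dot(beta, diff)))

# Made-up values: a binary indicator and an age in years.
beta = np.array([0.4, -0.03])
x_mean = np.array([0.6, 35.0])       # sample averages (hypothetical)
x = np.array([1.0, 35.0])            # indicator set, average age

rhr = relative_hazard_ratio(beta, x, x_mean)
print(round(rhr, 4))
```

Since only the indicator differs from its mean in this example, the RHR reduces to `exp(0.4 * (1 - 0.6))`. Dividing `hazard_ratio(beta, x)` by `hazard_ratio(beta, x_mean)` gives the same number, which shows that the RHR is simply the HR re-expressed against the average cyclist.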
To analyze the effects of the covariates quantitatively, the relative hazard for each variable is calculated at favorable and unfavorable values of that variable, assuming that the other variables are at their average values. The favorable and unfavorable values of a variable are assigned on the basis of the hazard associated with each value: the value with the lower hazard is regarded as the favorable condition. Taking age as an example, older people are defined as the favorable condition since they have a lower violation risk than young people. The assumed conditions and the corresponding RHRs and HRs are shown in Table 3. The RHRs for the three continuous covariates are shown in Figure 2.
The effect of age (AG) indicates that older cyclists have longer waiting times. This is partly because older cyclists are more conscious of the risk of traffic violations. In addition, older cyclists' trips are seldom related to work or school, so they are not in a hurry. Note that this conclusion is a statistical result; traffic violations involving older cyclists also occur.
The effect of gender (GEN) indicates that male cyclists have shorter waiting times and a higher tendency to disobey the traffic rules. They are 1.38 times more likely than females to have shorter waiting times. Hamed reported that male pedestrians are 2.61 times more likely than females to have shorter waiting times [23], and qualitatively similar results were obtained by Tiwari et al. [24].
The effect of the covariate nonmotor vehicle type (NT) indicates that riders of electric bikes have shorter waiting times and a higher tendency to disobey the traffic rules. They are 1.60 times more likely than human-powered bicyclists to have shorter waiting times.
The waiting time of a cyclist increases with the number of other cyclists who are waiting for a green light when he/she arrives (WN; see Figure 2(a)). Conversely, the waiting time decreases with the number of other cyclists who are crossing against the red light when he/she arrives (CN; see Figure 2(b)). There are two reasons for this. First, many people may consider that the more people cross together, the safer they are; they take it for granted that drivers yield to a group of people more often than to one person. Second, conformity psychology works strongly in dense cyclist environments.
The effect of the covariate TC (twice crossing) shows that twice-crossing cyclists have a higher hazard and shorter waiting times. Cyclists who are apt to cross twice have little or no patience to wait at a red light. They are 2.35 times more likely than one-time crossing cyclists to have shorter waiting times.
The effect of the covariate MV (motor vehicle volume) indicates that heavy traffic increases the waiting time and decreases the risk of cyclist violations (see Figure 2(c)). This is because the larger the motor vehicle volume, the smaller the average time gap between successive cars.
The time of travel also has an impact on cyclists' red-light running behavior. Cyclists are at a high risk of traffic violation in the peak hour. Cyclists who travel in the peak hour are 1.44 times more likely than those who travel in the off-peak hour to end the waiting duration and cross illegally. In the peak hour, both cyclists and drivers are in a hurry to reach destinations related to work or school, so heavy mixed traffic with impatient cyclists and drivers can easily cause traffic accidents.

Conclusions
This paper applies a proportional hazard-based duration model to study cyclist crossing behavior at signalized intersections by using data acquired in Beijing, China. The crossing behavior is examined by modeling the duration between the arrival at the survey area and the start of crossing the intersection. If cyclists violate the traffic rules to cross the intersection, their waiting times are recorded as uncensored data, while the waiting durations of normal crossings are recorded as censored data.
The paper provides several important insights into the determinants of the regularity and frequency of cyclist crossing behavior, especially the relation between violation behavior and waiting duration. First, the results indicate that the crossing behavior of cyclists, as well as the risk of traffic violation, is time dependent. Cyclists' crossing behavior presents positive duration dependence, which also implies a "snowballing" effect: the longer the time that has elapsed since the start of the waiting duration, the more likely cyclists are to end the wait soon. Such positive duration dependence also indicates that a longer waiting time increases the risk of cyclist violation. Second, some crucial time points deserve attention: 3 seconds and 95 seconds. The 3-second point identifies cyclists who are at high risk of crossing in violation after a very short wait; they account for 32% of the sample in the study. The duration of 95 seconds reflects cyclists' endurance. About 15 percent of all cyclists still obey the traffic rules after waiting for 95 seconds. These people are generally non-risk takers. Third, human factors and the external environment play an important role in red-light running behavior. Various factors in the unfavorable condition could increase the risk of traffic violations, as well as of traffic accidents. Knowledge of the covariate effects can help to modify cyclists' crossing behavior. Specifically, rational traffic planning and design should fully consider cyclists' behavioral characteristics. More importantly, minimizing the effects of unfavorable conditions involving human factors may be an effective measure to obtain the conscious cooperation of cyclists and changes in their behavior. Finally, it is noted that, for different cities, the model should be estimated by using local field data. Additionally, the explanatory variables can be chosen flexibly according to the research aim and the traffic reality.
In terms of future work, more parameters under different situations should be taken into account. Next, some engineering solutions should be proposed to improve the safe crossing behavior of cyclists in the urban traffic environment. In addition, from the viewpoint of accident prevention for cyclists, the interaction between cyclists and motor vehicles could be analyzed on the basis of such crossing behavior. Findings from this paper may partly supplement the previous research that inspired it. It is also hoped that these findings may give a better understanding of cyclists' behavioral characteristics at signalized intersections and help to plan and design proper facilities for nonmotorized vehicles.

Figure 2: Relative hazard ratios for three continuous variables.

Table 1: Covariate selection and explanation.
In the LR statistic, $l(\beta_0)$ is the log-partial likelihood for the null model, in which all the regression coefficients are set to zero, and $l(\beta)$ is the log-partial likelihood at convergence with $p$ regression coefficients. The Cox proportional hazard model has been widely cited in the literature. For the estimation of $H_0(t)$ and other detailed discussion of this model, see Lee and Wang [16] and Bhat [17].

Table 2: Estimation in waiting duration model.

The estimation results are listed in Table 2. From the results, most of the included covariates are statistically significant at the 0.10 level of significance, which means that these covariates are significantly related to violation behavior. Only gender has a relatively low significance level. This is partly because the proportion of females in the sample (24.2%) is relatively low.

Table 3: Estimation of RHRs and HRs for assumed covariates.