Investigating Travel Flow Differences between Peak Hours with Spatial Model with Endogenous Weight Matrix Using Automatic Vehicle Identification Data



Introduction
The rapid urbanization process in China has brought many challenges to urban transportation networks, especially excessive travel flow at peak hours. It not only adds travel time cost but also lowers network efficiency. However, travel flow at peak hours is not always the same. People may commute to work in the morning peak, while they may travel for entertainment in the evening peak. The travel flow patterns at peak hours can therefore be totally different. It is thus worthwhile to study the origin-destination (OD) travel flow patterns at peak hours and to discover the key influential factors, which will provide important insights for traffic management and urban planning.
Many studies have been conducted on travel flow analysis in peak hours. However, few studies focus on the travel flow differences between peak hours. The travel flow pattern could be different during the morning peak and the evening peak, and it is important to capture such different travel patterns and find their influential factors. Jia et al. [1] pointed out that A.M. peak urban traffic could be different from P.M. peak traffic, although the P.M. peak was usually regarded as the mirror image of the A.M. peak. Fosgerau and Fukuda [2] confirmed a heteroskedasticity phenomenon: traffic conditions in the peak periods are more variable than those in the regular periods. Ni et al. [3] noted that travel flow features cannot be captured by merely integrating the A.M. and P.M. travel flows. Urban travel flow has different patterns in morning and evening peak hours due to different travel purposes such as commuting or entertainment. For example, Hu et al. [4] used taxi GPS trajectory data to analyze the coupling relationship between regional taxi demand and social development by building a coupling coordination degree model (CCDM). Results showed that in the morning rush, taxi orders flow from residential areas to offices; the reverse applies during the evening rush.
Despite these studies, few have considered spatial correlations in travel flow analysis. Spatial correlations exist among travelers' mode choice, daily traffic, and peak hour travel and have been confirmed by previous studies [5][6][7]. Ni et al. [3] analyzed urban travel flow by developing a spatial autocorrelation model based on mobile phone data in Hangzhou, China. They concluded that a model ignoring spatial autocorrelation tends to underestimate the impacts of influential factors on travel flows. Chu et al. [8] developed a multiscale convolutional long short-term memory network (MultiConvLSTM) deep learning model, which considered both temporal and spatial correlations to predict future travel demand and OD flows. Even fewer studies have considered endogeneity in travel flow analysis. Endogeneity exists among traffic participants and has been studied by many researchers [9][10][11]. Traditional spatial models, however, assume an exogenous weight matrix, which may lead to biased estimation or even false conclusions. Therefore, it is critical to consider both spatial correlations and endogeneity in urban travel flow analysis to better capture travel flow patterns and quantify the impacts of key factors.
To the best of the authors' knowledge, few existing studies have systematically investigated OD travel flow differences between peak hours while considering spatial correlations, let alone endogeneity, owing to its complexity. This paper addresses the endogeneity issue by using a spatial autoregressive binary probit model with endogenous weight matrix (SARBP-EWM) to identify such endogeneity and quantify the impacts of key factors, based on automatic vehicle identification (AVI) data collected from Xuancheng, China.

Urban Travel Flow Analysis.
Many studies have been conducted on urban travel flow analysis. Various land use variables and facility variables were found to have an influence on urban travel flow. Tsai et al. [12] investigated the relationship between public transport demand and land use characteristics in the Sydney Greater Metropolitan Area using a geographically weighted regression (GWR) approach. Results revealed that the impacts of land use characteristics on public transport demand vary spatially. Zhou and Wang [13] developed a structural equation model (SEM) to investigate the relation between online shopping, shopping trips, and other influential factors. They found that residents' location in the urban center is strongly associated with their propensity to shop online, which leads to an increase in shopping trips. Wu et al. [14] proposed a novel spatiotemporal random effects (STRE) model to predict urban travel flow using data from loop detectors. Zheng and Liu [15] used connected vehicle (CV) trajectory and signal status data to estimate traffic volumes at signalized intersections. Li et al. [16] built a deep feature fusion model to predict space-mean-speed using heterogeneous data. Samara et al. [17] proposed a novel approach for estimating vehicle travel time distribution using copula-based discrete convolution.
Although numerous studies have been conducted on urban travel flow analysis, few have focused on travel flow at peak hours. Kumar and Vanajakshi [18] performed short-term traffic flow prediction by establishing a seasonal autoregressive integrated moving average (ARIMA) model with limited input data. Results indicated that travel flows in the morning peak were larger than those in the evening peak on three consecutive days. Yang and Qian [19] acknowledged correlations of travel time between the morning peak and the afternoon peak; morning peak travel time was used when predicting afternoon travel time. Shen et al. [20] investigated car travel demand in Hangzhou by establishing a geographically and temporally weighted regression (GTWR) model. The results confirmed peak phenomena and found that the influence of built environment and household properties on car travel demand varies with space and time. In general, few studies have been conducted on travel flow at or between peak hours. Further studies are needed to investigate travel flow patterns at peak hours.

Spatial Effects in Travel Flow Analysis.
Spatial effects have major impacts on urban travel flow. Many studies have been conducted to consider spatial effects in travel flow analysis. Bhat and Zhao [21] performed spatial analysis of activity stop generation by establishing a multilevel mixed logit model to address the spatial heterogeneity across TAZs. LeSage and Pace [22] proposed a spatial weight structure including origin dependence, destination dependence, and OD dependence in the standard spatial autoregressive models for analyzing flow patterns. LeSage and Thomas-Agnan [23] further provided expressions for calculating partial derivatives of the above model to quantify spatial dependence between OD flows. The model was then applied to analyze commuting flows of 60 regions in Toulouse, France. Kerkman et al. [24] analyzed spatial dependence among public transit passenger flows in an urban region in the Netherlands by developing five distinct spatial interaction models (SIMs). Results indicated that spatial autocorrelation effects cannot be neglected in travel flow analysis. Zheng and Geroliminis [25] formulated an optimization framework of equitable congestion pricing schemes for multimodal networks with heterogeneous population. The results justified the need for a value-of-time (VOT)-based pricing among groups with different behaviors and cost savings.
Despite numerous studies on spatial effects, few have considered endogeneity in travel flow analysis. Endogeneity occurs when one's travel behavior and the whole network affect each other, which will lead to biased estimation or even false conclusions if ignored. Mokhtarian and Cao [6] reviewed major methods for addressing the residential self-selection issue in travel behavior. Schatzmann et al. [26] applied a spatial autoregressive model to study OD public transportation commuting flows between municipalities in Switzerland. They used an instrumental variable approach to account for endogeneity and showed that income differences are underestimated in the gravity and spatial models if assumed exogenous. Guevara et al. [27] proposed the multiple indicator solution (MIS) method in a stated preference (SP) experiment to correct for endogeneity due to omitted crowding in public transport choice. Results suggest that an endogeneity issue may arise if indicators are only weakly correlated with the omitted attribute. Guerrero et al. [28] proposed a control function updated (CFU) method to correct the endogeneity issue in transport modeling. The results indicated that the new CFU approach showed statistically significant improvements over the classical approach in all scenarios tested.

AVI Data in Travel Flow Analysis.
With the rapid development of telecommunications technology, new types of data with spatial and temporal information, such as call detail record (CDR) data, location-based social media (LBSM) data, and taxi GPS data, have emerged and received growing attention in recent years [29][30][31][32]. Besides these data sources, AVI data are also considered an emerging data source. The AVI system provides rich information including detector number, license plate, vehicle passing time, and time stamp, which can be used to reconstruct vehicle trajectories and obtain OD flow information. Ahmed and Abdel-Aty [33] used AVI data for real-time crash prediction research. The results showed that the likelihood of a crash is statistically related to speed data obtained from AVI segments. Zhan et al. [34] proposed a queue length estimation model using license plate recognition (LPR) data, which provided an efficient queue length estimation at the lane level in real time. Zheng et al. [35] used data from automated number plate recognition (ANPR) cameras to study travel time reliability. Fekih et al. [36] proposed a framework to extract dynamic trip flows and travel demand patterns from cellular signaling data to estimate aggregate trips by time of day.
As for travel flow studies using AVI data, Chen et al. [37] proposed a copula-based approach to model arterial travel time distribution (TTD), which was examined with AVI data and next generation simulation (NGSIM) trajectory data. The results demonstrated the advantage of the proposed copula-based approach. Zhao et al. [38] collected 24 h AVI data from Wuhan, China, to investigate weekly travel patterns of private vehicles and identified four types of commuters. The results revealed six variations of travel demand on weekdays and weekends. Huang et al. [39] proposed a semisupervised deep learning based model that combines both AVI and smartphone trajectory data during training. The model can provide OD estimation and prediction services over larger spatial areas beyond the limited spatial coverage of AVI data. Cao et al. [40] proposed a new method to recover day-to-day dynamic OD flows using both connected vehicle (CV) trajectories and AVI observations. The results indicated that the proposed method requires very few AVI detectors and CV trajectories to achieve competitive estimation performance against two benchmark models.

Data Description
Two major data sources are used in this paper. First, the OD flow data of morning and evening peaks are extracted from AVI data gathered from Xuancheng City, Anhui Province, China. The AVI system automatically identifies license plates when vehicles leave the stop line and stores the related information as AVI data. Each AVI record mainly includes detector number, license plate, vehicle passing time, and time stamp. The vehicle trajectory can be reconstructed from this information. The OD flow data, which contain license plate number, departure time, arrival time, and OD pair, are then generated from the trajectory data. Following this process, we extracted the OD flow data between 7:30 A.M. and 10:30 A.M. in the morning peak and between 5:30 P.M. and 8:30 P.M. in the evening peak for four consecutive weeks from September 2 to 29, 2019. The original OD pairs recorded in the AVI database are stored in the form of roads. To facilitate this study, each OD road was further identified and matched to the corresponding TAZ. Finally, we take the average traffic volume of the 20 workdays and the 8 weekend days, respectively, to obtain the average travel flow of each TAZ.
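The extraction of OD trips from AVI records described above can be sketched as follows. This is a minimal illustration rather than the procedure used in the paper: the record layout, the detector identifiers, and the 30-minute gap used to split consecutive detections into separate trips are all assumptions introduced here.

```python
from collections import defaultdict

# Hypothetical AVI records: (license_plate, detector_id, passing_time_in_seconds).
records = [
    ("A123", "D1", 27000),
    ("A123", "D4", 27300),
    ("A123", "D7", 27720),
    ("B456", "D2", 27100),
    ("B456", "D3", 27400),
    ("A123", "D7", 64000),
    ("A123", "D1", 64900),
]

MAX_GAP_S = 1800  # assumed cut-off: a longer gap between detections starts a new trip


def extract_od_trips(records, max_gap_s=MAX_GAP_S):
    """Group detections by plate, order them in time, and split them into trips."""
    by_plate = defaultdict(list)
    for plate, detector, t in records:
        by_plate[plate].append((t, detector))

    trips = []
    for plate, hits in by_plate.items():
        hits.sort()
        start = 0
        for k in range(1, len(hits) + 1):
            # Close the current trip at the end of the data or at a long gap.
            if k == len(hits) or hits[k][0] - hits[k - 1][0] > max_gap_s:
                if k - start >= 2:  # a trip needs at least two detections
                    t0, origin = hits[start]
                    t1, dest = hits[k - 1]
                    trips.append((plate, origin, dest, t0, t1))
                start = k
    return trips


trips = extract_od_trips(records)
```

Each tuple in `trips` holds the plate, the first and last detector of the trip, and the departure and arrival times; in practice the detectors would still have to be matched to TAZs as described in the text.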
Second, key urban facilities such as offices, supermarkets, schools, hospitals, hotels, sports centers, and bus stations have a major influence on residents' mobility. In this paper, the points of interest (POIs) of these key facilities were obtained from the AutoNavi Map [41] by using web crawler technology. The whole Xuancheng area is divided into 32 TAZs. The above travel flow data and facility data are integrated for each TAZ.

Dependent Variable.
The dependent variable needs to represent the travel flow differences between morning and evening peaks. First, the total travel flow for each of the 32 TAZs is calculated. As stated above, the average OD travel flows between TAZ i and TAZ j are derived from the AVI database. The total outgoing travel flow from TAZ i is the sum of the outgoing travel flows $q_{ij}^{out}$ with origin TAZ i. Similarly, the total incoming travel flow to TAZ i is the sum of the incoming travel flows $q_{ji}^{in}$ with destination TAZ i. Therefore, the total travel flow to and from TAZ i is calculated as follows:
$$Q_i = \sum_{j} q_{ij}^{out} + \sum_{j} q_{ji}^{in}. \qquad (1)$$
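As a small worked illustration of the zonal total described above, the sketch below adds the row sum (outgoing flow) and the column sum (incoming flow) of an OD matrix for each zone. The 3 × 3 matrix is made-up data, and the diagonal is assumed to be zero so that intrazonal trips are not counted twice.

```python
def total_flows(od):
    """Return, for every zone i, outgoing flow (row sum) plus incoming flow (column sum)."""
    n = len(od)
    outgoing = [sum(od[i]) for i in range(n)]
    incoming = [sum(od[j][i] for j in range(n)) for i in range(n)]
    return [o + d for o, d in zip(outgoing, incoming)]


# Illustrative morning-peak OD matrix: od[i][j] is the flow from zone i to zone j.
od_morning = [
    [0, 120, 30],
    [80, 0, 45],
    [20, 60, 0],
]

Q_morning = total_flows(od_morning)
print(Q_morning)  # [250, 305, 155]
```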

Journal of Advanced Transportation
The absolute travel flow difference alone is sensitive to various factors. For example, travel flow in TAZs with larger areas may vary to a large extent, while travel flow in TAZs with smaller areas may change only slightly. Therefore, a relative travel flow fluctuation is derived to measure the travel flow differences between peak hours: the difference between morning and evening peak travel flows is divided by the evening peak travel flow. The travel flow fluctuation is calculated as follows:
$$F_i = \frac{Q_i^{m} - Q_i^{e}}{Q_i^{e}}, \qquad (2)$$
where $F_i$ represents the travel flow fluctuation between morning peak and evening peak for TAZ i, and $Q_i^{m}$ and $Q_i^{e}$ represent the travel flow of TAZ i in the morning peak and evening peak, respectively, each calculated using (1) by adding the inbound and outbound traffic of TAZ i in the corresponding peak.
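The relative fluctuation described above, the morning-minus-evening difference divided by the evening flow, can be sketched as follows. The flow values are illustrative only.

```python
def flow_fluctuation(q_morning, q_evening):
    """Relative difference between morning and evening flows, scaled by the evening flow."""
    return [(qm - qe) / qe for qm, qe in zip(q_morning, q_evening)]


# Illustrative zonal totals for three TAZs (not data from the paper).
Q_m = [250.0, 305.0, 155.0]
Q_e = [200.0, 320.0, 100.0]

F = flow_fluctuation(Q_m, Q_e)
print(F)  # [0.25, -0.046875, 0.55]
```

A positive value means the morning peak carries more traffic than the evening peak; dividing by the evening flow makes zones of different size comparable.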
Since travel patterns may differ between weekdays and weekends, the average weekday (20 workdays) travel flow fluctuation and the average weekend (8 weekend days) travel flow fluctuation are calculated separately. The travel flow fluctuation distributions for the 32 TAZs on weekdays and weekends are illustrated in Figures 1 and 2. In addition, the cumulative distributions of average weekday and weekend travel flow fluctuations are calculated and presented in Figure 3 for further analysis.
Generally speaking, TAZs in suburban areas have larger fluctuations, which indicates significant travel flow differences between morning and evening peaks, while downtown areas have relatively smaller fluctuations. TAZs with negative fluctuations are mainly located in central city areas.
The cumulative distributions in Figure 3 show that both weekday and weekend fluctuations vary greatly, from below −20% to above 60% in general. This indicates that travel flow fluctuations between morning peak and evening peak vary to a large extent among TAZs. According to Figure 3, fewer than 10% of fluctuations are negative, so most travel flow fluctuations are positive, suggesting that morning peak travel flow exceeds evening peak travel flow in most TAZs. Besides, weekday fluctuations are in general slightly larger than weekend fluctuations for a given cumulative percentage.
To determine whether there is a significant difference between morning and evening peak travel flows, a threshold needs to be determined. Since weekday and weekend travel flows have different fluctuation levels and may follow different travel patterns, it is not appropriate to set a unified fluctuation level for them. The fluctuation threshold is therefore determined separately for weekday and weekend to achieve balanced fluctuation quantiles and eliminate random errors. As shown in Figure 3, weekday fluctuations vary from −40% to 60%. To maintain a balanced distribution of fluctuations, a 30% fluctuation threshold is set for weekdays, which corresponds to roughly the 50% level of the cumulative distribution. That is, negative fluctuations and fluctuations below 30% are considered insignificant and account for about 50% of TAZs, while fluctuations above 30% are considered a significant difference between morning and evening peaks and account for the other 50% or so. Similarly, a 20% fluctuation threshold is set for weekend fluctuations, which also corresponds to about 50%, so that fluctuations smaller and larger than the threshold are evenly distributed.
Based on the thresholds above, a binary dependent variable is defined to indicate whether morning peak travel flow is significantly greater than evening peak travel flow. The binary dependent variable is defined in equation (3), where $Y_i^*$ is the binary dependent variable indicating whether morning peak travel flow is significantly greater than evening peak travel flow for TAZ i, and α is the fluctuation threshold. As discussed above, α is set to 30% for weekday fluctuations and 20% for weekend fluctuations. The dependent variable takes a value of 1 if the absolute value of the travel flow fluctuation is larger than the threshold and a value of 0 otherwise. Dependent variables for both weekday and weekend travel flow differences are generated using equation (3). The spatial distributions of the dependent variable for weekday and weekend are illustrated in Figures 4 and 5. The percentages of the dependent variable taking values of 1 and 0 for weekday and weekend are calculated and summarized in Table 1.
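The thresholding step can be sketched as below. The text describes the rule both in terms of the signed fluctuation (negative values and values below the threshold are insignificant) and in terms of its absolute value, so the hypothetical `use_absolute` flag exposes both readings; the signed version is used as the default here purely as an assumption. The fluctuation values are illustrative.

```python
def significant_difference(fluctuations, alpha, use_absolute=False):
    """Return 1 where the fluctuation exceeds the threshold alpha, else 0.

    use_absolute=False compares the signed fluctuation with alpha;
    use_absolute=True compares its absolute value with alpha.
    """
    indicators = []
    for f in fluctuations:
        value = abs(f) if use_absolute else f
        indicators.append(1 if value > alpha else 0)
    return indicators


# Illustrative weekday fluctuations for five TAZs, with the 30% weekday threshold.
F_weekday = [0.45, 0.10, -0.35, 0.31, 0.29]

y_signed = significant_difference(F_weekday, alpha=0.30)
y_absolute = significant_difference(F_weekday, alpha=0.30, use_absolute=True)
share_significant = sum(y_signed) / len(y_signed)

print(y_signed)    # [1, 0, 0, 1, 0]
print(y_absolute)  # [1, 0, 1, 1, 0]
```

The two readings differ only for zones whose evening flow exceeds the morning flow by more than the threshold, such as the third zone in this example.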
According to Figures 4 and 5, most TAZs with significant travel flow differences are suburban TAZs on both weekday and weekend. For travel flow differences on weekdays, 16 out of 32 TAZs take the value 1 according to Table 1, accounting for 50.0% of all 32 TAZs. These TAZs with significant travel flow differences are mostly located in suburban areas; only a few are located in downtown areas. Similar travel patterns are found for weekend travel, as shown in Figure 5, where 19 TAZs show significant travel flow differences in suburban areas, accounting for 59.4% of all 32 TAZs. The remaining 13 TAZs, with insignificant travel flow differences, are all located in downtown areas and account for 40.6% of all 32 TAZs. For weekend travel, TAZ 2, TAZ 9, and TAZ 14 change from insignificant on weekdays to significant, while the rest remain the same on both weekday and weekend.
The spatial distributions above show that travel flows differ more in suburban areas than in central areas on both weekday and weekend. Such phenomena have been found and investigated by previous studies [16,18,20,42]. It is possible that central areas have more regular travel patterns due to commuting travel, while suburban areas have more random travel patterns. For Xuancheng, this phenomenon is uniform across many TAZs. This paper reveals the mechanism of this phenomenon and quantifies the influence of potential factors.
To examine the spatial effect of travel flow differences among TAZs, Moran's I of the dependent variable is calculated as 0.025 with a p value of 0.027. This indicates a significant positive spatial correlation in the dependent variable; that is, TAZs with significant travel flow differences tend to cluster. Therefore, spatial correlations should be taken into account when exploring the influential factors of travel flow differences.
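For readers unfamiliar with the statistic, a minimal sketch of global Moran's I is given below. The text does not state which spatial weights were used, so the four-zone chain adjacency here is purely an assumption for illustration, and the significance test behind the reported p value (for example, a permutation test) is not reproduced.

```python
def morans_i(values, weights):
    """Global Moran's I: (n / S0) * sum_ij w_ij z_i z_j / sum_i z_i^2, with z = x - mean."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)  # sum of all weights
    cross = sum(weights[i][j] * z[i] * z[j] for i in range(n) for j in range(n))
    squares = sum(zi * zi for zi in z)
    return (n / s0) * cross / squares


# Assumed binary adjacency for four zones arranged in a line: 0 - 1 - 2 - 3.
W = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

i_clustered = morans_i([1, 1, 0, 0], W)    # similar values sit next to each other
i_alternating = morans_i([1, 0, 1, 0], W)  # neighbours always differ
```

With these inputs the clustered pattern gives a positive value (1/3) and the alternating pattern a negative one (−1), which is the sense in which a positive Moran's I signals clustering of similar zones.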

Explanatory Variables.
Explanatory variables based on the current dataset include public facilities such as offices, supermarkets, middle schools, clinics, inns, and sports centers, which can have major impacts on travel flow. To avoid multicollinearity, we choose only one variable for each type of facility. Also, only variables used in the final model are presented. The model could be extended by incorporating new types of variables that provide meaningful results. For these facilities, this paper uses point of interest (POI) data from an online map service to obtain the number of facilities in each TAZ [41]. The distribution of these facilities is shown in Figure 6.

Offices are major sources of travel flow, as people need to go to work during the morning peak and go back home in the evening peak. The commuting travel flow induced by office buildings makes up a large portion of total travel demand and may cause various issues such as traffic congestion [43]. The spatial distribution of offices is illustrated in Figure 6(a), where offices are found to cluster densely in city center areas while there are fewer offices in suburban areas. This is contrary to the spatial distributions of travel flow differences in Figures 4 and 5, where city center areas have no significant travel flow differences and suburban areas have major travel flow differences. For example, TAZs in the city center area such as TAZs 1, 20, and 27 have a higher number of offices but correspond to insignificant travel flow differences.
Supermarkets are major sources of shopping activities, attracting numerous people who travel to buy daily necessities. Supermarkets are very popular in China, as they are located in almost every community and play major roles in providing daily services for local residents. Researchers have studied the impacts of supermarkets on travel flows [44,45]. Therefore, the number of supermarkets in each TAZ is considered in this paper. According to Figure 6(b), TAZs in outskirt areas have a higher number of supermarkets, which is consistent with the travel flow difference patterns shown in Figure 4. For example, outskirt TAZs such as TAZ 25 and TAZ 26 have a relatively higher number of supermarkets and correspond to significant travel flow differences according to Figure 4, while TAZ 2, TAZ 3, and TAZ 6 have a smaller number of supermarkets and correspond to insignificant travel flow differences.
Schools are considered another major source of travel flow. Students go to school in the morning and go back home in the evening. Students themselves and accompanying parents generate large travel demand. Therefore, schools cannot be neglected in this study. Ikeda et al. [46] discovered that schools can generate active school travel under certain built environment conditions. The spatial distribution of middle schools is presented in Figure 6(c), where suburban TAZs have a higher number of middle schools than urban TAZs. Such spatial distributions are consistent with the travel flow difference distributions, in which outskirt areas are significant.
For medical facilities, clinics are considered another source of travel flow. Clinics now exist in almost every community in China and provide necessary medical services for local residents. Together with high-grade hospitals, they form a medical system that provides services at all levels. Cheng et al. [47] analyzed the spatial correlation of residents' accessibility to these medical facilities and confirmed a spatial imbalance among these hospitals. Similarly, the spatial distribution of clinics is shown in Figure 6(d), where suburban TAZs, especially TAZs in east areas such as TAZ 20, TAZ 21, TAZ 22, and TAZ 25, have a higher number of clinics than urban TAZs.
Besides, recreation facilities such as inns and sports centers also generate substantial travel demand. People visiting Xuancheng for business or tourism would choose to stay in local inns. The travel flow generated by such recreation facilities has been studied in previous research. Schirpke et al. [48] mapped recreation flows in the Alpine Space area and found significant differences in spatial patterns between mountain areas and lowlands. The spatial distribution of inns is illustrated in Figure 6(e), where suburban TAZs have a relatively higher number of inns than central areas. Sports centers show similar patterns according to Figure 6(f), where suburban TAZs in east and south areas have a higher number of sports centers. People travel to sports centers in their leisure time for relaxation, which generates potential travel demand.

Endogenous Variable.
To account for endogeneity, an endogenous variable needs to be defined to represent the interdependency structure among TAZs. This paper chooses population density as the endogenous variable. Population density is a necessary condition for regional development and for social and economic activities, and it represents the urbanization level of TAZs. The population density differences between TAZs can indicate differences in their general urbanization level. The endogeneity of population density has also been studied by previous research. Zhao and Kaestner [49] addressed the possible endogeneity of population density by using a two-step instrumental variable approach to investigate the effects of urban sprawl on obesity. Therefore, population density is chosen as the endogenous variable in this paper. The population density distribution is illustrated in Figure 7.

Indicator Variables for Endogenous Variable.
The endogenous variable could be influenced by various indicator variables. Road density reflects regional characteristics and has potential influence on the endogenous variable. Therefore, road length is chosen to be the indicator variable for the endogenous variable. The spatial distribution of road length is illustrated in Figure 8.
According to Figure 8, urban TAZs in the city center clearly have higher road density, while TAZs in suburban areas have relatively lower road density. This is understandable, as urban areas tend to have more connected roads for travel and economic activities, while suburban TAZs have much larger areas and fewer roads, which leads to lower road density.
Another important explanatory variable for the endogenous variable is transit accessibility. Transit accessibility ensures necessary passenger travel and goods exchange, thus maintaining the normal operation of the city, and it has potential influence on the endogenous variable. In minor cities like Xuancheng, the major transit mode is bus, and bus stop density represents the availability of bus transit in TAZs. Therefore, bus stop density is chosen as an explanatory variable for the endogenous variable. The spatial distribution of bus stop density is illustrated in Figure 9.
As seen from Figure 9, TAZs in city center areas have higher bus stop density, while suburban TAZs have relatively lower bus stop density. In practice, city center areas have more frequent transit travel demand, which leads to higher bus stop density in downtown areas, while suburban areas have fewer bus stops and larger areas and thus lower bus stop density.

Travel Time.
The travel impedance between TAZ pairs is also defined to indicate spatial correlations between TAZ pairs. To reflect travel impedance comprehensively, this paper uses the average travel time in the A.M. as the travel impedance between TAZ pairs. It is derived by averaging the travel times of individual vehicles between TAZ pairs, which are obtained from trajectories in the AVI database.
The average travel time between TAZ i and TAZ j is calculated as follows:
$$T_{ij} = \frac{1}{N_{ij}} \sum_{k=1}^{N_{ij}} t_{ij}^{k}, \qquad (4)$$
where $T_{ij}$ is the average travel time between TAZ i and TAZ j; $t_{ij}^{k}$ is the travel time of vehicle k between TAZ i and TAZ j; and $N_{ij}$ is the number of vehicles traveling from origin i to destination j. The statistics of the above explanatory variables are summarized in Table 2. There is no multicollinearity between explanatory variables, as the maximum variance inflation factor (VIF) is 4.44. The travel time has 1,024 observations since it is defined for each TAZ pair. All other explanatory variables are defined at the zonal level and have 32 observations.
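The averaging of vehicle travel times per TAZ pair can be sketched as follows. The trip tuples and the minute unit are illustrative assumptions; pairs without any observed vehicle simply have no entry in the result.

```python
from collections import defaultdict


def average_travel_times(trips):
    """Average the travel times of all vehicles observed for each (origin, destination) pair.

    trips: iterable of (origin_taz, destination_taz, travel_time_minutes).
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for origin, dest, minutes in trips:
        totals[(origin, dest)] += minutes
        counts[(origin, dest)] += 1
    return {pair: totals[pair] / counts[pair] for pair in totals}


# Illustrative vehicle trips between three TAZs (not data from the paper).
trips = [
    (1, 2, 12.0), (1, 2, 15.0), (1, 2, 9.0),
    (2, 1, 14.0),
    (3, 1, 20.0), (3, 1, 22.0),
]

T = average_travel_times(trips)
print(T[(1, 2)], T[(2, 1)], T[(3, 1)])  # 12.0 14.0 21.0
```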

Model Establishment.
A spatial autoregressive binary probit model with endogenous weight matrix (SARBP-EWM) is established in this paper to investigate the travel flow differences between peak hours. The SARBP-EWM model was first proposed by Zhou et al. [50] to account for the endogeneity problem in spatial econometrics. In applying the SARBP-EWM model, the time subscript t in the original model is set to 1, since this paper only focuses on average daily travel flow. Besides, the geographic fixed effect M and state fixed effect N are held constant. Therefore, the SARBP-EWM model is reduced as follows:
$$Y^* = \rho W Y^* + X\beta + E, \qquad (5)$$
where ρ is the spatial autoregressive coefficient and the dependent variable $Y^* = (y_1^*, y_2^*, \ldots, y_n^*)'$ is an n × 1 vector of binary variables indicating the travel flow differences (Y_weekday and Y_weekend) between morning peak and evening peak in TAZs. $y_i^* = 1$ if the travel flow fluctuation is larger than the threshold for TAZ i according to (3); $y_i^* = 0$ otherwise.
For other variables, X is an $n \times k_1$ matrix of explanatory variables. W is an n × n spatial weight matrix that represents the relative weights between all TAZ pairs. The weights are defined by the endogenous variable Z. β is the $k_1 \times 1$ vector of coefficients for the corresponding explanatory variables. As stated above, the explanatory variables include number of offices (office), number of supermarkets (supermarket), number of middle schools (mid_school), number of clinics (clinic), number of inns (inn), and number of sports centers (sports). $E = (\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_n)'$ is the n × 1 error term vector of (5).
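To illustrate how a spatial autoregressive term propagates effects between zones, the sketch below solves a generic relation of the form y = ρWy + r, where r stands for the combined contribution of the explanatory variables and the error term. This is a toy example under stated assumptions: the three-zone row-standardized W, the value ρ = 0.4, and the fixed-point solver are illustrative choices, not the estimation procedure of the paper.

```python
def matvec(matrix, vector):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, vector)) for row in matrix]


def solve_sar(W, rho, rhs, tol=1e-12, max_iter=10_000):
    """Solve y = rho * W y + rhs by fixed-point iteration.

    Converges when |rho| < 1 and W is row-standardized (rows sum to one).
    """
    y = list(rhs)
    for _ in range(max_iter):
        Wy = matvec(W, y)
        y_new = [rho * wy + r for wy, r in zip(Wy, rhs)]
        if max(abs(a - b) for a, b in zip(y_new, y)) < tol:
            return y_new
        y = y_new
    raise RuntimeError("fixed-point iteration did not converge")


# Assumed row-standardized weights: each zone weights its two neighbours equally.
W = [
    [0.0, 0.5, 0.5],
    [0.5, 0.0, 0.5],
    [0.5, 0.5, 0.0],
]
rho = 0.4
rhs = [1.0, 0.0, 0.0]  # a unit effect applied to zone 0 only

y = solve_sar(W, rho, rhs)
```

Although only zone 0 receives a direct effect, the solution is positive in all three zones (about 1.11, 0.28, and 0.28), which is the spillover that a model without the spatial term would miss.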

Definition of Weight Matrix.
The weight element $w_{ij}$ is the element in the i-th row and j-th column of the weight matrix W and indicates the relative weight between TAZ i and TAZ j. As noted in (5), W is defined as a function F(·) of the endogenous variable Z, as given in (6), where F(·) can take various forms such as a generalized Euclidean distance or a gravity model that considers key socioeconomic factors, and $z_i = (z_{i1}, z_{i2}, \ldots, z_{ip})$ is a 1 × p vector of TAZ i's demographic or economic characteristics. Common ways to define the weight element include using geographic distances, social networks [51], length of road [52], or various economic quantities among variables [53]. For example, Lee and Yu [54] defined $w_{ij}$ by combining the demographic and economic distance with geographical distance.
In this paper, we define the weight element by replacing the geographical distance with travel time; the resulting weight element $w_{ij}$ is given in (7). Since only population density (pop_den) is chosen as the endogenous variable, p = 1. The weight element $w_{ij}$ is therefore defined only by the travel time $T_{ij}$ and the difference in population density $z_i - z_j$ between TAZ i and TAZ j. $\gamma = (\gamma_0, \gamma_1, \gamma_2, \ldots, \gamma_p)'$ is a (p + 1) × 1 vector of estimable parameters corresponding to travel time and population density differences, respectively. $T_{ij}$ is the average travel time between TAZ i and TAZ j in (4).

Definition of Endogenous Variable.
The SARBP-EWM model has the distinct feature of incorporating an endogenous variable and its influential factors through an entry equation as follows:
$$Z = X\beta + \Delta. \qquad (8)$$
The endogenous variable $Z = (z_1', z_2', \ldots, z_n')'$ is an n × p matrix consisting of p endogenous variables representing the interdependency structure among TAZs, and $z_i = (z_{i1}, z_{i2}, \ldots, z_{ip})$ is a 1 × p vector indicating TAZ i's interdependency structure. The variable Z should have a major influence on the interdependency structure among TAZs while being correlated with the dependent variable $Y^*$. As stated above, population density (pop_den) is chosen as the endogenous variable, so p = 1.
The endogenous variable Z in (8) is influenced by its own explanatory variables X, an $n \times k_2$ matrix consisting of $k_2$ explanatory variables, and β is the corresponding $k_2 \times 1$ vector of parameters; both are distinct from the X and β of (5). In this paper, these explanatory variables include road length (road_length) and bus stop density (bus_den).
$\Delta = (\delta_1', \delta_2', \ldots, \delta_n')'$ is the $n \times p$ matrix of error terms of (8). Endogeneity occurs when the error terms $\mathrm{E}$ and $\Delta$ are correlated. Assuming that $\varepsilon_{it}$ and $\delta_{it}$ are i.i.d. and follow a joint normal distribution across all $i$'s and $j$'s with mean 0 and variance-covariance matrix $V$, the joint normal distribution can be written as (9) or, equivalently, as (10).
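For reference, the standard conditional decomposition of a jointly normal error pair can be sketched as follows. It is consistent with the definitions of $\eta$ and $\xi$ used in the model estimation, but the block partition of $V$ and the per-TAZ subscript $i$ are notational assumptions rather than a reproduction of the paper's equations (9) and (10).

```latex
% Sketch under assumed notation: V is partitioned into a scalar,
% a p x 1 covariance vector, and a p x p block; delta_i is 1 x p.
\begin{align}
\begin{pmatrix} \varepsilon_i \\ \delta_i' \end{pmatrix}
  &\sim N\left( 0,\; V \right), \qquad
  V = \begin{pmatrix}
        \sigma_\varepsilon^2 & \sigma_{\varepsilon\delta}' \\
        \sigma_{\varepsilon\delta} & V_\delta
      \end{pmatrix}, \\
\varepsilon_i &= \delta_i \eta + \xi_i, \qquad
  \eta = V_\delta^{-1} \sigma_{\varepsilon\delta}, \qquad
  \xi_i \sim N\left( 0, \sigma_\xi^2 \right), \\
\sigma_\xi^2 &= \sigma_\varepsilon^2
  - \sigma_{\varepsilon\delta}' V_\delta^{-1} \sigma_{\varepsilon\delta}.
\end{align}
```

In this form, $\xi_i$ is uncorrelated with $\delta_i$ by construction, so a nonzero $\sigma_{\varepsilon\delta}$ is exactly what carries the endogeneity into the outcome equation.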

Model Estimation.
The model is estimated with a Bayesian Markov chain Monte Carlo (MCMC) method. The MCMC method decomposes the parameter set of a complex model into a sequence of sublayers and addresses one parameter at a time [55]: each parameter is updated in turn, and the updated value is used in the subsequent sampling steps.
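The one-parameter-at-a-time updating described above can be illustrated with a toy Gibbs sampler. The sketch below is not the SARBP-EWM sampler: it estimates only the mean and variance of normally distributed data, and the priors, the function name `gibbs_normal`, and the default iteration counts are illustrative assumptions.

```python
import random
import statistics

def gibbs_normal(data, n_iter=2000, burn_in=1000, seed=0):
    """Toy Gibbs sampler for the mean and variance of normal data.

    Assumes a flat prior on mu and a 1/sigma^2 prior on the variance.
    """
    rng = random.Random(seed)
    n = len(data)
    ybar = statistics.fmean(data)
    mu, sigma2 = 0.0, 1.0              # arbitrary starting values
    draws = []
    for it in range(n_iter):
        # Step 1: update mu given the current sigma2 -> N(ybar, sigma2 / n)
        mu = rng.gauss(ybar, (sigma2 / n) ** 0.5)
        # Step 2: update sigma2 given the new mu -> inverse gamma(n/2, ss/2),
        # drawn as the reciprocal of a gamma variate with scale 2/ss
        ss = sum((y - mu) ** 2 for y in data)
        sigma2 = 1.0 / rng.gammavariate(n / 2.0, 2.0 / ss)
        if it >= burn_in:              # keep only post-burn-in draws
            draws.append((mu, sigma2))
    return draws
```

Each step conditions on the most recent value of the other parameter, which is the same mechanism the full model uses across its larger parameter set.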
Next, we substitute (10) into (5) to derive the likelihood function. In this function, $\eta = V_\delta^{-1}\sigma_{\varepsilon\delta}$ is a $p \times 1$ vector, and the error term $\xi \sim N(0, \sigma_\xi^2 I_n)$ is independent of $Y^*$ and follows the normal distribution with mean 0 and variance $\sigma_\xi^2$. Collecting the parameters of both the independent variables and the endogenous variable as $\beta^* = (\beta, \beta)$, the conditional likelihood function can be written in terms of $S_n(\rho, c) = I_n - \rho W(c)$ and $\mathrm{H}_n = (I_n - \rho W)Y^* - X\beta - (Z - X\beta)\eta$, in which the first $X\beta$ belongs to the outcome equation (5) and the second to the entry equation (8). Based on this likelihood function, the posterior distributions of $\rho$, $c$, $\beta^*$, and $V$ are derived and presented in Table 3.
A more detailed derivation of these posterior distributions can be found in Zhou's work [56].
The conditional posterior of the $i$th element $y_i^*$ of the dependent variable follows a truncated normal distribution, where $N_i(S_n^{-1}H_n, \sigma_\xi^2 (S_n' S_n)^{-1})$ refers to the $i$th element of the multivariate normal distribution $N(S_n^{-1}H_n, \sigma_\xi^2 (S_n' S_n)^{-1})$.
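A truncated normal draw of this kind can be sketched with the inverse-CDF method. The example assumes the usual probit convention that the observed outcome is 1 when the latent value is positive and 0 otherwise; the function name `draw_latent` and the scalar mean and standard deviation are illustrative, whereas the model itself draws from the multivariate distribution above.

```python
import random
from statistics import NormalDist

def draw_latent(y_obs, mean, sd, rng):
    """Draw y* from N(mean, sd^2) truncated to (0, inf) if y_obs == 1,
    and to (-inf, 0] otherwise (assumed probit convention)."""
    dist = NormalDist(mean, sd)
    p0 = dist.cdf(0.0)                    # probability mass below zero
    u = rng.random()
    if y_obs == 1:
        p = p0 + u * (1.0 - p0)           # uniform on [p0, 1)
    else:
        p = u * p0                        # uniform on [0, p0)
    # Keep inv_cdf away from 0 and 1; not robust in extreme tails.
    p = min(max(p, 1e-12), 1.0 - 1e-12)
    return dist.inv_cdf(p)
```

A simple inverse-CDF draw like this is adequate for illustration; when the truncation region lies far in the tail of the distribution, a dedicated truncated normal sampler would be more reliable.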

Results
To estimate parameter values, the SARBP-EWM model was run in MATLAB for 10,000 iterations. $\beta^*$ and $V$ are sampled using the MCMC method, while $\rho$ and $c$ are sampled using the MH method since they do not follow standard distributions. The first 6000 iterations are treated as burn-in to allow the parameters to gradually converge, and the last 4000 iterations are used to obtain coefficient estimates. In addition to the SARBP-EWM model, a binary probit model is estimated for comparison. The posterior distributions of all parameters of the SARBP-EWM model on weekdays are illustrated in Figure 10. The estimation results of the SARBP-EWM model for both weekdays and weekends, together with those of the binary probit model, are summarized in Table 4. The model is well estimated in general, with most parameters following their expected posterior distributions in Table 3. According to Figure 10, the posterior distributions of the facility variable coefficients $\beta$ and the transit accessibility variable coefficients $\beta$ appear normal, which agrees with their expected posterior distributions. The elements of the covariance matrix $V$ follow inverse Wishart distributions, which is consistent with their derived distributions in Table 3.
However, the spatial autocorrelation coefficient $\rho$ shows two major peaks with a mean value of around 0.10 according to Figure 10(a). The indicator variable coefficients $c_0$ and $c_1$ in Figures 10(b) and 10(c) show that the travel time (travel_time) coefficient $c_0$ has a mean value of around 0.08, while the population density (pop_den) coefficient $c_1$ has one major peak and a few local minima and maxima, which leads to a mean value of around 0.27. The distributions of $\rho$ and $c$ are nonstandard, as their derived posterior distributions in Table 3 do not follow standard distributions. The distributions of the coefficients $\beta$ and $\beta$, illustrated in Figures 10(d)-10(m), follow standard multivariate normal distributions. The elements of the covariance matrix $V$ are presented in Figures 10(n) and 10(o). The impacts of the variables are presented in Table 4 and discussed in detail as follows.
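The sampling scheme for nonstandard posteriors, 10,000 iterations with the first 6000 discarded as burn-in, can be sketched with a random-walk MH sampler. The bimodal target below is a hypothetical stand-in chosen to mimic a posterior with two peaks; it is not the derived posterior of $\rho$, and the proposal step size and function names are assumptions.

```python
import math
import random

def log_target(x):
    """Unnormalized log density of a toy two-peak mixture (hypothetical)."""
    a = math.exp(-0.5 * ((x - 0.05) / 0.02) ** 2)
    b = math.exp(-0.5 * ((x - 0.15) / 0.02) ** 2)
    return math.log(a + b + 1e-300)    # small constant avoids log(0)

def random_walk_mh(n_iter=10_000, burn_in=6_000, step=0.05, seed=1):
    rng = random.Random(seed)
    x = 0.10                           # starting value
    kept = []
    for it in range(n_iter):
        proposal = x + rng.gauss(0.0, step)
        # Accept with probability min(1, target(proposal) / target(x))
        log_u = math.log(rng.random() + 1e-300)
        if log_u < log_target(proposal) - log_target(x):
            x = proposal
        if it >= burn_in:              # discard the burn-in draws
            kept.append(x)
    return kept
```

With a two-peak target, the posterior mean falls between the peaks, which is why a mean value alone can hide the multimodal shape seen in a posterior histogram.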

Spatial Effects.
The spatial autocorrelation coefficient is estimated to be significantly positive on both weekdays and weekends using the SARBP-EWM model, which suggests strong spatial autocorrelation among TAZs in the travel flow differences between the morning and evening peaks. Significant travel flow differences in one TAZ tend to spread to surrounding TAZs, and this spatial spillover effect cannot be neglected. This phenomenon is also confirmed in previous research, where a change in the travel flow differences of one TAZ has a positive impact on the travel flow differences of other TAZs [57].
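How a positive $\rho$ propagates a change in one TAZ to others can be sketched with the spatial multiplier $(I - \rho W)^{-1}$, approximated here by its power series. The three-zone weight matrix and the unit shock are hypothetical; $\rho = 0.10$ is used only because it matches the posterior mean reported above.

```python
def spillover(W, shock, rho, n_terms=50):
    """Approximate (I - rho*W)^{-1} @ shock by summing (rho*W)^k @ shock."""
    n = len(shock)
    total = list(shock)
    term = list(shock)
    for _ in range(n_terms):
        # Next term of the series: rho * W @ (previous term)
        term = [rho * sum(W[i][j] * term[j] for j in range(n)) for i in range(n)]
        total = [t + s for t, s in zip(total, term)]
    return total

# Hypothetical row-normalized weights for three TAZs.
W = [[0.0, 0.7, 0.3],
     [0.5, 0.0, 0.5],
     [0.3, 0.7, 0.0]]
effect = spillover(W, shock=[1.0, 0.0, 0.0], rho=0.10)
```

In this toy setting, a unit shock in the first zone produces positive effects in the other two zones, and the size of each effect follows the corresponding weights.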

Indicator Variables and Endogeneity.
The SARBP-EWM model contains indicator variables that define the endogenous weight matrix, which is jointly determined by travel time (travel_time) and population density (pop_den) according to (7). The population density coefficient is estimated to be significantly positive on both weekdays and weekends, indicating that population density is a good indicator variable for defining the endogenous weight matrix.
In fact, population density is an important indicator of the urbanization level of an area. The population density difference measures the difference in urbanization level between OD pairs, which defines the relative weight between them. Therefore, population density is selected to represent the interdependency structure between OD pairs. The endogeneity among TAZs is measured by the term $\sigma_{\varepsilon\delta}$ in the covariance matrix $V$, which captures the covariance between the errors of (5) and (8). The covariance $\sigma_{\varepsilon\delta}$ is estimated to be insignificant with a mean value of −0.70 on weekdays and significant with a mean value of −0.98 at weekends, indicating significant endogeneity among TAZs at weekends. Such endogeneity is neglected in traditional spatial models, which may lead to biased estimates and misleading conclusions. The negative sign suggests a negative correlation between travel flow differences and population density. As population density increases, travel flow differences tend to decrease because inflow and outflow become more balanced with regional development. In contrast, when a TAZ is at an initial development stage with low population density, there can be large travel flow differences between the morning and evening peaks.
The error term $V_\delta$ in (8) shows a positive and significant correlation with the endogenous variable population density on both weekdays and weekends, indicating that the unobserved variables in (8) are strongly correlated with the endogenous variable. This may be because the current dataset does not include other variables, such as car ownership or per capita income, that may help explain the endogenous variable. Nevertheless, this is the best result obtainable with the current dataset.

Endogenous Weight Matrix.
The SARBP-EWM model has the distinct feature of allowing an endogenous weight matrix, which indicates the relative weights of influence between TAZ pairs. In this paper, such influence refers to how travel flow differences in one TAZ affect surrounding TAZs, similar to the "peak spreading" phenomenon. To further illustrate and compare the relative weights among TAZ pairs, the weight elements of all TAZ pairs on weekdays and weekends are categorized and presented in Figures 11 and 12, respectively.
According to Figure 11, most TAZ pairs have relatively low spatial weights on weekdays, as indicated by the 987 light red blocks with spatial weights of 0.001-0.060, which account for 96.4% of all TAZ pairs. TAZ pairs with high spatial weights consist of 23 pairs in the 0.041-0.060 range (red) and 14 pairs in the range of 0.080 and above (dark red), together accounting for 3.6% of all TAZ pairs. Figure 11 suggests that TAZs with high spatial weights tend to cluster, which means that they tend to spread their travel flow differences to surrounding TAZs, similar to the "peak spreading" phenomenon. Intuitively, such flow imbalance comes from surrounding TAZs and cannot be instantly diminished. The weight element distribution at weekends, shown in Figure 12, follows a similar pattern: 984 TAZ pairs have relatively low spatial weights, accounting for 96.1% of all TAZ pairs, while TAZ pairs with high spatial weights consist of 24 pairs in the 0.041-0.060 range (red) and 16 pairs in the range of 0.080 and above (dark red), together accounting for 3.9% of all TAZ pairs. The clustered blocks are mostly identical to those on weekdays, indicating that these blocks readily show significant travel flow differences on both weekdays and weekends. TAZ pairs with high spatial weights also tend to cluster at weekends. Previous studies have confirmed this finding, in which a change in the travel flow differences of one TAZ has a positive impact on the travel flow differences of other TAZs [22,57,58].
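The reported shares can be checked directly from the block counts given above. The short sketch below uses only those counts; the dictionary keys and the function name `shares` are illustrative.

```python
# Counts of TAZ pairs per color category, as reported in the text.
counts = {
    "weekday": {"light_red": 987, "red": 23, "dark_red": 14},
    "weekend": {"light_red": 984, "red": 24, "dark_red": 16},
}

def shares(day):
    """Return (total pairs, low-weight share %, high-weight share %)."""
    c = counts[day]
    total = sum(c.values())
    low = round(100 * c["light_red"] / total, 1)
    high = round(100 * (c["red"] + c["dark_red"]) / total, 1)
    return total, low, high

weekday = shares("weekday")   # (1024, 96.4, 3.6)
weekend = shares("weekend")   # (1024, 96.1, 3.9)
```

Both days sum to 1024 TAZ pairs, which would be consistent with a 32 × 32 weight matrix, although the number of TAZs is not stated in this section.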
To further illustrate the endogenous weight matrix, the spatial distributions of weight elements for typical TAZs are presented in Figure 13. The results indicate that TAZ 7 has high spatial weights with TAZ 1, TAZ 4, TAZ 5, TAZ 14, and TAZ 22 on weekdays, which suggests that an urban TAZ has strong impacts on the travel flow differences of surrounding and relevant TAZs. Similarly, TAZ 8 shows high spatial weights with TAZ 2 and TAZ 3 to the south of the city and with TAZ 13 and TAZ 15 to the southeast at weekends, again indicating that an urban TAZ has strong impacts on the travel flow differences of surrounding and relevant TAZs. Surrounding and relevant TAZs with high spatial weights tend to cluster on both weekdays and weekends.

Effects of Explanatory Variables.
Facility variables such as the number of offices, supermarkets, middle schools, clinics, inns, and gymnasiums in TAZs may have major impacts on the travel flow differences between the morning and evening peaks. Estimation results from the SARBP-EWM model show that, in general, facility variables have significant impacts on travel flow differences on both weekdays and weekends. The impacts on travel flow fluctuations are consistent between weekdays and weekends. The specific impact of each facility variable is described as follows:
(a) The number of offices (office) in TAZs has a significantly negative relation with the dependent variable on both weekdays and weekends. This suggests that a higher number of offices in origin TAZs is associated with lower travel flow differences between peak hours. In previous studies, offices are usually regarded as major sources of travel demand [39,43,59]. However, since this paper focuses on travel flow differences, offices tend to generate large but relatively equal commuting travel demand, as people go to work in the morning peak and leave the office in the evening peak. This equally large commuting travel flow does not contribute much to the travel flow differences between peak hours. Besides, in urbanized TAZs with more office buildings, commuting accounts for a large portion of total travel flow, which stabilizes the fluctuation of travel flow between peak hours. This result is consistent with that of the binary probit model.
(b) Supermarkets may generate equal travel flows in the morning and in the evening, thus contributing little to the travel flow differences between the morning and evening peaks.
(c) The number of middle schools (mid_school) in TAZs shows a significant and positive relation with the dependent variable in both the SARBP-EWM model and the binary probit model across weekdays and weekends, which indicates that a change in the number of middle schools in TAZs has a major impact on the travel flow differences between the morning and evening peaks. Intuitively, schools are major travel flow sources, as students go to school in the morning and leave school in the evening during weekdays [46]. The number of schools also has major impacts on travel flow differences at weekends because students take "weekend" classes at school, which is common in less urbanized cities like Xuancheng, where students choose to take such classes to improve their exam grades. Therefore, the number of schools in TAZs has major impacts on travel flow differences on both weekdays and weekends.
(d) The number of clinics (clinic) shows a significant and negative relation with the dependent variable on both weekdays and weekends, which means that a higher number of clinics is associated with lower travel flow differences between peak hours. In fact, a hospital's impact on travel demand varies with time and space according to previous studies [20,47]. The results indicate that clinics in Xuancheng have negative impacts on travel flow differences. This may be because people are busy working or relaxing during the daytime, while they are more flexible to visit clinics during evening peak hours.
(e) The number of inns (inn) shows an insignificant relation with the dependent variable on both weekdays and weekends, indicating that the number of inns has little influence on travel flow differences between peak hours. It is possible that, in small Chinese cities like Xuancheng, inns are not major travel flow sources and therefore do not affect travel flow differences.
(f) The number of sports centers (sports) shows an insignificant relation with the dependent variable on both weekdays and weekends, suggesting that the number of sports centers has no major impact on travel flow differences between peak hours. In practice, sports centers are not major sources of travel flow, since people go to sports centers only occasionally for sports activities. Therefore, sports centers have little influence on travel flow differences.
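The way a coefficient's sign shifts the probability of a significant travel flow difference can be illustrated with a plain binary probit, ignoring the spatial and endogenous terms. The coefficient magnitudes below are made up; only their signs follow the results described above, and the function name `probit_probability` is illustrative.

```python
from statistics import NormalDist

def probit_probability(x, beta, intercept=0.0):
    """P(y = 1 | x) = Phi(intercept + x . beta) for a non-spatial binary probit."""
    index = intercept + sum(xi * bi for xi, bi in zip(x, beta))
    return NormalDist().cdf(index)

# Hypothetical coefficients for (office, mid_school, clinic):
# signs follow the text, magnitudes are invented for illustration.
beta = [-0.4, 0.6, -0.3]
base = probit_probability([1, 1, 1], beta)
more_offices = probit_probability([2, 1, 1], beta)   # one more office
more_schools = probit_probability([1, 2, 1], beta)   # one more middle school
```

Adding an office lowers the probability relative to the baseline, while adding a middle school raises it, which mirrors the negative and positive relations reported for these two variables.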

Effects of Indicator Variables for Endogenous Variable.
The endogenous variable population density in (8) is also affected by explanatory variables such as road density and bus stop density. Road density (road_density) in TAZs shows a significant and positive correlation with population density on both weekdays and weekends in the SARBP-EWM model, suggesting that higher road density is associated with higher population density. Usually, areas with a higher urbanization level tend to have both higher population density and higher road density, as urbanized areas attract more residents and have more roads to accommodate frequent travel demand. Another important factor that may affect population density is transit accessibility. The results show that bus stop density (bus_den) is not significantly associated with population density. In practice, bus stop density in small cities like Xuancheng may not have major impacts on population density.
In general, the SARBP-EWM model successfully identifies significant spatial effects and endogeneity. It reveals hidden influential factors that are not discovered by the traditional binary probit model and quantifies their impacts on travel flow differences on both weekdays and weekends. Facility variables such as the number of offices, middle schools, and clinics have major impacts on travel flow differences. In addition, road density shows a significant relation with the endogenous variable population density.

Discussion of Policy Implications
Xuancheng released the "Xuancheng 14th Five-Year Plan for National Economic and Social Development and the Long-Range Objectives through the Year 2035 (X145Plan)" in May 2021 [60], which sets detailed targets for the city's future development. The X145Plan proposes to promote industrial development, enhance public service infrastructure, and build more advanced transportation systems. Based on the results above, these policies have potential impacts on travel flow differences. Therefore, it is necessary to analyze how these policies would affect the travel flow differences in different TAZs.
According to the X145Plan, Xuancheng plans to promote industrial platform construction in several respects. First, the X145Plan proposes to enhance the Xuancheng economic and technical development zone (TAZ 28) by focusing on renewable energy, equipment manufacturing, food and drug, electronic information, etc. Second, the X145Plan aims to develop the Xuancheng Modern Service Industrial Park (TAZ 24) by promoting the digital economy, logistics, agricultural products, etc. [60]. These industrial promotion policies would lead to the emergence of many office buildings, which would decrease local travel flow differences based on the results in Table 4. Therefore, industrial promotion policies should be advocated, as they strengthen the economy while decreasing travel flow differences. However, decreasing travel flow differences does not mean decreasing absolute traffic volume. City planners and policy makers still need to be cautious not to add too much traffic to the city.
In addition, according to the X145Plan, Xuancheng aims to build multilevel consumption platforms to facilitate the circulation of consumer goods. The X145Plan proposes to build new featured streets including Huchengfang (TAZ 14) and Doufuxiang (TAZ 16), enhance business circles  30), and so on [60]. Middle schools have positive impacts on travel flow differences according to the results above, whereas hospitals have negative impacts and sports facilities have little impact. Therefore, the total impacts of these public service infrastructures are ambiguous. City planners and decision makers need to carefully evaluate the possible impacts on travel flow differences under different policies.
Therefore, to evaluate the impacts of the major policies of the X145Plan on travel flow differences, the typical directly and indirectly affected TAZs based on the estimation results are summarized in Table 5.
The directly affected TAZs are obtained from the X145Plan, and the impacts are derived from the estimation results. In addition to these directly affected TAZs, some TAZs would be indirectly affected through space according to the estimated endogenous weight matrix. For example, TAZ 11 is directly affected by the extension of No. 3 Middle School and would spatially affect travel flow differences in the nearby TAZ 16, TAZ 17, TAZ 18, and TAZ 19 on both weekdays and weekends. However, for some areas such as TAZ 24, building offices and building a consumption platform would cause negative and ambiguous impacts, respectively. Therefore, the total impacts on the spatially affected TAZs are ambiguous. The above results provide detailed policy implications based on the X145Plan, which would facilitate urban planning and policy making.
For transportation-related policies, the X145Plan proposes to build a more advanced transportation network by "connecting each county with roads" [60]. Therefore, more roads would be built within and between TAZs. According to the estimation results, an increase in road density is associated with higher population density, which indicates that TAZs with higher road density tend to have higher population density. In fact, for small cities like Xuancheng, TAZs with higher road density are mainly urban TAZs with more developed facilities and services, which tend to attract more residents. This provides an important policy implication for urban planners: building more roads can help boost the economy and attract population. It echoes the traditional Chinese saying that "Building the road is the first step to become rich."

Conclusions
This paper investigates the travel flow differences between the morning and evening peaks based on AVI data in Xuancheng, China. A spatial model with an endogenous weight matrix is established to investigate influential factors while accounting for endogeneity. The results confirm strong spatial effects and endogeneity among TAZs. As for influential factors, the number of offices and the number of clinics are found to have a negative relation with travel flow differences on both weekdays and weekends, while the number of middle schools shows a strong positive relation with the dependent variable. In addition, the spatial weight matrices for both weekdays and weekends are estimated and compared. Spatial weights on weekdays tend to cluster with lower weights, while those at weekends are more randomly distributed with higher weights. Based on the results, policy recommendations are evaluated and proposed.
The main contributions of this paper are summarized as follows. (a) This paper utilizes AVI data to investigate the travel flow differences between peak hours on both weekdays and weekends. The results confirm strong spatial correlations among TAZ pairs on both weekdays and weekends. (b) Endogeneity among TAZs is considered and quantified. This paper is among the few studies that consider endogeneity by applying a spatial model with an endogenous weight matrix, and the endogenous weight matrix is successfully estimated. (c) This paper quantifies the impacts of key factors on travel flow differences. The results suggest that facility variables such as the number of offices, supermarkets, middle schools, and clinics have major influence on travel flow differences. (d) This paper provides major policy implications based on the X145Plan. Policies on enhancing industrial development, building more schools, and improving medical services are evaluated, and the affected TAZs are identified.
Future work mainly includes two aspects. First, the travel flow fluctuation threshold can be determined using more systematic approaches. This paper chooses the fluctuation threshold based on its approximate cumulative distribution, and the approach can be refined so that it identifies significant travel flow differences while tolerating random variations. Second, more efficient algorithms could be developed to obtain more accurate OD flows from AVI data. This paper reconstructs vehicle trajectories from AVI data collected at intersections, which still leaves much uncertainty about the trajectories. More efficient trajectory reconstruction techniques could be adopted to recover precise vehicle trajectories and provide more accurate data for analysis.
Table note: "+" means increased travel flow differences; "−" means decreased travel flow differences; "NA" means not available; "?" means ambiguous impacts on travel flow differences.

Data Availability
The data are not publicly available due to privacy or ethical restrictions but are available from the corresponding author on reasonable request.

Disclosure
This study was performed at the Sun Yat-sen University (SYSU) Research Center of ITS in the context of the collaboration with the Joint Research and Development Laboratory of Smart Policing in Xuancheng Public Security.