Local or Neighborhood? Examining the Relationship between Traffic Accidents and Land Use Using a Gradient Boosting Machine Learning Method: The Case of Suzhou Industrial Park, China

In cities, road traffic accidents are a critical danger to people's safety. A vast number of studies have been designed to understand the leading causes and mechanisms of these accidents. The widely held view is that emerging analysis methods can be a critical tool for understanding the complex interactions between land use and urban transportation. Using a case study of Suzhou Industrial Park (SIP) in Suzhou, China, this paper examines the relationship between different land use types and traffic accidents using a gradient boosting model (GBM) machine learning method. The results show that the GBM can be used as an effective accident model for a variety of research and analysis methods by (1) ranking the influential factors, (2) testing the degree of interpretation of each variable as the complexity of iterations changes, and (3) obtaining partial dependence plots, among other methods. The findings of this study also suggest that land use types, including facility points, demonstrate differing degrees of influence at two geographical scales: local level and neighborhood level. In the ranking of relative importance at both scales, the variables of education institutions, traffic lights, and service institutions are all ranked high, with a more significant influence on the occurrence of accidents. However, residential land and land use mix variables differed significantly between the two scales and showed a significant deviation compared to the other results. When adjusting the complexity of the decision tree, the local level is more suitable for measuring variables such as residential areas and green parks where pedestrians and vehicles have fixed mobility periods and moderate flows. In contrast, the nearest neighborhood level is more suitable for a small number of variables related to public service facilities at fixed locations, such as traffic lights and bus stops. In the partial dependence plots, all variables, except educational institutions and residences, show a positive correlation with accidents in the fitting process. The results of this study can help inform transportation planners to reconsider transport accident occurrence rates in the context of the proximity to various land use types and public service facilities.


Introduction
Traffic safety is a crucial issue affecting the quality of urban residential life. According to global statistics from the World Health Organization (WHO), around 3,700 people die per day due to road traffic collisions, and tens of millions suffer related injuries each year [1]. China has one of the highest rates of traffic accidents in the world, with more than 260 thousand fatalities annually. The WHO's 2015 global status report on road safety [2] indicates that 18.2 deaths per 100,000 people occur in China, a statistic which also reflects the world average. However, China's rate is higher than the Western Pacific region's average of 16.9 deaths per 100,000 people [2], and among the six major regions designated by the WHO, it falls below only Southeast Asia and Africa (see Figure 1).
Traffic accidents threaten people's lives, in addition to generating substantial economic losses. In general, traffic accidents involve the subjective actor (the driver) and the objective environment (vehicles and roads). Exploring the causes and mechanisms of traffic accidents within this dynamic will help reduce their overall risk.
Earlier studies have already identified these specific influencing factors: (1) The natural environment: cloudy or rainy weather, temperature, humidity, and visibility are proven to be related to traffic accidents [3,4]; bad weather (rain/fog/snow) also has a significant positive correlation with accidents [5]. (2) Road conditions: greater complexity of the road environment, including the number of intersections, road network density, and the number of vehicles, is likely to create more potential risks of an accident [6,7].
(3) Human conditions: high employment and population densities in a given site lead to an increase in traffic flow and consequently increase accidents [8][9][10]; accordingly, sparsely populated areas have lower accident rates [11]. Age demographics and education levels will also affect the times and frequencies of people going out, thus influencing the conditions for accidents [11].
(4) The social environment: studies show the relationship between facility accessibility, land use, and traffic accidents. For example, industrial land, commercial land, and land mix are all positively correlated with traffic accidents [5]. Educational land use areas report varied results in the literature. For instance, some evidence shows that educational land use has a significant impact on accidents [4], while another study found that educational land use has the smallest effect on accidents among the factors considered [12]. Natural land use corresponds to the highest safety level, demonstrating the fewest risks [8,13]. However, Zou et al. [14] use truck crash severity data in New York City to examine whether traffic accidents are caused by land use patterns rather than land types. They point out that, in one case, both service employment and recreational employment occupy high-density land, but fewer traffic accidents occur in the service employment region. Their study also demonstrates that this is subject to change in any given environment and becomes more evident in the recreational employment region. This illustrates the need to take more detailed land use into account when considering accident conditions.
Earlier research on traffic accidents can also be divided into micro- and macroscales [15,16]. The former focuses on the road itself, such as crossroads or intersections [7,13,17] and highways [18,19]; a closer look examines road length and width, vehicle speeds, and traffic flow, to list a few, as tools to optimize the road structure. On the macrolevel, census tracts [4,20,21], traffic analysis zones (TAZs) [8][22][23][24], and living communities [24] are used to identify social and economic factors (such as population and land use) that illustrate the spatial agglomeration of accidents.
Analyses at different scales also tend to indicate diverse accident outcomes. Huang et al. [25] point out that treating road facilities as hotspots is a more accurate analysis tool than observing the entire region that encompasses them. Some crossroad traffic accident studies delineate scales of 15 m, 60 m, and 75 m, respectively, and correspondingly reach different conclusions [26,27]. Some results indicate that collisions are more likely to occur at distances of 100 to 200 ft. from intersections, while other experimental results give smaller distances, below 50 ft. Yu and Zhu [28] found that buffer zones around schools (with distances of 0.5, 1, 1.5, and 2 miles, respectively) each affect the security of the school zone differently. In this way, they demonstrate the tangible biases that examining land use at different scales presents, given that every scale will incorporate a different range of influencing factors. Nevertheless, autocorrelation and heterogeneity of spatial effects must also be considered; regardless of scale, geographic units with higher internal similarities will achieve more stable statistical results.
Among these commonly used research scales, TAZ is the only regional system associated with transportation. Compared with larger geographic scales, TAZ has better internal similarities in land use, road network, and traffic operation. Compared with smaller geographic scales, TAZ links traffic data to produce more evident socioeconomic characteristics. The scale at which TAZ operates is also easy to integrate with the transportation planning process and is therefore used as a local level research scale in this paper.
In addition to discussing the scale of TAZ, this paper addresses other research scales. In western literature, accident research in geographic scales also includes local areas [29], counties [16,30,31], and regions [32]. In the Chinese context, although the study of land form is often divided into administrative regions such as provinces and cities [33,34] or terrain areas such as plateaus and hills [35][36][37], a consistently defined scope at which road network patterns are observed to impact traffic safety remains neglected in the literature. Therefore, it is necessary to explore different research scales that are more suitable for site characteristics and data. This research is a contribution to help fill this gap.
This paper attempts to utilize a scale that is relatively homologous to that of the TAZ model. The center of each TAZ is used as the centroid to generate Thiessen polygons using ArcGIS, thus avoiding the problem of missing or duplicated study areas that can easily be caused by buffers. The Thiessen polygons are irregularly shaped polygons with varying areas based on the centroid, and they are more spatially homogeneous than administrative boundaries. After studying collision models with different spatial units such as census tracts, state electoral divisions, developed grid cells, and natural area boundaries, some scholars have recommended Thiessen polygons because of their higher spatial performance [38][39][40]. The Thiessen polygon is, therefore, chosen as the research scale of the nearest neighborhood level apart from the TAZ area. The different study scales may inadvertently create a modifiable areal unit problem (MAUP), that is, the issue of changing statistical properties due to differences in areal units. However, since both selected scales build on existing TAZ areas and do not involve adjustments in basic spatial units such as census tracts or urban structure based on major roadways, the changes in the statistical results should be modest at most [40]. The results from the two study scales (the local level of the TAZ area and the nearest neighborhood level bounded by the Thiessen polygon) are then comparatively analyzed.
Most of the methods employed in early quantitative studies of traffic accidents focus mainly on singular influencing factors. Gasparini [41] and Li et al. [42] first adopted Markov chain traffic accident statistical models to analyze the time factors of traffic accidents; Kim and Yamashita [43] compared the number of accidents per unit area in different land use and found that commercial geographic entities have the lowest level of traffic safety.
Later studies began to incorporate the generalized linear model (GLM) and used it to study the relationship between various influencing factors and the frequency of accidents. The logistic model and the logit model were used multiple times in US studies to analyze accident severity [3,18,29,44]; Kim et al. [45] and Dissanayake et al. [46] employed Poisson regression and negative binomial regression. They examined the respective relationships between geographic entities such as parks, businesses, schools, and high-density residential buildings and traffic accidents. It is not easy to measure the difference between various geographic units. Adjacent geographic units usually have spatial autocorrelation, yet spatial heterogeneity often occurs when they are far apart. Random-parameter and spatial models, such as Bayesian spatial models, spatial autoregressive models (SAMs), and geographically weighted regression (GWR) models, have been used to address this issue in traffic safety spatial analyses [24,47,48].
It can be concluded that earlier research focuses on macroanalysis of accidents with the intent to produce statistics from their data by emphasizing single factors or multiple factors in the process. Since the 1990s, alongside the development of machine learning and the advancement of data mining technology, systematic causation has been studied increasingly often with the application of machine learning models to traffic accidents. Li and Shao [6] use backpropagation (BP) neural networks and the artificial neural network (ANN) as methods to identify critical causal factors of the severity of injuries in traffic accidents. The neural network method treats the occurrence of traffic accidents as an input and output system. Influencing factors such as people, vehicles, roads, and the traffic environment are considered as input layer variables. The number of accidents or fatalities is used as the output layer variable.
Through multiple corrections of parameters, a complex, nonlinear relationship network model between variables is established and more accurate traffic accident analysis results are obtained. Chong et al. [49] proposed testing artificial neural networks and decision trees to model the severity of traffic accident injuries. Experiments using datasets obtained from national automotive sampling systems showed that decision trees outperformed neural networks. Advanced algorithms that have already been applied to the monitoring of sudden traffic events include the probability neural network (PNN) and support vector machines (SVMs) [50,51]; Al-Ghamdi et al. [3] introduce a mixed model of wavelet transformation and logistic regression to their traffic event testing method. Further research suggests that the causes of traffic accidents are systematic and intrinsically linked. BP and ANN machine learning methods establish nonlinear networks with input variables. While the obtained results from these models are interpreted in terms of causal relationships, the outcome parameters can be compared either within the same dataset [49] or across different models.
Recent big data technology uses data mining and machine learning techniques to calculate traffic data, identify potential risk factors, and assist in offering targeted measures to avoid and prevent traffic accidents. This paper uses a new machine learning method known as the gradient boosting model (GBM) as a novel application to the traffic accident research field. In doing so, this research aims to explore the relationship between complex land use characteristics and traffic accidents. Using the same data source, the scale of the local level bounded by TAZ and the scale of the nearest neighborhood level bounded by the Thiessen polygons are separately counted to check the power of the model and the explanatory effect of the variables at the two locational levels. The following section will introduce the mathematical model of the methodology used in this paper.
The third section will then present the data and variables. The fourth section contains the experimental results and discussion and is divided into the preparation of the GBM model, the interpretation degree of each variable, the explanatory power within the two scales in the case of a change in the parameter "number of trees," and the partial dependence of each variable. Section 5 presents the conclusions and limitations of this research.
Journal of Advanced Transportation

Methodological Review.
This section demonstrates a development from traditional statistical methods to data-driven methods. Mannering et al. [52] point out that the choice of analysis method for crash data should take into account the trade-off between prediction capability and the causal nature of factors contributing to accidents. Traditional statistical methods have been relatively easy to apply to data, offering both accuracy in prediction and a rationale for causality. With a big dataset, the data-driven approach should be primarily used. Other researchers suggest that cultivating new methodologies to address unobserved heterogeneity and endogeneity is beneficial for understanding accident determinants [53]. When selecting different methods, in addition to consideration for the dataset, implicit assumptions also need to be made based on the likelihood or severity of the accident [54]. This helps to embody different aspects of the accident mechanism and make more accurate safety decisions. The following is a detailed analysis of the choice of methods in this study.
Statistical models are designed to capture the relationship between independent and dependent variables as accurately as possible. The ordinary least squares (OLS) method in its simplified form demonstrates a linear relationship, wherein the error term satisfies a normal/Gaussian distribution and satisfies homoscedasticity. When the errors do not meet the condition of being normally distributed and homoscedastic, the generalized least squares (GLS) method can use its link function to convert a number of target variables that satisfy a particular distribution condition into a linear model, thereby eliminating heteroscedasticity in linear relations. Weighted least squares (WLS) can also be used to convert the model to a linear format by weighting the explanatory variables to eliminate their heteroscedasticity.
Despite their potential usefulness, Mannering and Bhat [53] note that simple linear regressions such as OLS, GLS, and WLS are seldom used as a method in accident research. Linear regression methods, in their varied forms, are only suited to fitting a hyperplane to a dataset without using other factors as weights. Traffic accidents are the outcome of multiple interwoven influencing factors, and modeling them largely relies on the construction and solution of complex nonlinear problems [53,55]. Linear regression models cannot adapt to capture complex patterns, and it is impractical to add interaction terms or use polynomials. Therefore, global regression based on linear models alone is not sufficient for this type of analysis. To address this issue, the GWR method could be used to build spatial models, which use locally weighted regression to enhance the accuracy of the results. It calculates weights by constructing spatial kernel functions and then uses local regression to intuitively reflect the nonstationary characteristics of geographic relationships [56]. However, in reality, the accurate modeling of complex, nonstationary geographic relationships requires increasing solution accuracy and computing power. If GWR is used, the model needs further improvements in proximity analysis, calculation of kernel weights, and optimization of bandwidth parameters, among other areas [38,57].
In addition, various machine learning methods and spatial regression models have been increasingly used in traffic accident research due to their capacity to fit nonlinear problems closely. Among them, support vector machine (SVM) methods use kernel functions for nonlinear classification [58][59][60]; hierarchical clustering algorithms divide traffic impacts into layers based on data distribution [61]; K-means clustering algorithms and GWR both perform cluster analysis based on the collection distance of sample points [62]; and deep learning is often applied to general graph models or hypergraph models without massive constraints [62], such as image recognition of traffic accidents in social media and black spot recognition in urban traffic safety [63][64][65]. However, these earlier studies lack accuracy because of errors and unobserved variances.
As this study looks at the impact of each land use type on traffic accidents and its pattern at different spatial levels, regression- and tree-based models are selected to address the complexity of issues and factors involved in accidents. The latter involves drawing multiple trees from top to bottom through multiple terminal nodes to visually represent the detailed effects of each factor in the model in a nested manner [66]. In one of these tree-based regression methods, boosting first builds multiple decision trees by an orderly sampling of the initial training set and then combines the multiple trees to slowly train the prediction model to improve the prediction performance [66]. The gradient boosting method (GBM) is used to implement this boosting technique.
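The sequential idea behind boosting can be illustrated with a minimal Python sketch. It is not the implementation used in this study: it assumes a squared-error loss, a single predictor, one-split trees ("stumps"), and no subsampling of the training set, and the names `fit_stump`, `boost`, `predict`, and `mse`, as well as the toy data, are hypothetical.

```python
def fit_stump(x, residuals):
    """Return (threshold, left_mean, right_mean) minimising squared error."""
    best = None
    # Splitting at the maximum would leave the right side empty, so skip it.
    for t in sorted(set(x))[:-1]:
        left = [r for xi, r in zip(x, residuals) if xi <= t]
        right = [r for xi, r in zip(x, residuals) if xi > t]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, t, lm, rm)
    return best[1:]


def stump_value(stump, xi):
    t, lm, rm = stump
    return lm if xi <= t else rm


def boost(x, y, n_trees=50, shrinkage=0.1):
    """Fit stumps one after another, each to the residuals of the current fit."""
    f0 = sum(y) / len(y)              # initial model: the mean response
    fitted = [f0] * len(y)
    stumps = []
    for _ in range(n_trees):
        residuals = [yi - fi for yi, fi in zip(y, fitted)]
        stump = fit_stump(x, residuals)
        stumps.append(stump)
        # Add only a shrunken share of the new tree to the running fit.
        fitted = [fi + shrinkage * stump_value(stump, xi) for fi, xi in zip(fitted, x)]
    return f0, shrinkage, stumps


def predict(model, xi):
    f0, shrinkage, stumps = model
    return f0 + shrinkage * sum(stump_value(s, xi) for s in stumps)


def mse(model, x, y):
    return sum((yi - predict(model, xi)) ** 2 for xi, yi in zip(x, y)) / len(y)


# Toy data (hypothetical): one predictor and an accident count per unit.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2.0, 2.0, 3.0, 3.0, 8.0, 9.0, 9.0, 10.0]
baseline = boost(x, y, n_trees=0)     # mean-only model
model = boost(x, y, n_trees=50)
print(mse(baseline, x, y), mse(model, x, y))
```

Each added stump corrects part of what the previous ones left unexplained, which is why the training error of the boosted model falls below that of the mean-only baseline.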
This research identifies GBM as a better method than traditional methods such as generalized linear functions of all kinds since it can use different steps and a few critical parameters to help explain the loss function in the model. This loss function plays the same role as the rule for finding error patterns in the linear function and helps describe the model more accurately. Therefore, when the interpretation of the model in some traditional methods is not accurate enough, GBM can learn nonlinear relationships to achieve better accuracy. GBM is also robust to outliers and is not sensitive to noisy data; it can account for missing data while calculating efficiently. Additionally, the bagging algorithm, which is also a tree-based algorithm, shares similarities with the characteristics of GBM. However, bagging uses a bootstrap sampling method (sampling with replacement; duplicate samples may be taken) in building a decision tree, which is less efficient than GBM. It is more suitable for data with fewer dimensions and higher accuracy requirements [62]. The land-type data obtained in this research is complex in its distribution, and the sizes of various types of land use vary greatly and are mixed with each other. Therefore, GBM would have a higher accuracy when sampling and is thus selected as the method for use in this research.

GBM Model.
As previously mentioned, the gradient boosting model (GBM) is selected as the machine learning method used in this paper. Boosting algorithms are a commonly used machine learning method that can be applied to classification and regression problems. GBM uses boosting to combine weak classifiers into a strong one and obtains each new model by training in the direction of gradient descent of the previous model's loss function. Generally, an important criterion for evaluating the performance of a model is its loss function. The loss function essentially measures the degree of the model's unreliability. As the loss function decreases, the model becomes more reliable and predictable. The best way to improve the model performance is to make the loss function decline in the direction of the gradient. Given the increasing difficulty of loss function optimization in previous machine learning models, Friedman [67] proposed the following gradient boosting algorithm. It acts as a greedy function approximation method designed to obtain the next model by training in the gradient descent direction of the current model's loss function. The following is its mathematical derivation. Firstly, the model is initially set up with ε as the coefficient and h as the assumed classification rule of the overall function F(x): where X = {x_1, x_2, ..., x_n} represents the independent variables in the input space and Y represents the response variable in the output space. Given a training dataset {y_i, x_i}_1^N, the purpose is to find a hypothesis function F*(x) that maps x to y, and the difference between this hypothetical function and the real function can be represented by a loss function.
The loss function Ψ(y, F(x)) is a nonnegative real-valued function of F(x) and Y, with the ultimate goal of minimizing the loss function: Then, in combination with equation (1), F*(x) is approximated by a linear expansion in equation (2): where the function h(x; a_m) is a simple classifier of x and a = {a_1, a_2, ...} is the set of parameters in the classifier function. The expansion coefficients {ρ_m}_0^M and the classifier parameters {a_m}_0^M are obtained from the training data using a stagewise training algorithm. The initial hypothetical function F_0(x) is given first, and then, m = 1, 2, ..., M iterates stepwise as in the following equations: Gradient boosting uses a two-step strategy to solve the loss function Ψ(y, F(x)) in equation (4). The first step is to fit the function h(x; a) by least squares as in equation (6) and get the current pseudoresidual: In the second step, given h(x; a), the optimal value of the coefficient β_m is determined by the following formula: This strategy first replaces the difficult optimization problem with the least squares problem of equation (6) and then optimizes the loss function Ψ over a single parameter in equation (7). The gradient boosting model has developed rapidly in recent years. Zhao et al. [26] reported a stochastic gradient decision tree based on GBM and constructed a decision tree model with two methods. Elsewhere, a scalable end-to-end tree boosting system named XGBoost (extreme gradient boosting) was proposed by Tianqi Chen in 2016 and has been widely used in image classification and loss estimation since then [68,69].
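For reference, the steps described in this subsection can be written out in the standard form of Friedman's gradient boosting formulation [67]. This is a sketch rather than a reproduction of the paper's numbered equations: the correspondence with equations (1) to (7) is an assumption, the loss is written as Ψ to match the text, β_m is used throughout for the expansion coefficient, and ρ is an auxiliary scale in the least squares step.

```latex
\begin{align*}
F^{*}(x) &= \arg\min_{F(x)} \; E_{y,x}\,\Psi\bigl(y, F(x)\bigr)
  && \text{(target function)} \\
F(x) &= \sum_{m=0}^{M} \beta_m\, h(x; a_m)
  && \text{(additive expansion)} \\
(\beta_m, a_m) &= \arg\min_{\beta,\, a} \sum_{i=1}^{N}
  \Psi\bigl(y_i,\; F_{m-1}(x_i) + \beta\, h(x_i; a)\bigr)
  && \text{(stagewise fit)} \\
F_m(x) &= F_{m-1}(x) + \beta_m\, h(x; a_m)
  && \text{(update)} \\
\tilde{y}_{im} &= -\left[ \frac{\partial \Psi\bigl(y_i, F(x_i)\bigr)}{\partial F(x_i)}
  \right]_{F(x) = F_{m-1}(x)}
  && \text{(pseudoresidual)} \\
a_m &= \arg\min_{a,\, \rho} \sum_{i=1}^{N}
  \bigl[\tilde{y}_{im} - \rho\, h(x_i; a)\bigr]^{2}
  && \text{(step 1: least squares)} \\
\beta_m &= \arg\min_{\beta} \sum_{i=1}^{N}
  \Psi\bigl(y_i,\; F_{m-1}(x_i) + \beta\, h(x_i; a_m)\bigr)
  && \text{(step 2: line search)}
\end{align*}
```

For the squared-error loss Ψ(y, F) = (y − F)²/2, the pseudoresidual reduces to the ordinary residual y_i − F_{m−1}(x_i), so each new base learner is simply fitted to what the current model leaves unexplained.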

Relative Importance of Factors.
When estimating the coefficients of the independent variables in the model, it is difficult to rank them. Moreover, multicollinearity frequently causes interactions between variables in the model, and autocorrelation tends to cause errors. This paper conducts relative weight analysis to solve these problems by ranking each independent variable according to its importance to the fit of the model. It also helps to clarify the multicollinearity between variables [70]. The symbol R_i (where i = 1, 2, ..., n) refers to the reliability set of the influencing factors of the entire traffic accident. R(R_1, R_2, ..., R_n) is a first-order polynomial in the reliability of the i-th influencing factor. If taking the partial derivative with respect to R_i (where i = 1, 2, ..., n), the following equation is obtained: Then, I_i in equation (8) is the difference between the reliability of the entire set of influencing factors obtained by taking the maximum value 1 and the minimum value 0 in R(R_1, R_2, ..., R_n) for the influencing factor i, ceteris paribus. I_i is the maximum degree of influence of i (where i = 1, 2, ..., n) on the reliability of the set. To compare the relative importance of each factor in the factor set, it is assumed that the reliability of each factor in I_i (where i = 1, 2, ..., n) is r; thus, the weighting expression of the relative importance of each factor in the reliability of the factor set is as follows: Equation (9) gives the relative importance of each influencing factor in the reliability of the factor set under different reliability conditions. This method of measuring has been applied and discussed by many scholars [71,72]. They also point out its controversial aspects, such as large instability, inability to reflect positive or negative correlations, and unclear quantifiers.

Partial Dependence.
Partial dependence varies the value of the target feature while holding the other variables fixed and observes how the fitted result of the model changes. The idea is to capture the marginal effect of variables on the predictions of machine learning models [67]. The estimation method of the partial function is
\[ \hat{f}_{x_S}(x_S) = \frac{1}{n} \sum_{i=1}^{n} \hat{f}\bigl(x_S, x_C^{(i)}\bigr) \]
where x_S is the feature drawn in the partial dependence graph, while x_C^{(i)} is the actual value, in instance i, of the features other than the selected variable. These two types of features together constitute the feature space x. The assumption of partial dependence is that the features in C are not correlated with the features in S. n refers to the number of instances.
This function represents the effect of the selected explanatory variable and can be used to explain the "black box" model of GBM [73,74]. To a certain extent, partial dependence resolves the issue that the relative importance indicator cannot reflect whether a relationship is positive or negative. In general, the direction of the partial dependence plot reflects the direction of correlation between a variable and the outcome, whether it is positive or negative. Compared with the earlier GBM models, which could only plot the importance of variables in a ranked bar chart, the newer GBM model has added the function of partial dependence plots. The advantage of this method is that it is intuitive, easy to operate, and can explain causality; however, it does not always show the complete distribution of features, and it assumes that the calculated variables are independent of other variables [62].
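The averaging step behind a partial dependence curve can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the routine used in this study: `partial_dependence` is a hypothetical helper, `toy_model` stands in for a fitted GBM, and the feature names and values are invented.

```python
def partial_dependence(model, rows, feature, grid):
    """Average prediction over all rows with `feature` forced to each grid value."""
    curve = []
    for value in grid:
        total = 0.0
        for row in rows:
            modified = dict(row)       # keep the other features at observed values
            modified[feature] = value  # overwrite only the selected feature
            total += model(modified)
        curve.append(total / len(rows))
    return curve


# Hypothetical stand-in for a fitted model's prediction function.
def toy_model(row):
    return 2.0 * row["traffic_lights"] - 0.5 * row["green_area"]


# Hypothetical observations for three spatial units.
rows = [
    {"traffic_lights": 1, "green_area": 4},
    {"traffic_lights": 3, "green_area": 2},
    {"traffic_lights": 5, "green_area": 0},
]

curve = partial_dependence(toy_model, rows, "traffic_lights", [0, 1, 2])
print(curve)  # [-1.0, 1.0, 3.0]
```

The rising curve indicates a positive relationship between the selected feature and the prediction, which is the directional information that a relative importance ranking alone does not provide.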

Study Area and Data Source.
This study selected Suzhou Industrial Park (SIP) as a study area. Established in 1992 and located in the eastern part of Suzhou City, SIP is adjacent to Kunshan City and contains both Jinji Lake and Dushu Lake.
It is located in the east of the Taihu Plain in the Yangtze River Delta. The administrative region covers an area of 278 square kilometers, with a registered population of 0.576 million [75]. SIP has a multilayered transportation system with a dense network of highways, national and provincial highways, railways, waterways, and other transportation networks. In particular, the SIP transportation system is unique, as it traverses numerous waterways, including lakes and rivers. Suzhou Industrial Park is an important cooperation project between the governments of China and Singapore. It draws on the successful experience of advanced countries and regions in its development and management. In the functional land of the industrial park, the central business district is developed around Jinji Lake as the center of the SIP. Within the 80 square-kilometer boundaries of the "China-Singapore Cooperation Zone," four major functions such as business, science and technology innovation, tourism and vacation, high-end manufacturing, and international trade are included. The land use comprehensively covers various types of land use and is relatively conducive to the coordination with transportation compared to other cities in China.
This research uses the SIP's traffic accident data for 2016, with a record of 58,315 traffic accidents, which was obtained from the SIP traffic police bureau. The spatial join function in ArcMap 10.5 was used to count the accident points in each TAZ unit to obtain the accident frequency distribution map (see Figure 2). The accidents are categorized into six levels based on the frequency of occurrence, and the six levels are distinguished by a color gradient. The map shows the following: (1) the degree of aggregation of the same level is low; (2) each level, including the normal peak area of accidents, presents a relatively discrete distribution; and (3) accident peak areas occur in the dense TAZ areas, which are distributed in the north and southwest, respectively. In the two areas, the number of accidents in the adjacent TAZs differed significantly. The following data of SIP are used in this study: traffic accident data, different types of land use data (e.g., residential land and educational land), points of interest (POI) (e.g., shopping and leisure places and financial outlets), and road facility data (e.g., traffic lights and intersections). Since the accident data obtained are for 2016, the data for land uses and road facilities are also selected for the same period in order to maintain consistency of the study and to explore the causes of accidents more accurately. Therefore, the study period is fixed at 2016.

Two Scales of the Analysis Unit.
The analysis in this research involves two spatial units: the local level based on TAZ data and the nearest neighborhood level based on Thiessen polygons (see Figure 3). Regarding the local level scale, TAZs within the SIP are selected as a spatial unit of analysis, and the number or density of various land use types within the TAZ regions is calculated. The TAZs with smaller areas are deleted because they create outliers and distort the data distribution of the independent variables. In terms of the nearest neighborhood scale, Thiessen polygons were created to avoid the potential analytical errors caused by overlapping areas that often occur in buffers. Each TAZ's centroid is used as the central point to create Thiessen polygons within the SIP area. These two scales are similar in general spatial location, as they share the same center points and the geographic structure of the analysis unit. However, while TAZ has been used extensively in transportation-related analysis, the neighborhood level has rarely been tested because it is not directly tied to regional spatial systems. Therefore, in this research, the nearest neighborhood scale with Thiessen polygons as an analysis boundary is worth exploring. Table 1 shows the basic descriptive statistics of the spatial units at both scales.
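A point lies inside the Thiessen polygon of whichever centroid is nearest to it, so accident counts at the nearest neighborhood level can be sketched without constructing the polygons explicitly. The Python example below is a minimal illustration and not the ArcGIS workflow used in this study; it assumes projected planar coordinates, and the centroid names, coordinates, and helper functions are hypothetical.

```python
import math
from collections import Counter


def nearest_centroid(point, centroids):
    """Name of the centroid closest to `point` (its Thiessen cell)."""
    return min(centroids, key=lambda name: math.dist(point, centroids[name]))


def count_by_cell(points, centroids):
    """Count points per Thiessen cell; every point falls in exactly one cell."""
    counts = Counter({name: 0 for name in centroids})
    for p in points:
        counts[nearest_centroid(p, centroids)] += 1
    return dict(counts)


# Hypothetical TAZ centroids and accident locations (planar units).
centroids = {"A": (0.0, 0.0), "B": (10.0, 0.0), "C": (5.0, 8.0)}
accidents = [(1.0, 1.0), (9.0, 1.0), (5.0, 7.0), (4.0, 1.0), (6.0, 6.0)]

counts = count_by_cell(accidents, centroids)
print(counts)  # {'A': 2, 'B': 1, 'C': 2}
```

Because each accident is assigned to exactly one cell, the counts sum to the number of accidents, which reflects the absence of the gaps and overlaps that buffers can produce.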

Variables.
The variables used in this research are structured and explained in Table 2. The number of traffic accidents occurring at each of the two geographical levels comprises the dependent variables. The explanatory variables, in 12 categories, include transportation facilities such as "Intersection," "Trafficlight," and "Busstation"; residential living facilities such as "EduInstitu," "Financial," "Healthcare," and "Government"; and land use mix "D1" and "D2." With particular reference to the work of Yue et al. [15], this research uses neighborhood vibrancy to measure the degree of land mix, using Hill numbers to represent the multidimensional POI mixed use. D1 and D2 are calculated from the exponential of the Shannon entropy and the Simpson index, respectively, and measure the diversity of residential, office, and commercial sites. Moreover, in order to control for the density level of the POI points of each land type, the classification and regression tree (CART) method is used to divide each land-type variable into segments representing different density levels, which are then converted into dummy variables. After integrating the data from both scales, all density dummy variables are binary, for high density and low density. Low-density variables are used as references. Both the real variables representing the land type and the dummy variables controlling the density levels are included in the model of this study.
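The two land use mix measures can be sketched as follows. The sketch assumes the standard Hill numbers of order 1 (the exponential of the Shannon entropy) and order 2 (the inverse of the Simpson concentration index). The exact formulas of Yue et al. [15] may differ in detail, and the function name and POI counts are hypothetical.

```python
import math

def hill_numbers(counts):
    """Return (D1, D2) for a list of POI counts per land use type.

    D1 = exp(Shannon entropy); D2 = 1 / sum(p_i^2) (inverse Simpson).
    Both equal the number of types when all types are equally common.
    """
    total = sum(counts)
    if total <= 0:
        raise ValueError("at least one POI is required")
    shares = [c / total for c in counts if c > 0]
    d1 = math.exp(-sum(p * math.log(p) for p in shares))
    d2 = 1.0 / sum(p * p for p in shares)
    return d1, d2

# Hypothetical counts of residential, office, and commercial POIs.
print(hill_numbers([10, 10, 10]))  # evenly mixed: both close to 3
print(hill_numbers([28, 1, 1]))    # dominated by one type: both close to 1
```

Under these definitions D2 weights dominant types more heavily than D1, so D2 is never larger than D1 for the same counts.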

Model Building.
The GBM model requires several parameters to be set, including distribution, n.trees, interaction.depth, weights, n.minobsinnode, shrinkage, train.fraction, cv.folds, keep.data, class.stratify.cv, and n.cores, to name a few. Some of these parameters are set selectively, such as weights and n.minobsinnode, and some are left at their default values rather than set purposely, such as n.trees (default is 100), interaction.depth (default is 1), and bag.fraction (default is 0.5). The most important and most frequently tuned parameters are shrinkage, n.trees, and cv.folds. Shrinkage is the learning rate: reducing it requires more iterations, and training takes longer for larger data [67]. Empirical results have shown that smaller shrinkage coefficients (v ≤ 0.1) yield better generalization errors. The n.trees parameter is generally tuned together with shrinkage. Lowering the shrinkage and adding more trees can improve the generalization ability of the model and avoid overfitting [76]. The cv.folds parameter governs how the model is judged. With cv.folds set to n, the model goes through a total of n experiments, and the accuracy index obtained after each test is used as an indicator to judge the merits of the algorithm.
Prior to adjusting parameters, important parameters are set to default values because initial default values help determine other parameters. The steps of the adjustment are as follows. Firstly, based on the accident distribution data, the Poisson distribution is confirmed for the model. Subsequently, the learning rate is initially taken to be the default value of 0.1. There is some evidence to suggest that a shrinkage rate of 0.001 brings relatively low deviation [12,77], so this study sets the shrinkage rate to 0.001. The CV method is used to test different parameters, with Poisson deviance as the loss. The number of CV folds is set to 10, according to the characteristics of the data in this case. Interaction.depth is an integer giving the maximum depth of each tree [67]. This parameter and n.minobsinnode are trade-offs that together determine the performance of the model. If the two parameters are too large, the model easily overfits; if they are too small, it underfits. With the other parameters fixed, interaction.depth is set to 15, the value that gives the lowest Poisson deviance. In summary, in order to verify the validity and stability of the predictive model, following a series of adjustments and experimental comparisons with other similarity indicators, the final model parameters are as shown in Table 3.
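The selection rule described above (keep the candidate value with the lowest cross-validated Poisson deviance) can be sketched as below. The deviance follows the standard statistical definition, which may differ from the value reported by the "gbm" package by a constant or a scaling factor. The function names and the fold values are hypothetical and are not results of this study.

```python
import math

def poisson_deviance(observed, predicted):
    """Mean Poisson deviance: 2 * mean(y * log(y / mu) - (y - mu)).

    The term y * log(y / mu) is taken as 0 when y == 0.
    """
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for y, mu in zip(observed, predicted):
        if mu <= 0:
            raise ValueError("predicted means must be positive")
        log_term = y * math.log(y / mu) if y > 0 else 0.0
        total += log_term - (y - mu)
    return 2.0 * total / len(observed)

def pick_lowest_cv_deviance(cv_results):
    """Return the candidate whose mean per-fold deviance is lowest.

    `cv_results` maps a candidate parameter value to its fold deviances.
    """
    return min(cv_results,
               key=lambda k: sum(cv_results[k]) / len(cv_results[k]))

# Hypothetical per-fold deviances for three interaction.depth candidates.
cv_results = {5: [1.20, 1.10], 15: [0.90, 1.00], 20: [1.00, 1.05]}
print(pick_lowest_cv_deviance(cv_results))  # 15
```

In practice each list would hold one deviance per fold (10 in this study), computed on the held-out fold only.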

Relative Importance of Explanatory Variables.
Relative importance indicates the role of a feature in predicting the target response and can be used to quantify the contribution of each explanatory variable to the model. It is determined by the frequency with which a feature is used at the split points of the trees. The higher the frequency of use is, the higher the importance of the variable is [67]. The response of the features (independent variables) at the two scales is predicted according to the selected model parameters. Figure 4 illustrates that the relationship between Poisson deviance and iteration can be used to estimate the effect of the model parameters: both test error and train error decrease as the iterations increase. The model does not appear to display the problem of overfitting. Whether the data are underfit, signifying that the model's learning ability is insufficient, must also be judged by whether or not deviations of abnormal values occur. Figure 5 lists the contribution of all variables and their ranking. Under the existing parameters of the model, all variables have a nonzero contribution, and the distribution has a long tail.
This means that, under the identified model and existing data, all land uses and POIs affect the distribution of the final accident frequency. This also indirectly supports the validity of the model parameters.
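The frequency-based notion of relative importance described above can be sketched as a count of how often each variable is chosen at a split, rescaled so that the values sum to 100. This follows the description in the text only. Implementations such as the "gbm" package weight each split by the improvement in the loss it produces rather than counting splits equally, so the numbers in this study would not be reproduced by this simplification. The function name and split list are hypothetical.

```python
from collections import Counter

def relative_importance(split_features):
    """Rank variables by split frequency, scaled to sum to 100."""
    counts = Counter(split_features)
    total = sum(counts.values())
    ranked = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    return [(name, 100.0 * n / total) for name, n in ranked]

# Hypothetical list of the variable used at each split across all trees.
splits = ["Trafficlight"] * 5 + ["EduInstitu"] * 3 + ["D1"] * 2
for name, share in relative_importance(splits):
    print(f"{name}: {share:.1f}%")
```

A variable that is never selected at any split receives no entry, which corresponds to a contribution of zero.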
A total value of 100 is distributed across the variables in each model. The relative influence plots (see Figure 4) show that the real variables representing different land uses and land mixes are ranked higher, while the dummy variables controlling density levels, none of which exceeds 1%, are ranked lower. This is the case for both models, which confirms that a high density of land use has little effect on the overall results of the model. The density dummy variables help to reconcile the completeness of the model and the persistence of the variables, but since their importance values are too low relative to the real variables, they are not analyzed more extensively here. The two-scale models (see Figure 4) have similarities and differences in the order of relative importance. With relative contributions of 13.16% and 12.84%, "Greenspace.Park" and "Trafficlight" are the most important variables at the local and nearest neighborhood scales, respectively. As demonstrated in other studies [6,7], the factors most influential to accidents are road facilities such as traffic lights, road width, and distance to intersections. However, greenspace is less frequently identified as a decisive cause of traffic accidents. "Servifacilities" and "EduInstitu" follow as the second most important variables of the two models at 10.14% and 10.96%, respectively. The two models reveal an important commonality: the three variables "Intersection," "Shopping.Leisure," and "Government" have a very steady order of importance in the two models. This is plausible, as the data sources of the two models are the same. This alignment is consistent with earlier results [78,79] in which several types of land use are ranked higher in accident studies.
The "Greenspace&Park" variable is considered an aberration: it contributes 13.16% in the local level model, where it ranks first, but ranks last in the nearest neighborhood level model with only 3.81%. The gap between the two is broad. The "Residential" and "D2" variables are also identified as aberrations. These two variables are ranked very differently at the two scales.
The contribution disparity is about 16%. "Greenspace&Park," "Residential," and "D2" are the three variables identified as aberrations because their rank order differs between the two scales. Other studies that also analyze relative importance (Ding et al. [12] and Saha et al. [79]) place "D1" and "Greenspace&Park" in the middle and rear positions, respectively, in line with the results at the nearest neighbor level in this study. However, the importance of the "Residential" variable is ranked lower than "D1" and "Greenspace&Park." This suggests that the model fits more accurately at the nearest neighbor level for the nonpoint land types. The variables "Servifacilities," "EduInstitu," "Financial," and "Healthcare" are very close in value although disparate in ranking at the two scales. Their values differ by 2% between the two models.
In sum, the degree of explanation of the continuous variables obtained by calculating the density is higher than that of the dummy variables. Besides, under different models, the order of importance of factors is not exactly the same, and three aberrations appear.
This shows that the distinction in the locational levels allows the model to establish utterly different utility functions during the construction and fitting phases. However, the aberrations may also be caused in part by autocorrelation and by underfitting of the parameters. Some of the nonpoint land types in this study, such as the "D1" and "Greenspace&Park" variables, ranked consistently with other studies. The degree of interpretation of the two geographical scale models is slightly inconsistent. "Servifacilities," "EduInstitu," "Financial," and "Healthcare" contribute a higher proportion in the local-scale model than in the nearest neighborhood scale model, indicating that the established parameters provide a more accurate description of the local-scale model. These results suggest that the performance of the variables at different scales and in specific land-use types may be explored further. It is also important to note that dummy variables need to be introduced with care when applying such approaches because of their low degree of interpretation.

Change in Influencing Factors of Real Variables due to Increasing n.trees.
As described in the previous section, the number of trees has the same effect as the number of iterations. As a rule of thumb, the number of iterative regression trees is generally set to a larger number because the "gbm.perf" function in the "gbm" package can estimate the suitable number of iterations for prediction after the model is trained [80]. An increase in tree complexity would help improve the prediction bias and reduce the learning rate. In this study, n.trees is not specially set when the model is determined, and the default value of 100 is employed. Nevertheless, when the model is tested with a large number of trees, the error of each variable is large. In this circumstance, the number of trees set by Ding et al. [12] is used as a reference. In their experiment, when the tree complexity is low, the explanatory degree of the variable is low, but the explanatory degree is stable as the tree complexity increases to eight. Thus, this study tests the changes in interpretation as n.trees goes from 1 to 30 and compares the two levels (see Figures 6 and 7). The following compares all variables representing land types and land mix, both between variables and across scales.

Comparison between Variables.
After adjusting the parameters, the line of the explanatory degree of each variable at each point is compared with its value under the default parameter of the model. As the complexity of the tree increases, the degree of interpretation of all variables rises and falls around the value at the default of 100 over the range of trees from 1 to 30. There is no complete detachment, although no variable becomes steady at any number of trees from 1 to 30. Variables have different amplitudes. Using each variable's sum of squares for error (SSE) to measure the magnitude of the fluctuation (see Table 4), the SSE values are ordered from small to large: Government < Greenspace.Park < Shopping.Leisure < Residential < D2 < Busstation < Intersection < Financial < Trafficlight < D1 < Servifacilities < EduInstitu < Healthcare. The statistical results of the SSE (Table 4) show that the SSE of all variables is small, which indicates that the degree of interpretation of all variables does not fluctuate much as the number of trees increases. The parameters of the model therefore fit the data of this study. Except for "Servifacilities" and "EduInstitu," which reached 5%, the peaks and bottoms of most variables differ by only about 3%. Therefore, the SSE value reflects the smoothness of the curves observed in Figure 6. However, the ranking of SSE also shows some unexpected results: the transit facility variables "Busstation" and "Trafficlight" have moderate SSE values, while "Greenspace&Park," "Shopping&Leisure," "Residential," "Servifacilities," "EduInstitu," and "Healthcare," which all belong to the dense POI points, are distributed at the larger and smaller ends of the SSE. In the urban area, "Busstation" and "Trafficlight" are factors with a fixed position and an accurate number and demonstrate high stability. Even if the model parameters are changed, the impacts on such variables are small.
However, "Greenspace&Park," "Shopping&Leisure," "Residential," "Servifacilities," "EduInstitu," and "Healthcare," which are numerous, high in level, and widely distributed, are similar facilities that reflect the daily life of residents. When the parameters involving the complexity of the tree are adjusted, the final fitting effects for such factors are unstable, producing both small and large fluctuations.
In general, the complexity of the tree can identify and detect the utility of various variables in the model to a certain extent, which helps to improve the accuracy of the model fitting. However, in this study, the fitting of the model is not accurate enough as the number of iterations of the test is low. Although the interpretation of each variable still fluctuates within a reasonable range of the variable, the results of the model do not reach a consistent state under a certain number of iterations.
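The SSE measure used in this subsection can be sketched as the sum of squared differences between a variable's degree of interpretation at each tested number of trees and a reference value. The text does not state the reference precisely, so the sketch assumes it is the variable's value under the default setting. The function names and the curves are hypothetical and are not the values in Table 4.

```python
def sse_around_reference(curve, reference):
    """Sum of squared deviations of `curve` from `reference`."""
    return sum((value - reference) ** 2 for value in curve)

def rank_by_sse(curves, references):
    """Return variable names ordered from the smallest SSE to the largest."""
    scores = {name: sse_around_reference(curve, references[name])
              for name, curve in curves.items()}
    return sorted(scores, key=scores.get)

# Hypothetical degrees of interpretation (%) at successive numbers of trees.
curves = {
    "Government": [4.0, 4.2, 3.9, 4.1],
    "EduInstitu": [9.0, 12.5, 10.0, 13.0],
}
references = {"Government": 4.0, "EduInstitu": 11.0}
print(rank_by_sse(curves, references))  # ['Government', 'EduInstitu']
```

A small SSE corresponds to a curve that stays close to its reference, which is the sense in which the text reads SSE as a measure of smoothness.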

Comparison of Two Geographical Levels.
With identical statistical methods and models, the partial dependence plots show that the variables' likelihood to occur differs between the two levels. Suppose the degree of interpretation of the explanatory variables is adopted to represent the capacity of the locational scales. In that case, the situations fall into three categories, as shown in Figure 6: (1) For the variables of "Residential" and "Greenspace&Park," the blue curve representing the local level is above the orange curve of the neighborhood level, indicating that the capacity of the local level is higher than that of the nearest neighborhood level for these two variables. It can be inferred that these two variables are more responsive to the TAZ region. (2) Inversely, for "Trafficlight," "Busstation," "Healthcare," and "D2," the capacity of the nearest neighborhood level is higher than that of the local level. Notably, the neighborhood level is aggregated so that these four variables exceed the others. (3) For the variables of "Shopping&Leisure," "Financial," "Government," "Intersection," "Servifacilities," "D1," and "EduInstitu," the two lines overlap or interlace, so it is hard to judge the suitability of either scale.
The "scale effect" in this research highlights several critical arguments. For instance, the research findings indicate a severe scale dependence in the model evaluation of traffic accidents. The analysis results at the local level bounded by TAZ must be interpreted with caution because only two variables, "Residential" and "Greenspace&Park," demonstrate a higher importance at the local level scale. The other variables either show higher importance at the nearest neighborhood level or are mixed between the two scales. "Residential" and "Greenspace&Park" indicate the places where residents live and spend the most time daily. The mobility of pedestrians and vehicles is high at fixed times, such as during peak commuting hours and weekends. In contrast, among the variables with a high degree of explanation at the nearest neighborhood level, "Trafficlight" and "Busstation" are more about the infrastructure and public services invested in and managed by the government. The spatially homogeneous character of the Thiessen polygon can be more evident in such sites. "Financial" and "Shopping&Leisure" have more attributes that are greatly affected by the surrounding environment.
There may be dynamic changes in location. Hence, they show a kind of cross-complication between the two scales. The spatial heterogeneity and sensitivity in such sites will be reduced. Indeed, the accident effects of the two variables are not evidently different between the scales. Therefore, the nearest neighborhood level is more powerful than the local level due to its comprehensive description of variables as a byproduct of its larger sample sizes.
The variation in the density dummy variables as the number of trees increases is shown in Figure 7, wherein the density dummy variables exhibit two distinct characteristics. The first is that their influencing factor lines tend to fluctuate more sharply relative to the real variables. A possible explanation is that their maximum degree of interpretation does not exceed 1%, yet their minimum value often reaches 0 in individual trees. Therefore, the visualization displays more dramatic fluctuations. The second characteristic is that the influencing factor line of some variables coincides with the x-axis and is constantly equal to 0. This means that the variables do not show any explanatory power in the model. "Greenspace&Park.High" and "Residential.High" show this at the nearest neighborhood scale; this also occurs in "EduInstitu.High," "Financial.High," "Greenspace&Park.High," and "Servifacilities.High" at the local scale. In this case, even though some explanatory power exists for these kinds of variables at the other scale, it is impossible to compare the results of the two scales meaningfully.
Relative to the real variables, the density dummy variables consistently offer lower explanatory power as the number of trees grows and vary irregularly, with limited practical utility for the model.

Partial Dependence Effect of Variables.
The partial dependence plot is used below to describe the relationship between land use, road variables, and accident frequency. A partial dependence plot summarizes how the predicted response changes with one variable while the remaining variables are kept under the same conditions. The partial dependence plot cannot evaluate the statistical model directly, but it shows how the variation in the independent variables affects the process of model fitting [67]. Among all the variables, the partial dependence plots of the real variables of road structure facilities and land mixture are shown in Figure 8.
Road structure facilities in Figure 8 include three variables: "Trafficlight," "Intersection," and "Busstation." As discussed earlier, road structure facilities are usually defined as influential factors correlated with traffic accidents. Two variables fit the accident model to a stable stage when the coefficient value is small, even under different levels of scale: when there are more than 25 traffic lights per square kilometer, their influence on traffic accidents tends to be stable; similarly, the degree of interpretation does not change beyond 15 bus stations per square kilometer. A particular case was observed for intersections, where the coefficient reaches a considerable value before the traffic accident effect stabilizes. The data reported here appear to support that this occurs around the value of 400 at both levels.
However, as the density of transportation facilities increases, specific differences arise in the trend at each level. Where traffic lights are sparse, the number of traffic accidents tends to increase sharply. The number of accidents declines briefly after 10 traffic lights per square kilometer and then settles at a value after 20 traffic lights per square kilometer, at both the local and the nearest neighborhood levels. The "Busstation" variable fluctuates several times at densities of less than 15 bus stops per square kilometer and stabilizes after that. The "Intersection" variable, on the other hand, settles at densities of around 400 after initially jolting upward and falling immediately after that. For both the "Busstation" and "Intersection" variables, the local and nearest neighborhood levels reveal very similar fluctuation ranges, but the fluctuation at the nearest neighborhood level is somewhat more drastic.
This indicates that the nearest neighborhood scale is more suited for capturing the subtle effects of urban transport facility density on accident occurrences. A similar report on traffic accidents in Seattle shows a rising trend for both 3-way and 4-way intersections [78], which is not evident in this study's findings. Taken together, these results suggest that the impact of intersections on traffic safety varies according to the complexity of their surrounding region. Nevertheless, for the "Trafficlight" variable, the two scales selected in this study produce essentially similar effects, while the result at the nearest neighborhood level better demonstrates the complex variation in bus stations and intersections.
For the two variables of mixed land use shown in Figure 8, both "D1" and "D2" stabilize at 4.5 on the Y-axis after rising along the X-axis. However, the curves show that "D2" reaches stability relatively earlier than "D1." This is because the D1 formula, which is the exponential of the Shannon entropy, only measures the weight of each land use type. Meanwhile, D2 considers the richness of the land use and the relative abundances of the POI [15]. The data used in the study confirm the diversity among the three land use types (residential, office, and commercial land); therefore, the tendency of D2 is more secure than that of D1. This result is consistent with Chen and Zhou's work [78] on crash frequency, which shows a positive correlation between land use mix and accidents. It also aligns with the increasingly stable trend of partial dependence in Ding et al.'s study [12] on traffic accidents in Seattle.
Figure 7: Variation in density dummy variables and interpretation/comparison of two scales as n.trees changes from 1 to 30.
The other variables represent the 8 land uses shown in Figures 9 and 10. As a whole, all of these variables, except "Residential" and "EduInstitu," show a significant positive correlation with accidents. The partial dependence diagrams of these variables show an overall increasing and then stable trend after some fluctuations. The two variables of "Residential" and "EduInstitu" are negatively correlated at the local level although positively correlated at the nearest neighborhood scale. This suggests that several land types, "Healthcare," "Greenspace&Park," "Government," "Financial," "Servifacilities," and "Shopping&Leisure," may overall lead to an increase in the number of accidents at lower densities, while the number of accidents will not continue to increase after reaching a certain density. Thus, variables will likely function strategically at the turning point in the graphs.
In Figures 9 and 10, other than "Residential" and "EduInstitu," the six variables show consistent evolutionary trends at both the local and nearest neighborhood levels. This suggests that the partial dependence diagrams fully explain the role of each variable in the model, with consistent influence across the two scales. The differences between the variables of "Healthcare," "Shopping&Leisure," and "Government" are highly uniform across the two scales. They show a slight decrease at the local level compared with the nearest neighborhood scale and reach stability after ranges of "0-20," "0-60," and "0-30," respectively. For the "Financial" and "Greenspace&Park" variables, the nearest neighborhood scale is more revealing of their subtle changes before a turning point. This is particularly evident for the "Greenspace&Park" variable, as there is a rapid decrease in traffic accidents when the density of "Greenspace&Park" is around 10 per square kilometer. Therefore, to achieve safer traffic conditions in the area, the land use of greenspace and parks can complement the role of transport facilities. It is worth noting that the "Financial" and "Government" variables demonstrate opposite effects at the two levels in Figure 9. Although their stationary points are similar and their corresponding likelihoods also match, they show a steep decline at the local level and an abrupt rise at the neighborhood level.
For the "Residential" and "EduInstitu" variables, the partial dependence plot trend shows a negative correlation at the local level scale and a positive correlation at the nearest neighborhood level before reaching sustained stability. It is likely that the variation in the number of accidents is uncertain at both scales and may increase or decrease with a sudden boom in residential and educational institutions. However, the descriptions of the relation between accidents and residential and educational units vary in the literature and sometimes conflict with each other. It has commonly been assumed that there is a positive correlation between the number of residential units (higher population density) and pedestrian crashes [10]. Other empirical cases, in contrast, demonstrate the direct opposite [6]. Similarly, some studies suggest that students are more likely to be at risk in areas with high school density due to irregular traffic crossing behaviors and low safety awareness. Despite this, Ukkusuri [4] presented contrasting experimental results. Nevertheless, the trend turning point in this study's partial dependence diagrams reveals that the overall number of accidents will no longer increase after the number of educational institutions reaches 70 per square kilometer and the number of residential areas reaches 50.
Concerning the partial dependence plots of the density dummy variables, all the variables show linear patterns, as demonstrated in Figures 11-13. The three linear relationships are explained as follows:
(1) The plots of "Busstation.High," "Government.High," "Financial.High," "Servifacilities.High," and "EduInstitu.High" at the local level and "Greenspace&Park.High," "Trafficlight.High," "Government.High," and "Residential.High" at the nearest neighborhood level are lines parallel to the X-axis.
(2) However, "Healthcare.High" at the local level, as well as "Busstation.High" and "Financial.High" at the nearest neighborhood level, present parallel lines with a high front and a low back, connected in the middle by a vertical line.
However, these three linearities neither describe a positive or negative correlation between high-density land uses and accidents nor present fluctuating intervals and meaningful turning points as the x-axis changes. Practical guidance can hardly be drawn from the partial dependence plots of density dummy variables.
Figure 11: Partial dependence plots of density dummy variables of road structure facilities on accidents.

Conclusion
Road safety is critical to the health and wellbeing of people. To this end, a large and growing body of literature has investigated the leading causes and mechanisms of traffic accidents. Most research on traffic accidents has emphasized a complicated relationship between land use and urban transportation. In this study of the Suzhou Industrial Park (SIP), accident data provided by the SIP traffic police bureau were used to build a GBM machine learning model to identify the relationship between traffic accidents and land use. The research process includes the following:
(1) Processing the traffic accident data as well as land use and related facilities data.
(2) Establishing a GBM model of the frequency of traffic accidents, followed by determining the model parameters.
(3) Analyzing each land type variable's contribution to accident frequency and comparing these with the explanatory degree of the variables as the number of iterations in the model grows.
(4) Discussing the estimated impacts of each variable on accidents intuitively according to the partial dependence plots.
The study has highlighted factors affecting accidents, geographical scale exploration, and model operation. The GBM analysis was conducted at the local and neighborhood scales to explore the overall validity of the geographical levels and the model fitting. This also included the effect of variables of transportation facilities, land use, land mix, and density on the accident outcomes. The thirteen variables, including road facilities, land types, and some POI facilities, were examined at two spatial scales bounded by TAZ units (local) and Thiessen polygons (nearest neighborhood). The results show that they all affected accident occurrence at both scales, among which the more critical factors include categories of residential land, consumption and leisure land, and green parks. However, the experimental results at the two scales reflected vital differences and similarities at various experiment points. Among the rankings of relative importance, "Trafficlight," "EduInstitu," "Healthcare," "Intersection," and "Servifacilities" all showed a degree of interpretation from 7% to 13% and held crucial places in the rankings at both scales. However, "Greenspace&Park," "Residential," and "D2" differed significantly and showed abnormal results. When adjusting the complexity of the tree, some variables such as "Residential" and "Greenspace&Park" appeared to be more influential at the local level, while the nearest neighborhood level showed more activity for the variables of "Trafficlight," "Busstation," "Healthcare," and "D2."
In the partial dependence plots, the variables of "Residential" and "EduInstitu" showed differing accident frequency trends at the two scales. These results may be due to the fact that the spatial distribution of traffic accidents is uneven in SIP. Accident rates varied widely across TAZ areas. The large TAZ regions in the northern part of the study area and the dense TAZ regions in the southwest showed the normal peak situation of the accidents, and the location distribution was scattered. The local level has been seen as more suitable for measuring variables where pedestrians and vehicles have fixed mobility periods and moderate flows, such as residential areas and green parks. On the other hand, the nearest neighborhood level could be applied to a small number of variables related to public service facilities at fixed locations, such as traffic lights and bus stops. For other land uses such as financial networks, shopping, and leisure, where the sample size was extensive with a complex hierarchy, the scale could be modified according to the overall land use requirements. Therefore, this research suggests that it is worth considering applying the nearest neighborhood scale with the boundary of Thiessen polygons in addition to the commonly used TAZ areas when examining traffic accidents or even traffic safety research for municipal engineering projects. When planning for a smaller geographical area, these different scale ranges might help confirm the settings and enrich the understanding of the study area's spatial structure to improve overall road safety.
Research on accident models has increasingly adopted advanced techniques. GBM is a machine learning model whose use has grown rapidly in recent years. In particular, it allows existing models to be validated by ranking the importance of the coefficients and the variation in model fit. Building on the multiple variables introduced in past studies, this research ranked the explanatory variables, tested the fit of each variable by changing the parameter settings and examining the partial dependence plots, and built an application model suited to the current land use and road situation. Since a growing number of studies analyze traffic accidents in different regions, the findings of this study could be compared with theirs to check for consistency and deviation. In this way, the analysis results of this study could be validated against others of its kind. The results of GBM included the coefficients of the variables under existing parameter settings. GBM was useful for this traffic accident study and positively contributed to understanding the relationship between urban form and traffic accidents. It is suggested that policymakers pay further attention to the benefits of using advanced methods over traditional means in accident research and understand the cause of any discrepancy between them, in order to find the most efficient method in practice.
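The partial dependence plots referred to above can be understood through a short sketch: for each candidate value of one variable, that value is imposed on every observation, the model predicts the outcome, and the predictions are averaged. The function toy_predict below is a hypothetical stand-in for a fitted GBM, and the data are invented for illustration.

```python
def partial_dependence(predict, X, feature, grid):
    """Average model prediction when `feature` is fixed at each grid value
    while all other features keep their observed values."""
    curve = []
    for value in grid:
        total = 0.0
        for row in X:
            modified = list(row)       # copy so the data are not altered
            modified[feature] = value  # impose the grid value on this row
            total += predict(modified)
        curve.append(total / len(X))
    return curve

def toy_predict(row):
    # Hypothetical stand-in for a fitted model's prediction function.
    return 2.0 * row[0] + row[1]

# Invented observations with two features.
X = [[0.0, 1.0], [1.0, 3.0], [2.0, 5.0]]
pd_curve = partial_dependence(toy_predict, X, feature=0, grid=[0.0, 1.0, 2.0])
print(pd_curve)  # [3.0, 5.0, 7.0]
```

Plotting the grid against the resulting curve gives the partial dependence plot for that variable. Turning points in the curve mark values at which the predicted accident frequency changes direction.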
This study has several limitations that should be taken into account in future studies. First, this study confirmed that the GBM is useful only when applied to regression and classification problems with a sufficient number of parameters drawn from existing studies. As with linear models, the coefficients of the variables represent importance and influence on the dependent variable only within a single model, and their values cannot be used as an absolute reference for practical applications. It is conceivable that if this model is applied to an emerging research subject, the reliability of the GBM's results could be somewhat limited because it would not be able to produce absolute results. The application of this technique depends heavily on previous results and experience because determining variables (causality) and selecting the most suitable model parameters would otherwise be difficult. Therefore, this model might be beneficial for judging the relative fit of the identified variables, the relative importance between variables, and the internal interaction of the model parameters. When adjusting the complexity of the tree, the likelihood of variables fluctuated with the change in the number of trees but did not reach a stable value within the scope of the test. To ensure the integrity of the variables and the overall stability of the model, dummy variables that represent high density of land uses were introduced. However, the results of the density dummy variables were not satisfactory. Their relative importance ranking was low. Also, their influence did not change appreciably with the increasing number of trees, and they did not show meaningful fluctuation intervals or turning points in the partial dependence plots. Overall, this model is best used for comparative purposes, and adjusting its parameters might yield a more accurate model.
Second, the GBM occasionally presents accuracy and overfitting issues. The GBM predicts less accurately than some regularization, polynomial regression, and partial regression methods [66]. It is also prone to overfitting because it is relatively unconstrained in operation, which can cause a single decision tree to retain branches (without pruning) until it memorizes the training data [62]. This needs to be handled carefully, according to the size and characteristics of the actual dataset, when adjusting the parameters. As mentioned in the first point, it is worthwhile to explore a range of applicable parameters to ensure the reliability of the model.
Third, the relatively small study area makes the findings less generalizable to other cities or regions of China, especially given the relatively unique layout of SIP, although the exploration of geographical scale is one of the important contributions of this study. In addition to the commonly used local-level TAZ area, this research highlights the significance and usefulness of the nearest neighborhood level drawn from Thiessen polygon zones, which can be used as a scientifically reasonable scale. However, these levels have only been verified for the Suzhou Industrial Park, and the results might not be directly replicable in other regions of China.

Data Availability
The dataset utilized for this study is not publicly available due to the confidentiality agreement with the Suzhou Industrial Park government.

Disclosure
The contents of this study reflect the views of the authors, who are responsible for the facts and the accuracy of the information presented herein. Xi'an Jiaotong-Liverpool University assumes no liability for the contents or uses thereof.

Conflicts of Interest
The authors declare that there are no conflicts of interest.