The Spatial Patterns of Service Facilities Based on Internet Big Data: A Case Study on Chengdu

In the mid-to-late stage of China's urbanization, promoting sustainable urban development and giving full play to urban potential have become a social focus, which makes the study of urban spatial patterns practically significant. Based on Internet data such as map Points of Interest (POIs), this paper studies the spatial distribution pattern and clustering characteristics of the POIs of four categories of service facilities in Chengdu, Sichuan Province (catering services; shopping services; transportation services; and scientific, educational, and cultural services) by means of spatial data mining techniques such as spatial autocorrelation analysis and DBSCAN clustering. Global spatial autocorrelation is used to study the correlation between an index of a certain element and the same index (univariate) or another index (bivariate) of adjacent elements; local spatial autocorrelation is used to identify the spatial clustering or spatial anomaly characteristics of geographical elements. DBSCAN (Density-Based Spatial Clustering of Applications with Noise) is able to detect clusters of any shape without prior knowledge. The final step is to carry out quantitative analysis and reveal the distribution characteristics and coupling effects of the spatial patterns.
According to the results, (1) the spatial distribution of the POIs of all service facilities is significantly polarized: they are concentrated in the old city, the trend of suburbanization is indistinct, and the distribution shows three characteristics, namely, central driving, traffic accessibility, and dependence on population activity; (2) the spatial distribution of the POIs of the four categories of service facilities follows a pattern of "one center, multiple clusters," where "one center" mainly covers the area within the first ring road and part of the region between the first ring road and the third ring road, while "multiple clusters" are mainly distributed in the well-developed areas in the second circle of Chengdu, such as Wenjiang District and Shuangliu District; and (3) there is a significant correlation between any two categories of POIs. Highly mixed multifunctional areas are mainly distributed in the urban center, while the service industry is less aggregated in urban fringe areas, most of which are single-functional or dual-functional regions.


Introduction
At present, China's social economy has entered a period of rapid development, and the process of urbanization has accelerated significantly. China is still in the rapid-development stage, in which the urbanization rate lies between 30% and 70%. By 2030, the urbanization rate is expected to exceed 70%, and more than one billion people will live in cities [1]. The level of urbanization has become a measure of regional economic development and is also key to solving unbalanced and insufficient development [2]. The report of the Nineteenth National Congress of the Communist Party of China emphasized giving full play to the driving ability of cities in effectively promoting urbanization and advancing the coordinated development of large, medium, and small cities and small towns, with urban agglomerations as the main body, to meet people's expectations for a better life [3]. The urban spatial pattern is the expression of the relationship between the material environment, functional activities, and cultural values in the city [4]. It affects the further improvement of the level of urbanization and, in turn, the development capacity of the regional economy [5]. In recent years, the informatization level of urban infrastructure and service facilities has been greatly improved, making it easier to obtain complete spatial data on public services, life services, and the like. By obtaining geospatial data integrated by online map service providers such as Gaode Maps and Baidu Maps and by online life service platforms such as Dianping.com and Meituan, the analysis and exploration of urban space-related issues is gradually becoming a research hotspot in this field [6].
Among these data, Points of Interest (POIs), as an emerging spatial data source, make it possible to explore the overall spatial pattern of a city and, at the same time, to spatially identify and quantitatively study the city center; this is of great significance for finely identifying spatial distribution features within the city [7]. Current research on POI data mainly focuses on the feature identification of various functional spaces in hotspots and the analysis of their dynamic changes [8][9][10], as well as on the interaction between different types of hotspots and the spatial interaction network formed by them. As a national central city and one of the core cities in the Chengdu-Chongqing Double City Economic Zone [11], Chengdu is a strategic highland for the development of western China. For this reason, this research takes Chengdu as its object and uses spatial data mining techniques such as univariate and bivariate spatial autocorrelation and DBSCAN clustering to quantitatively analyze and reveal the distribution characteristics of spatial patterns and their coupling effects, based on big data of service industry facility POIs. In this way, the effects and shortcomings of the development of Chengdu are discussed, with a view to putting forward optimization suggestions on the rational development and utilization of the urban spatial pattern and to providing a decision-making reference for improving residents' quality of life.

Basic Information of Research Area.
The research object of this paper is the downtown area of Chengdu, comprising 11 administrative units: Jinniu District, Chenghua District, Qingyang District, Wuhou District, Jinjiang District, Pidu District, Xindu District, Wenjiang District, Longquanyi District, Shuangliu District, and Qingbaijiang District. Its geographical location is between 102°54′-104°53′E and 30°05′-31°26′N [12]. The area of the research is 695.53 km², accounting for 25.32% of the total area of the city (Figure 1). The research area is located in the subtropical monsoon climate zone, with long summers and short winters, a mild climate, a long frost-free period, short average sunshine time, an annual average temperature of 16.5-17.9°C, and annual rainfall of 643.3-1256.2 mm. In 2018, the total population of the downtown area of Chengdu was 10.304 million, with an average urbanization rate of 83.32% [13].

Data Sources.
The Internet big data selected in this study are the city POI data of AutoNavi Map in September 2020. Through its official open API, Python code is used to obtain and filter four categories of POI data for the downtown area of Chengdu: catering services, shopping services, transportation services, and scientific, educational, and cultural services. Each POI record contains attribute information such as name, latitude and longitude, category, and administrative division. After the acquired data are subjected to coordinate correction and to data cleaning such as the deletion of duplicates, a POI spatial database of the central city of Chengdu is established, with a total of 219,710 POI records (Table 1). The data on the administrative divisions of the study area come from the National Basic Geographic Information Database (https://www.webmap.cn/) of the National Basic Geographic Information Center.
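The duplicate-removal step can be illustrated with a minimal Python sketch. The record layout (`name`, `lon`, `lat`, `category`) and the rule "same name at the same coordinates rounded to five decimal places" are assumptions made for illustration only; the paper does not state how duplicates were defined.

```python
def deduplicate_pois(records, precision=5):
    """Keep the first record for each (name, rounded lon, rounded lat) key.

    The field names and the rounding precision are hypothetical choices.
    """
    seen = set()
    unique = []
    for rec in records:
        key = (
            rec["name"].strip(),
            round(rec["lon"], precision),
            round(rec["lat"], precision),
        )
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique


# Made-up sample records: the second one repeats the first.
sample = [
    {"name": "Cafe A", "lon": 104.06512, "lat": 30.65921, "category": "catering"},
    {"name": "Cafe A ", "lon": 104.065121, "lat": 30.659212, "category": "catering"},
    {"name": "Shop B", "lon": 104.07000, "lat": 30.66000, "category": "shopping"},
]
cleaned = deduplicate_pois(sample)
```

A stricter or looser key (for example, adding the category, or matching on coordinates alone) would change which records count as duplicates.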

Research Methods.
First, the Fishnet tool of ArcGIS was used to create a grid covering the study area. According to the spatial characteristics of the study area and with reference to previous research results, the grid size was set to 1 km × 1 km, dividing the study area into 3893 grid cells (Figure 2). The number of POIs contained in each grid cell was then counted, and the subsequent analysis is based on these counts.
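The counting step can be sketched in Python as follows. This is not the ArcGIS Fishnet tool; it is a plain illustration that assumes the POI coordinates have already been projected to a planar system in metres, and the function name and sample coordinates are hypothetical.

```python
import math
from collections import Counter


def count_points_per_cell(points, x_min, y_min, cell_size=1000.0):
    """Count points per square grid cell.

    points    : iterable of (x, y) in projected coordinates (metres, assumed)
    x_min/y_min: lower-left corner of the grid
    cell_size : side length of a cell in metres (1 km here)
    Returns a Counter keyed by (row, col).
    """
    counts = Counter()
    for x, y in points:
        col = math.floor((x - x_min) / cell_size)
        row = math.floor((y - y_min) / cell_size)
        counts[(row, col)] += 1
    return counts


# Made-up coordinates for illustration.
points = [(120.0, 80.0), (950.0, 999.0), (1000.0, 10.0), (2500.0, 1500.0)]
grid_counts = count_points_per_cell(points, x_min=0.0, y_min=0.0)
```

Cells that contain no POI simply do not appear in the counter and can be treated as zero in later steps.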

Global Spatial Autocorrelation Analysis.
Moran's I index expresses the first law of geography in a mathematical sense: everything is related to everything else, but near things are more related than distant things [14]. Moran's I index is mainly used to analyze the statistical distribution of spatial data and to reveal its aggregation effect. Its formula is as follows:

$$I = \frac{\sum_{i=1}^{n}\sum_{j=1}^{n} w_{ij}\,(x_i-\bar{x})(x_j-\bar{x})}{S^{2}\sum_{i=1}^{n}\sum_{j=1}^{n} w_{ij}} \qquad (1)$$

where $x_i$ is the value of spatial unit $i$; $\bar{x}$ is the mean value of the variable over all units; $S^{2}$ is the variance of the variable; and $\sum_{i=1}^{n}\sum_{j=1}^{n} w_{ij}$ is the sum of all spatial weights. The value of Moran's I index is normalized to [−1, 1] during the analysis. $I > 0$ means that the spatial distribution of the elements is positively correlated, and the larger the value, the more obvious the spatial correlation; when $I = 1$, the spatial distribution of the elements is completely positively correlated. $I < 0$ means that the spatial distribution of the elements is negatively correlated, and the smaller the value, the greater the spatial difference; when $I = −1$, the spatial distribution of the elements is completely negatively correlated. $I = 0$ means that there is no spatial correlation. The bivariate Moran's I index is a modification of the traditional univariate Moran's I index. It is mainly used to study the correlation between an indicator of a certain element in an area and another indicator of its adjacent elements [15]. In its formula, $\bar{y}$ is the mean value of the second indicator and $\sum_{j=1}^{n} W_{ij}$ sums the spatial weights; the other symbols have the same meaning as in formula (1).
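As an illustration of the global index, the following Python sketch computes the univariate Moran's I from the variance and the sum of spatial weights described above. The bivariate function uses one common definition (both variables standardized, with each unit's first variable multiplied by its neighbors' second variable); whether this matches the paper's formula term by term is an assumption. The chain-shaped weight matrix is a hypothetical toy example.

```python
def chain_weights(n):
    """Binary weights for n units in a row: neighbors are adjacent units."""
    return [[1.0 if abs(i - j) == 1 else 0.0 for j in range(n)] for i in range(n)]


def morans_i(x, w):
    """Univariate global Moran's I with population variance S^2."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    s2 = sum(d * d for d in dev) / n
    w_sum = sum(sum(row) for row in w)
    cross = sum(w[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n))
    return cross / (s2 * w_sum)


def _standardize(values):
    n = len(values)
    mean = sum(values) / n
    sd = (sum((v - mean) ** 2 for v in values) / n) ** 0.5
    return [(v - mean) / sd for v in values]


def bivariate_morans_i(x, y, w):
    """Bivariate global Moran's I (assumed definition, see lead-in)."""
    zx, zy = _standardize(x), _standardize(y)
    n = len(x)
    w_sum = sum(sum(row) for row in w)
    cross = sum(w[i][j] * zx[i] * zy[j] for i in range(n) for j in range(n))
    return cross / w_sum


w = chain_weights(4)
i_trend = morans_i([1, 2, 3, 4], w)          # smooth gradient: positive
i_alternating = morans_i([1, -1, 1, -1], w)  # checkerboard: negative
i_xy = bivariate_morans_i([1, 2, 3, 4], [4, 3, 2, 1], w)
```

With this definition the bivariate index reduces to the univariate one when both variables are identical, which is a convenient sanity check.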

Local Spatial Autocorrelation Analysis.
There is a certain spatial lag in spatial relations, which is manifested as a stronger correlation within a certain area. Local Indicators of Spatial Association (LISA) analysis explores the spatial correlation among elements at the local scale to identify the spatial clustering or spatial anomaly characteristics of geographical elements [16]. Univariate LISA spatial analysis is based on the relationship between a local region and its surroundings and identifies the regions with spatial lag within the study area [17]. Four relationships are distinguished: high-high (H-H), where the value of the element and the values of the surrounding elements are both high; high-low (H-L), where the value of the element is high and the surrounding values are low; low-low (L-L), where the value of the element and the surrounding values are both low; and low-high (L-H), where the value of the element is low and the surrounding values are high [18]. Bivariate LISA spatial analysis can show the effect of an index of an element on the second index of another element, and the formula is as follows:

$$I_i = \frac{x_i-\bar{x}}{\sigma^{2}}\sum_{j=1,\,j\neq i}^{n} w_{ij}\,(x_j-\bar{x})$$

where $I_i$ represents the local spatial autocorrelation coefficient of research area $i$; $x_i$ and $x_j$ represent the observed values of the same geographical attribute at positions $i$ and $j$, respectively; $\bar{x}$ is the mean of the observations of $x$; $\sigma^{2}$ is the variance of $x$; and $w_{ij}$ is the spatial weight matrix.
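A minimal Python sketch of the local index and of the four LISA categories is given below. It computes the local coefficient from the deviation of each unit and the weighted deviations of its neighbors and labels the unit by the signs of the two. The significance test that LISA maps normally apply is omitted, and the chain-shaped weights and sample values are hypothetical.

```python
def chain_weights(n):
    """Binary weights for n units in a row: neighbors are adjacent units."""
    return [[1.0 if abs(i - j) == 1 else 0.0 for j in range(n)] for i in range(n)]


def quadrant(own_dev, neighbor_lag):
    """Classify a unit by the sign of its own deviation and its neighbors' lag."""
    if own_dev > 0 and neighbor_lag > 0:
        return "H-H"
    if own_dev < 0 and neighbor_lag < 0:
        return "L-L"
    if own_dev > 0 and neighbor_lag < 0:
        return "H-L"
    if own_dev < 0 and neighbor_lag > 0:
        return "L-H"
    return "undefined"  # a deviation or lag of exactly zero


def local_morans_i(x, w):
    """Return a list of (I_i, quadrant label), one entry per unit."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    var = sum(d * d for d in dev) / n
    results = []
    for i in range(n):
        lag = sum(w[i][j] * dev[j] for j in range(n) if j != i)
        results.append((dev[i] / var * lag, quadrant(dev[i], lag)))
    return results


results = local_morans_i([10, 8, 1, 2], chain_weights(4))
labels = [label for _, label in results]

outlier = local_morans_i([10, 1, 1, 1, 1], chain_weights(5))
outlier_labels = [label for _, label in outlier]
```

In the second example the first unit is a high value surrounded by low values (H-L) and its neighbor is the reverse (L-H), which corresponds to the spatial anomaly cases described above.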

DBSCAN Spatial Clustering Algorithm.
As one of the most representative density-based clustering algorithms, DBSCAN (Density-Based Spatial Clustering of Applications with Noise) was proposed in 1996 [19]. It can find clusters of any shape without prior knowledge (methods such as K-Means need the number of clusters in the data set to be specified in advance) and can accurately identify discrete points; together with its good interpretability, this has made it popular among researchers in the data mining field. At present, the DBSCAN algorithm has been widely used in agriculture, physics, chemistry, and the social sciences [20][21][22][23], but reports of its application in POI clustering research are still rare. For this reason, the DBSCAN method is selected in this study for the cluster analysis of the four types of service industry facilities in the central city of Chengdu. The DBSCAN algorithm mainly involves two parameters: MinPts and Eps. MinPts is determined through repeated experiments: values that are too small, which lead to a few very large clusters and many small clusters in the clustering results, are eliminated; then, on the premise that all the clusters are as similar as possible, the clustering result with as many clusters as possible is selected to determine the final value of MinPts [24]. Eps is the neighborhood radius, which is mainly determined from the Euclidean distances between objects and the K-distances sorted in descending order. The Euclidean distance between two points $A(X_1, Y_1)$ and $B(X_2, Y_2)$ in space is calculated as follows:

$$d(A,B) = \sqrt{(X_1-X_2)^{2} + (Y_1-Y_2)^{2}}$$

where $X_1$, $X_2$ and $Y_1$, $Y_2$ are the abscissas and ordinates of points A and B, respectively. The K-distance (K-dist) of an object point P is the Euclidean distance from P to its K-th nearest point, where K is the MinPts value above. The value of K-dist reflects the density of the area in which the point lies: the smaller the value, the denser the distribution of sample points in that area.
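The K-dist values can be computed with a short Python sketch such as the one below. It is an illustration only: it uses brute-force distance computation, does not count a point as its own neighbor (an assumption, since conventions differ), and returns the values sorted from large to small, as they would be plotted in a K-dist graph. The sample points are made up.

```python
import math


def k_distances(points, k):
    """Distance from each point to its k-th nearest other point, sorted descending."""
    k_dist = []
    for i, (x1, y1) in enumerate(points):
        dists = sorted(
            math.hypot(x1 - x2, y1 - y2)
            for j, (x2, y2) in enumerate(points)
            if j != i
        )
        k_dist.append(dists[k - 1])
    return sorted(k_dist, reverse=True)


# Four tightly packed points and one isolated point.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10)]
k_dist = k_distances(points, k=2)
```

Here the isolated point produces one large K-dist value followed by a flat run of small values; the sharp drop between them is the kind of break from which Eps is read off.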
The K-dist values of the object points are arranged in descending order and plotted as a K-dist graph. Since points on the same horizontal line, or on a straight line with a small slope, are considered to be in the same density area, the value corresponding to the first valley point, or to the point where the curve changes sharply, is taken as the value of Eps. The DBSCAN algorithm defines a cluster as the maximal set of density-connected points; it can divide regions with sufficient density into clusters and eventually form clusters of arbitrary shape. To find a density-connected set, the algorithm starts from an arbitrary object P in the data set. If P is a core object, that is, the circle centered on P with radius Eps contains no fewer than MinPts POI points, the algorithm returns the density-connected set reachable from P, and the objects in this set are assigned to the same cluster. If P is not a core object and is not density-reachable from any core object, P is marked as noise. The algorithm repeats this processing for every unscanned point. Finally, density-connected objects are assigned to the same cluster, while objects not contained in any cluster are noise. Any core object in the data set thus yields a set of density-connected objects.
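The procedure described above can be sketched as a compact, brute-force DBSCAN in Python. This is an illustrative implementation, not the code used in the study; the parameter names `eps` and `min_pts`, the convention that a point counts as its own neighbor, and the sample coordinates are assumptions.

```python
import math

NOISE = -1


def region_query(points, i, eps):
    """Indices of all points within eps of point i (including i itself)."""
    xi, yi = points[i]
    return [
        j for j, (xj, yj) in enumerate(points)
        if math.hypot(xi - xj, yi - yj) <= eps
    ]


def dbscan(points, eps, min_pts):
    """Return one label per point: 0, 1, ... for clusters, -1 for noise."""
    labels = [None] * len(points)
    cluster_id = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue  # already scanned
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later become a border point
            continue
        cluster_id += 1  # i is a core object: start a new cluster
        labels[i] = cluster_id
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster_id  # border point reached from a core object
            if labels[j] is not None:
                continue
            labels[j] = cluster_id
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is also a core object: keep expanding
    return labels


points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10),
          (50, 50)]
labels = dbscan(points, eps=1.5, min_pts=3)
```

The brute-force neighborhood search is quadratic in the number of points; for a data set of the size used here, a spatial index or an existing library implementation would normally be preferred.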

Spatial Autocorrelation Analysis of POI in Chengdu.
The weights tool in GeoDa was used to create the spatial weight file for the study area. The K-nearest neighbor criterion was selected to establish the spatial weight matrix, and the significance level was set to 0.05 to perform the univariate and bivariate spatial autocorrelation analysis on the four kinds of POI data.
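To make the K-nearest neighbor criterion concrete, the following Python sketch builds a weight matrix from grid-cell centroids. It is not GeoDa's implementation: the row standardization (each neighbor receives weight 1/k), the value of k, and the sample centroids are all assumptions, since the paper does not report them.

```python
import math


def knn_weights(centroids, k):
    """Row-standardized K-nearest-neighbor weight matrix (dense list of lists)."""
    n = len(centroids)
    w = [[0.0] * n for _ in range(n)]
    for i, (xi, yi) in enumerate(centroids):
        others = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: math.hypot(xi - centroids[j][0], yi - centroids[j][1]),
        )
        for j in others[:k]:
            w[i][j] = 1.0 / k
    return w


# Hypothetical centroids: three close cells and one remote cell.
centroids = [(0, 0), (1, 0), (2, 0), (10, 0)]
w = knn_weights(centroids, k=2)
```

Unlike contiguity weights, K-nearest neighbor weights are generally asymmetric: the remote cell counts the cell at (1, 0) among its neighbors, but not the other way round.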

Univariate Spatial Autocorrelation.
Table 2 lists the Moran's I index calculated for the POIs of the different types of service industry facilities. The Z values of all types of POIs are greater than 1.96, i.e., the confidence level is greater than 95%, and Moran's I indices of the POIs of the four types of service industry facilities are all positive, indicating that the spatial distribution of the four types of POIs shows significant spatial clustering rather than a random distribution. Local spatial autocorrelation analysis was further carried out to explore the spatial distribution characteristics of the service industry facilities in the central city of Chengdu, and LISA spatial distribution maps of the POIs of the four types of service facilities were drawn (Figure 3). The proportions of the H-H, L-L, H-L, and L-H cluster areas for the four types of POIs, derived from the LISA spatial distribution maps, are given in Table 3. The statistical results further show that H-H and L-H agglomeration areas exist in the POI distribution maps of all four types of service facilities. The H-H agglomeration area accounts for the largest proportion, between 7% and 10%, while the L-H agglomeration area accounts for 0.2% to 0.7%. L-L clusters appear in shopping services and transportation services, and the H-L cluster area appears only in the POIs of scientific, educational, and cultural service facilities.
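The comparison of a Z value with 1.96 can be illustrated by a permutation approach in Python: the observed values are randomly reassigned to the spatial units many times, and the observed Moran's I is standardized against the resulting reference distribution. This is a sketch under assumptions; the number of permutations, the random seed, the chain-shaped weights, and the sample data are hypothetical, and the paper does not state how its Z values were obtained.

```python
import random


def chain_weights(n):
    """Binary weights for n units in a row: neighbors are adjacent units."""
    return [[1.0 if abs(i - j) == 1 else 0.0 for j in range(n)] for i in range(n)]


def morans_i(x, w):
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    s2 = sum(d * d for d in dev) / n
    w_sum = sum(sum(row) for row in w)
    cross = sum(w[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n))
    return cross / (s2 * w_sum)


def permutation_z(x, w, permutations=999, seed=0):
    """Return (observed I, Z value) using a random-permutation reference distribution."""
    rng = random.Random(seed)
    observed = morans_i(x, w)
    shuffled = list(x)
    simulated = []
    for _ in range(permutations):
        rng.shuffle(shuffled)
        simulated.append(morans_i(shuffled, w))
    mean = sum(simulated) / len(simulated)
    sd = (sum((s - mean) ** 2 for s in simulated) / len(simulated)) ** 0.5
    return observed, (observed - mean) / sd


# Ten high-count cells followed by ten low-count cells: strongly clustered.
counts = [10] * 10 + [1] * 10
observed_i, z_value = permutation_z(counts, chain_weights(20))
significant = z_value > 1.96
```

A Z value above 1.96 indicates that the observed clustering would be unlikely under spatial randomness at roughly the 5% level.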

Bivariate Spatial Autocorrelation.
Figure 4 shows the calculated global bivariate Moran's I index for each pair of service industry facilities. Moran's I indices of all six variable pairs pass the 5% significance test, and the values are all greater than 0.5, showing a significant positive spatial correlation. Among them, the spatial correlation between transportation services and scientific, educational, and cultural services is the most significant, with Moran's I index reaching 0.706, while the spatial correlation between shopping services and scientific, educational, and cultural services is the least significant, with a Moran's I index of 0.574.
Furthermore, the bivariate LISA is used to analyze the spatial anomaly relationship between one service facility element at the center and another service facility element in its surroundings (Figure 5). The results show the following: (1) The LISA spatial distribution results of shopping and transportation services; shopping and scientific, educational, and cultural services; and shopping and catering services show obvious homogeneity. The H-H cluster area is concentrated within the first ring road of Chengdu, including popular (Internet-famous) attractions such as parks and Sino-Ocean Taikoo Li Chengdu, and high-end residential areas such as Poly Spring Flower Language and BNBM International. In addition, there are some areas outside the first ring road and inside the third ring road, such as Pidu District and Wenjiang University Town; the Airport and Nanhu Park; the Longquanyi District Vocational School area; and the dense residential areas of Huajin Avenue in Qingbaijiang District and other areas. The L-L cluster areas in the LISA results of these three pairs are scattered and independently distributed in the fringe area of Chengdu, and the vast majority of them are rural areas. (2) The H-H clusters of scientific, educational, and cultural services with transportation services, and of scientific, educational, and cultural services with catering services, mainly cover the north, west, and south of the first ring road and the third ring road, such as Tianfu Square, Chunxi Road, and other famous attractions. Compared with the result of (1), the LISA H-H cluster area of scientific, educational, and cultural services with transportation services and catering services also includes the high-end residential areas east of Jinjiang River Road in Jinjiang District.
The LISA spatial distribution results of scientific, educational, and cultural services with transportation services show that there is no L-L cluster area, and the L-H cluster area is mainly distributed in industrial areas and around individual colleges, such as the industrial area around the Pidu Science and Technology Park, and the Sichuan Institute of Media (Huali Campus), Chengdu

Identification of POI Clusters in Chengdu and Its Spatial Pattern Analysis.
Based on the DBSCAN algorithm, the POIs of the four types of service industry facilities in the downtown area of Chengdu were clustered, and the standard deviation ellipse method was used to quantitatively analyze the spatial distribution pattern of the POIs of the various service industry facilities (Table 4). A total of 46 clusters were identified among 50,303 catering service facilities, and the number of dining spots contained in each cluster was used as the scale indicator of the cluster. The overall distribution pattern of the catering clusters is "one center, eight scattered, small groups" (Figure 6(a)). A total of 77 clusters were identified among 109,581 shopping service facilities. The overall clustering of shopping services presents a distribution pattern of "one circle, four cores, seven scattered, small groups" (Figure 6(b)). The shopping service cluster centers in Chengdu are mainly concentrated within the third ring road, roughly forming a north-south ellipse. The number of clusters in the "one circle" accounted for 65.76% of the total. The main areas covered are the first circle of Chengdu (Jinniu District, Chenghua District, Qingyang District, Wuhou District, and Jinjiang District), Longquanyi District, Pidu District, and other parts of the second circle. The number of POIs in each of the "four cores" clusters is between 2480 and 6520; these clusters are mainly distributed in the southern part of Wenjiang District, including the university towns of Southwestern University of Finance and Economics and Chengdu University of TCM in Wenjiang District in the west of Wenjiang City, near Shuangliu Airport in Shuangliu District, and in the Longquanyi area near the intersection of G318 and G108.
The number of POIs in each of the "seven scattered, small groups" clusters is between 620 and 2500. These clusters are mainly distributed in the area east of G318 and the fifth ring road in Longquanyi District (Sichuan Normal University, Sichuan Nursing Vocational College, Sichuan Aerospace Vocational College, and Sichuan Vocational College of Finance and Economics, together with the contiguous real estate area); the eastern part of the Chengdu ring expressway and the northern part of the Yurong Expressway; the two clusters in the zone encircled by the Chengwan Expressway and the Chengpeng provincial road in Xindu District; and areas in Pidu District and near universities and hospitals, such as the First People's Hospital of Shuangliu District, Chengdu, and Xihua University in Pidu District. A total of 53 clusters were identified among 40,610 transportation service facilities. The transportation clusters as a whole present a distribution pattern of "one circle, seven small clusters" (Figure 6(c)). "One circle" is the largest cluster, roughly showing a northwest-southeast trend and accounting for 67.94%. Centered on Chunxi Road, Taikoo Li, and Wangfujing Department Stores, it radiates outward to the third ring road of Chengdu, which roughly corresponds to the traditional main urban area of Chengdu. The number of POIs in medium-sized clusters is mainly between 770 and 2260; these clusters are mainly distributed in Wenjiang University Town, at Shuangliu Airport in Shuangliu District and near Sichuan University, around the vocational colleges near Hangtian North Road in Longquanyi District, in Tonghua in Qingbaijiang District and the zone surrounding Huajin Avenue, in the vicinity of Jinfurong Avenue and Rongdu Avenue, and in the intermediate zone between G213 and the Rongchang Expressway in Xindu District. The number of POIs in small clusters is between 35 and 400; these clusters are mainly distributed along roads and subways and near shopping malls and residential communities.
Ten clusters were identified among 18,610 scientific, educational, and cultural service facilities. The clusters of scientific, educational, and cultural services generally present a distribution pattern of "one large, four medium, and five small clusters" in a northwest-southeast direction (Figure 6(d)). The large-scale scientific, educational, and cultural service center accounted for 67.94%; it is mainly centered on Kuanzhai Alleys and radiates outward to the third ring road of Chengdu. The number of POIs in medium-sized clusters is between 520 and 1360; these clusters are mainly distributed in Wenjiang University Town, to the west of Metro Line 3 in Shuangliu District, near the Tanghu Branch of Sichuan TV and Radio University, around Sichuan Normal University in Longquanyi District, Sichuan Staff University of Science and Technology, in the area of Sichuan Tourism University, Chengdu Medical College of Xindu District, University of Petroleum, and other university towns. The number of POIs in small clusters ranges from 86 to 254; these clusters are roughly distributed in some of the more developed communities and villages on the outskirts of the city, such as near the Qingbaijiang School of Chengdu Radio and TV University in Qingbaijiang District and near Xinfan Town in Xindu District.
In addition, the DBSCAN spatial clustering algorithm identified 2794 catering service noise points (5.4% of the total number of this category), 2592 shopping service noise points (2.4% of the total number of this category), 3280 transportation service noise points (8.1% of the total number of this category), and 1671 noise points for scientific, educational, and cultural services (8.9% of the total number of this category). Noise points are discretely distributed, are not effectively connected to any cluster, and are regarded as isolated points. Among them, there are relatively concentrated noise points in Longquanyi District, Shuangliu District, and other areas. In future development, these noise points will be integrated into existing clusters or form new clusters, which represents potential for the future development of the service industry in the downtown area of Chengdu [25,26].
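The standard deviation ellipse mentioned at the start of this section summarizes the center, spread, and orientation of a point set. The Python sketch below uses a covariance-based formulation (the principal axes of the 2 × 2 covariance matrix of the coordinates); GIS packages may scale the axes or report the rotation angle differently, so the function name, return convention, and sample points should be read as assumptions rather than as the procedure used in the study.

```python
import math


def standard_deviation_ellipse(points):
    """Return (center, semi_major, semi_minor, angle_deg).

    angle_deg is the direction of the major axis, measured counterclockwise
    from the x axis (east), in the range (-90, 90].
    """
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    sxx = sum((x - cx) ** 2 for x, _ in points) / n
    syy = sum((y - cy) ** 2 for _, y in points) / n
    sxy = sum((x - cx) * (y - cy) for x, y in points) / n
    # Eigenvalues of the covariance matrix give the variance along each axis.
    half_trace = (sxx + syy) / 2.0
    root = math.sqrt(((sxx - syy) / 2.0) ** 2 + sxy ** 2)
    semi_major = math.sqrt(half_trace + root)
    semi_minor = math.sqrt(max(half_trace - root, 0.0))
    angle_deg = math.degrees(0.5 * math.atan2(2.0 * sxy, sxx - syy))
    return (cx, cy), semi_major, semi_minor, angle_deg


# Points stretched along the x axis.
center, major, minor, angle = standard_deviation_ellipse(
    [(-2, 0), (2, 0), (0, -1), (0, 1)]
)
# Points lying on the diagonal y = x.
_, diag_major, diag_minor, diag_angle = standard_deviation_ellipse(
    [(0, 0), (1, 1), (2, 2), (3, 3)]
)
```

An elongated ellipse with a northwest-southeast major axis, as reported for the transportation clusters, would correspond to a large ratio of the two semi-axes and a negative angle under this convention.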

Discussion.
Through the analysis of the formation mechanism of the spatial layout of the four types of service industry facilities in the downtown area of Chengdu, it can be found that this layout, driven by benefit-maximizing behavior, presents three characteristics: central driving, traffic accessibility, and dependence on population activity. The first is the driving force of the center. The distance from the city center and the city's development history and cultural heritage are important factors affecting the urban spatial pattern. The service industry facilities are densely distributed in the five traditional districts, namely, Jinjiang District, Qingyang District, Jinniu District, Wuhou District, and Chenghua District. The central city refers to the first circle. As distance increases, the radiating effect of the city center on the surrounding area steadily weakens. Although there are small dense areas in the second circle, such as Wenjiang District, Shuangliu District, and Xindu District, the distribution is still close to the city center, and the trend of suburbanization is not obvious. The second is traffic accessibility. Catering service facilities, shopping areas, and rich scientific, educational, and cultural service facilities gather along roads and subways and near bus stations; for example, the area within the third ring road of Chengdu, the Chengdu-Chongqing Ring Expressway, G213 and other highways, and the various subway lines form multifunctional mixed areas. The third is the dependence on population activity. The anomalous cluster areas in the spatial pattern are mainly distributed in residential areas, universities, and other areas with high population density and strong activity capacity, such as the residential area west of the fourth section of the east third ring road, high-rise residential buildings, and the high-tech incubators around the Swan Lake Ecological Zone.
Therefore, it is necessary to formulate policies to rationally plan and guide the spatial layout of the service industry and to strengthen the implementation of the "central city + suburban new city" spatial development policy based on population distribution and urban spatial planning [27]. The non-core functions of the core area of Chengdu will be further relieved, and the efficient flow of development factors and development resources will be promoted, to enhance the spillover efficiency and radiating capacity of the central city and to promote the balanced and coordinated development of the whole region.
Generally, compared with research based on traditional methods, the quantitative study of the urban spatial pattern based on POI data can truly reflect its overall development status, which is helpful and of reference value for deepening the understanding of service industry functional clusters and of the future development of the service industry in the downtown area of Chengdu. On the other hand, POI data are voluminous but carry little information per record and lack attributes such as business scale, creation time, and usage; they therefore tend to overlook the development characteristics and levels of the central area, which makes more in-depth analysis difficult [28,29]. This research is also limited to the static spatial pattern of the service industry facilities and fails to explore the evolution of urban consumption vitality from the perspective of time dynamics, which can be further explored in future research.

Conclusion.
The development of the service industry lays a solid foundation for urban economic development and is an important engine for enhancing urban competitiveness. Based on the POI data of the four subsectors of the service industry in the central city of Chengdu, this study comprehensively measures the spatial agglomeration characteristics of the service industry in the central city of Chengdu from the aspects of spatial aggregation and spatial autocorrelation. The research indicates the following: (1) Univariate spatial autocorrelation analysis and DBSCAN clustering results show that the four types of POIs all present a spatial distribution pattern of "one center, multiple clusters." The "one center" mainly consists of Kuanzhai Alley and Chengdu Ocean Taikoo Li and expands outward to some areas of the third ring road, while the "multiple clusters" are mainly distributed in the well-developed areas of the second circle, such as Wenjiang District, Shuangliu District, and Xindu District. The remaining noise points are relatively isolated and scattered, and some areas have the potential to form service industry clusters in the future.
(2) The bivariate Moran's I index is used to analyze the global and local spatial correlation between different types of POIs. The analysis results show that there is a significant correlation between the POIs of any two types of service facilities. The urban central area is consistent with the above-mentioned "one center" distribution. The service industry in the urban fringe areas is relatively weak, and most of these areas are single-function or dual-function mixed areas.
(3) The distribution characteristics of the service industry are subject to multiple factors. The geographical environment and urban development history lead to a spatially polarized distribution of the service industry facilities; the degree of traffic convenience determines the agglomeration and diffusion of the service industry in different regions; and government planning measures and the attraction of business circles promote the development of the service industry and accelerate the differentiation of its spatial distribution.

Data Availability
The data that support the findings of this study are available from the corresponding author upon reasonable request.

Conflicts of Interest
The author(s) declare no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Authors' Contributions
The authors Hao Li and Jianshu Duan contributed equally to this work as co-first authors.