Comprehensive Evaluation of Urban Economic Development in Yangtze River Delta Based on Cluster-Principal Component Analysis

The study of the Yangtze River Delta urban agglomeration provides a reference for regional construction. We set up a suitable cluster model and a comprehensive evaluation model to evaluate urban economic development and to provide a basis for the government to formulate and adjust economic measures. We use a heat map of correlation coefficients to eliminate the economic indicators that are weakly correlated with per capita GDP and use grey correlation analysis to check the degree of correlation of the remaining indicators. The optimal clustering method is determined by comparing the results of K-means clustering and fuzzy C-means clustering. In addition, we use principal component analysis to rank the cities by their economic development. The results show that the correlation of per capita GDP with the added value of primary industry and with the added value of agriculture, forestry, animal husbandry and fishery is close to 0. The regional development level is not balanced: Chizhou, Xuancheng, Bengbu, and other cities lag behind in the economic rankings. Finally, we suggest reforming the lagging cities and improving their economic strength across the various economic indicators.


Introduction
The Yangtze River Delta is one of the key regions of economic development in China and has abundant development potential. At present, there is still a lot of room for economic development in the region. The economic level of its middle-ranking cities lags far behind that of the world's four major bay areas.
The Outline of the Plan for the Integrated Development of the Yangtze River Delta Region emphasizes that "integration" and "high quality" are the keys to giving full play to the region's leading and exemplary role.
In terms of realistic measurement models, domestic and foreign scholars take regional integration, coordination, and competition of urban agglomerations and other spatial econometrics as the logical starting point to explore the collaborative development path of urban agglomerations.
Among them, Wu et al. [1], by building the Malmquist model and a grey model, found that the green development efficiency of the Yangtze River Delta varies greatly and that some cities have been declining in recent years, but that the gap between provinces is narrowing year by year. Zhang et al. [2] found that the differences in scale economy and structure economy are more prominent in the Yangtze River Delta High-Tech Zone. Fan et al. [3] used GDP per capita to divide the cities into central cities, more developed cities, and less developed cities. Chao et al. [4] concluded from a principal component model that the Yangtze River Delta has been developing towards low carbonization since China's entry into the WTO. Ma et al. [5] argued that most cities in the Yangtze River Delta region should substantially strengthen sustainable development. Foreign scholars have used breakpoint theory and geo-economic relationships to explore the spatiotemporal evolution of the economic structure and spatial organization of urban agglomerations.
After consulting a large body of literature, we find that most scholars focus on only one or a few aspects, and that research measuring the overall economic development level of the Yangtze River Delta is lacking. Taking this as a starting point, this paper analyzes the impact of integration on urban economic development, identifies the economic advantages of each city, and points out the shortcomings to be made up. It provides a demonstration for coordinated regional development within China and offers a Chinese solution for coordinated regional development worldwide.

Analysis of Current Economic Situation
As high-quality development sets new requirements and urbanization enters a new stage, in 2019 the GDP of the Yangtze River Delta region accounted for 20 percent of the national total, and the per capita GDP of the region has increased year by year. Driven by the vigorous development of the digital economy, integration in many fields within the region is accelerating, giving rise to a series of high-tech, high-sales new enterprises. Its overall valuation ranks first among China's urban agglomerations, accounting for 16 of the world's top 500, with optimistic development prospects and great potential. Figure 1 is the GDP line chart of the Yangtze River Delta and its internal regions from 2010 to 2019. The GDP of the four provinces has grown steadily since 2010. Among them, the GDP growth of Jiangsu Province is the most pronounced, while the GDP of Anhui Province has grown the slowest and its overall level is the lowest.

Data Source and Description.
The index data are all from the 2019 statistical yearbooks of the corresponding cities. After referring to the relevant literature on variable selection [6][7][8][9][10], we introduced the added value of the various industries to measure overall industrial economic development; introduced the added value of agriculture, forestry, animal husbandry and fishery, of industry, of the construction industry, and of the wholesale and retail industry as specific economic indicators; and introduced the total import and export volume to measure the regional trade balance level. We have removed the effect of the size of the workforce on the level of economic development. As shown in Figure 2, we process the data in three steps to ensure the feasibility of the subsequent studies.

Indicator System.
Data symbols and interpretation are shown in Table 1.

Methods.
The Spearman rank correlation coefficient is calculated as follows. Run $n$ experiments on random variables $X$ and $Y$ to obtain the sample $(X_i, Y_i)$, $i = 1, 2, 3, \ldots, n$. Let $p_i$ and $q_i$ be the ranks of $X_i$ and $Y_i$, respectively, with mean ranks $\bar{p} = \frac{1}{n}\sum_{i=1}^{n} p_i$ and $\bar{q} = \frac{1}{n}\sum_{i=1}^{n} q_i$. Then the rank correlation coefficient of $X$ and $Y$ for this sample is

$$r_s = \frac{\sum_{i=1}^{n} (p_i - \bar{p})(q_i - \bar{q})}{\sqrt{\sum_{i=1}^{n} (p_i - \bar{p})^2 \sum_{i=1}^{n} (q_i - \bar{q})^2}}. \tag{1}$$

If there are no tied ranks, the following formula can be used instead:

$$r_s = 1 - \frac{6 \sum_{i=1}^{n} d_i^2}{n(n^2 - 1)}, \qquad d_i = p_i - q_i.$$

As the relationship between $X$ and $Y$ approaches a strictly monotonic function, the absolute value of the Spearman rank correlation coefficient increases. When $X$ and $Y$ have a strictly monotonically increasing relationship, the coefficient is 1; when they have a strictly monotonically decreasing relationship, it is −1. When the coefficient is 0, $Y$ shows no tendency to increase or decrease as $X$ increases; in other words, there is no monotonic association between $X$ and $Y$.
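The two forms of the rank correlation coefficient can be checked against each other numerically. The following is a minimal sketch, not the code used in this study: the function names and the five-city sample values are hypothetical, and the ranking helper assumes that no two values are tied.

```python
import numpy as np

def ranks(values):
    """Return 1-based ranks of the values (assumes no tied values)."""
    values = np.asarray(values, dtype=float)
    order = np.argsort(values)
    r = np.empty(len(values), dtype=float)
    r[order] = np.arange(1, len(values) + 1)
    return r

def spearman_general(x, y):
    """Rank correlation as the centered cross-product of the ranks."""
    p, q = ranks(x), ranks(y)
    pc, qc = p - p.mean(), q - q.mean()
    return float(np.sum(pc * qc) / np.sqrt(np.sum(pc ** 2) * np.sum(qc ** 2)))

def spearman_no_ties(x, y):
    """Shortcut form, valid only when there are no tied ranks."""
    d = ranks(x) - ranks(y)
    n = len(d)
    return float(1 - 6 * np.sum(d ** 2) / (n * (n ** 2 - 1)))

# Hypothetical values for five cities (illustration only, not study data).
gdp_per_capita = [12.1, 8.4, 15.3, 6.2, 9.9]
indicator = [30.5, 22.0, 41.2, 18.7, 20.1]

print(round(spearman_general(gdp_per_capita, indicator), 3))
print(round(spearman_no_ties(gdp_per_capita, indicator), 3))
```

For this sample both functions give 0.9, since the two rank vectors differ only by swapping two adjacent ranks.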

Results Analysis.
Following the above steps, Figure 3 shows the heat map drawn with Python. In Figure 3, the correlation coefficients between per capita GDP and both the added value of the primary industry $X_1$ and the added value of agriculture, forestry, animal husbandry and fishery $X_4$ are close to 0, so these two indicators should be eliminated. This is because, with the development of the modern economy, the agricultural economy plays only a small part in economic growth.

Economic Index Factor Analysis Based on Grey Correlation Analysis
The grey correlation analysis method judges how closely two quantities are connected by measuring the similarity of the geometric shapes of the reference data column and several comparison data columns. Because the analysis follows the development trend, this method places no requirement on the sample size, does not need a typical distribution law, and involves a relatively small amount of calculation. We use this method to test whether the economic indicators selected above are appropriate.

Methods.
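The indexes have different dimensions and orders of magnitude, so analyzing them directly would distort the results. Therefore, the data are first normalized [11][12][13]. Here $d^*_{ij}$ denotes the element in the $i$-th row and $j$-th column of the normalized matrix, and $x_{ij}$ denotes the element in the $i$-th row and $j$-th column of the original data matrix, with $i, j = 1, 2, \ldots, n$.
The preprocessed per capita GDP $G$ and the relevant economic indicators $X_2$, $X_3$, and $X_5$ to $X_9$ that remain after $X_1$ and $X_4$ are excluded are assembled into an analysis matrix:

$$[G, X] = \begin{bmatrix} G_1 & X_{11} & \cdots & X_{1m} \\ G_2 & X_{21} & \cdots & X_{2m} \\ \vdots & \vdots & & \vdots \\ G_n & X_{n1} & \cdots & X_{nm} \end{bmatrix},$$

where $G_n$ represents the observed value of per capita GDP in the $n$-th city and $X_{nm}$ represents the observed value of the $m$-th index in the $n$-th city. The analysis matrix $[G, X]$ is converted into an initial value matrix $[G^*, X^*]$. The absolute differences between the reference column and each comparison column, $\mathrm{det}_{ij} = |G^*_i - X^*_{ij}|$, are then formed into the correlation coefficient matrix $\omega_{ij}$:

$$\omega_{ij} = \frac{\mathrm{det}_{\min} + \rho\,\mathrm{det}_{\max}}{\mathrm{det}_{ij} + \rho\,\mathrm{det}_{\max}},$$

where $\mathrm{det}_{\min}$ and $\mathrm{det}_{\max}$ are the minimum and maximum values of $\mathrm{det}_{ij}$, respectively, and $\rho$ is the resolution coefficient, which is usually set to 0.5.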
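The grey correlation procedure can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions rather than the MATLAB program used in this study: the function name is hypothetical, the initial value step is assumed to divide each column by its mean, and the correlation degree of an indicator is assumed to be the average of its correlation coefficients over all cities.

```python
import numpy as np

def grey_relational_degree(reference, comparisons, rho=0.5):
    """Grey correlation degree of each comparison column to the reference.

    reference:   length-n sequence (for example, per capita GDP of n cities)
    comparisons: n-by-m array, one column per economic indicator
    rho:         resolution coefficient, usually 0.5
    """
    ref = np.asarray(reference, dtype=float)
    comp = np.asarray(comparisons, dtype=float)

    # Assumption: initial value step divides every column by its own mean.
    ref_star = ref / ref.mean()
    comp_star = comp / comp.mean(axis=0)

    # Absolute differences between reference and comparison columns.
    det = np.abs(comp_star - ref_star[:, None])
    det_min, det_max = det.min(), det.max()

    # Correlation coefficient matrix (requires det_max > 0).
    omega = (det_min + rho * det_max) / (det + rho * det_max)

    # Assumption: the correlation degree is the column mean of omega.
    return omega.mean(axis=0)

# Hypothetical data: the first indicator is proportional to the reference,
# the second moves in the opposite direction.
ref = [1.0, 2.0, 3.0, 4.0]
comp = np.array([[2.0, 4.0], [4.0, 3.0], [6.0, 2.0], [8.0, 1.0]])
print(grey_relational_degree(ref, comp))
```

An indicator whose normalized column coincides with the reference obtains a degree of 1, and the degree falls as the shapes of the two columns diverge.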

Results Analysis.
A MATLAB program is used to compute the correlation degree of each economic index with GDP per capita, as shown in Table 2. In Table 2, the correlation degrees of indicators $X_2$, $X_3$, $X_5$, $X_6$, $X_7$, $X_8$, and $X_9$ are all greater than 0.9, indicating that the remaining economic indicators are closely related to per capita GDP and that the indicators are selected appropriately.

An Empirical Analysis of the Current Situation of Urban Economy in the Yangtze River Delta Region
The fuzzy clustering analysis method is widely used in exploratory research. By using fuzzy set theory to analyze data, similar samples can be grouped together, and the result is simple and intuitive. After consulting the relevant literature, we find that many scholars at home and abroad use cluster analysis in the study of urban agglomeration development.

Table 1: Data symbols and their meanings.

Mark     Meaning
G        GDP per capita
$X_1$    Value added by the primary industry
$X_2$    Value added by the secondary industry
$X_3$    Value added by the tertiary industry
$X_4$    Agriculture, forestry, animal husbandry and fishery
$X_5$    Industrial added value
$X_6$    Construction added value
$X_7$    Wholesale and retail added value
$X_8$    Total exports
$X_9$    Total imports

Among the many fuzzy clustering algorithms, the C-means clustering algorithm has been applied more and more widely and successfully in many fields in recent years. In most cases, the objects in a data set cannot be divided into distinct clusters, so it is more natural to assign each object a weight for every cluster. Probabilistic methods can also produce such weights, but it is sometimes difficult to determine an appropriate statistical model, so fuzzy C-means, which is natural and nonprobabilistic, is a better choice. The K-means clustering method is highly interpretable. When the clusters are dense, spherical, or lumpy, and the differences between clusters are obvious, its clustering effect is good. Each method has its drawbacks. To make the research results more accurate, and building on Henderson's study of city scale [14], we study the current state of the urban economy in the Yangtze River Delta region and determine the optimal method by comparing K-means clustering and fuzzy C-means clustering.

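Economic Status Analysis Based on Fuzzy C-Means Clustering Algorithm
Let $X = \{x_1, x_2, \ldots, x_n\}$ denote the sample set, let $u_{ik}$ denote the membership degree of the $k$-th sample in the $i$-th class, and let $v_i$ represent the cluster center of the $i$-th class. Then a fuzzy C-means [15] clustering of $X$ is obtained by minimizing the following objective function:

$$J(U, V) = \sum_{k=1}^{n} \sum_{i=1}^{c} u_{ik}^{m}\, d_{ik}^{2},$$

where $d_{ik} = \|x_k - v_i\|$ is the Euclidean distance from the $k$-th sample to the cluster center of the $i$-th class.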

Research Ideas.
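Determine the number of clusters $c$ and the weighting exponent $m$, and initialize the membership matrix $U^0$. The cluster centers are computed as

$$v_i = \frac{\sum_{k=1}^{n} u_{ik}^{m}\, x_k}{\sum_{k=1}^{n} u_{ik}^{m}},$$

where $v_i$ is the cluster center, $i = 1, 2, \ldots, c$, and $m > 1$.
Update the initial membership matrix $U^0$:

$$u_{ik} = \left[ \sum_{j=1}^{c} \left( \frac{d_{ik}}{d_{jk}} \right)^{2/(m-1)} \right]^{-1}.$$

For a given $\varepsilon > 0$, these two steps are iterated from the initial value, and the algorithm terminates when $\max |u^{t}_{ik} - u^{t-1}_{ik}| < \varepsilon$. If $u_{jk} = \max_i u_{ik}$, then $x_k$ is assigned to the $j$-th class.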
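The iteration described above can be sketched as follows. This is a minimal NumPy illustration and not the program used in this study: the function name, the default parameter values, the random initialization of the membership matrix, and the six sample points are all assumptions made for the example, which uses two clusters for brevity where the study uses three.

```python
import numpy as np

def fuzzy_c_means(X, c=3, m=2.0, eps=1e-5, max_iter=300, seed=0):
    """Fuzzy C-means clustering; returns memberships U (c x n), centers V, labels."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    n = X.shape[0]

    # Assumption: random initial membership matrix, each column sums to 1.
    U = rng.random((c, n))
    U /= U.sum(axis=0)

    for _ in range(max_iter):
        # Cluster centers: membership-weighted means of the samples.
        Um = U ** m
        V = (Um @ X) / Um.sum(axis=1, keepdims=True)

        # Euclidean distances d_ik from sample k to center i (floored to avoid 0).
        d = np.linalg.norm(X[None, :, :] - V[:, None, :], axis=2)
        d = np.maximum(d, 1e-12)

        # Membership update: u_ik is proportional to d_ik^(-2/(m-1)).
        inv = d ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=0)

        converged = np.max(np.abs(U_new - U)) < eps
        U = U_new
        if converged:
            break

    # Each sample goes to the class in which its membership is largest.
    labels = U.argmax(axis=0)
    return U, V, labels

# Hypothetical two-dimensional samples forming two obvious groups.
points = np.array([[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]])
U, V, labels = fuzzy_c_means(points, c=2)
print(labels)
```

Unlike a hard assignment, every column of the returned matrix `U` sums to 1, so a city that lies between two groups receives intermediate membership degrees in both.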

Empirical Analysis.
After removing the two indicators, the 27 cities were divided into three categories, namely, the first level, the second level, and the third level. The first level indicates that the city has the best level of economic development, the second level indicates that the city's economic development is at a medium level, and the third level indicates that the city's economic development is poor. The resulting classification of the 27 cities is shown in Figure 4. From Figure 4, it can be seen that the economic development of Shanghai is the best among the 27 cities, followed by 5 cities in Jiangsu and 3 cities in Zhejiang. Anhui's cities are all in the third level.

Analysis of Economic Status Based on K-Means Clustering Method

Research Ideas.
Select the initial centroids (see Model Building for the specific method). Assign each sample to its nearest centroid. Calculate the mean value of the samples assigned to each centroid in the previous step and take it as the new centroid. Calculate the difference between the old and new centroids, and repeat these steps until the difference is less than the threshold.

Model Building.
$k$ observations randomly selected from the sample set are taken as the initial centroids [16][17][18]. Alternatively, each observation is randomly assigned to an initial cluster, and the update step is then applied to calculate the mean of each cluster as its initial centroid. After selecting the initial set $m_1^{(1)}, m_2^{(1)}, \ldots, m_k^{(1)}$ of $k$ means according to the above steps, the algorithm alternates between the following two steps:

(i) Assignment step: the squared Euclidean distance is calculated, and each observation $x_p$ is assigned to the cluster $S_i^{(t)}$ with the closest mean:

$$S_i^{(t)} = \left\{ x_p : \left\| x_p - m_i^{(t)} \right\|^2 \le \left\| x_p - m_j^{(t)} \right\|^2 \ \text{for all } j,\ 1 \le j \le k \right\},$$

where $x_p$ is the $p$-th observation, $S_i^{(t)}$ is the $i$-th cluster in the $t$-th cycle, and $m_i^{(t)}$ is the mean of the $i$-th cluster in the $t$-th cycle.

(ii) Update step: recalculate the mean of the observations assigned to each cluster:

$$m_i^{(t+1)} = \frac{1}{\left| S_i^{(t)} \right|} \sum_{x_j \in S_i^{(t)}} x_j,$$

where $m_i^{(t+1)}$ is the mean of the $i$-th cluster in cycle $t + 1$.
When the assignment is no longer changed, the algorithm has converged, and clustering has been completed.
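The assignment and update steps can be written compactly with NumPy. The sketch below is illustrative only and is not the Python program used in this study: the function name, the choice of random observations as initial centroids, and the six sample points are assumptions made for the example.

```python
import numpy as np

def k_means(X, k=3, max_iter=100, seed=0):
    """Plain K-means; returns the cluster label of each row and the centroids."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)

    # Assumption: k distinct observations are drawn as the initial centroids.
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    labels = None

    for _ in range(max_iter):
        # Assignment step: squared Euclidean distance to every centroid.
        dist2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        new_labels = dist2.argmin(axis=1)

        # Converged once no observation changes its cluster.
        if labels is not None and np.array_equal(new_labels, labels):
            break
        labels = new_labels

        # Update step: each centroid becomes the mean of its members.
        for i in range(k):
            members = X[labels == i]
            if len(members) > 0:
                centroids[i] = members.mean(axis=0)

    return labels, centroids

# Hypothetical two-dimensional samples forming two obvious groups.
points = np.array([[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]])
labels, centroids = k_means(points, k=2)
print(labels)
print(centroids)
```

Because the result depends on the initial centroids, it is common to rerun the procedure with several seeds and keep the partition with the smallest within-cluster sum of squares.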

Empirical Analysis.
The clustering results are obtained with a Python program. See Figure 5 for details.
It can be seen from Figure 5 that the economic development of Shanghai is the best among 27 cities, followed by 13 cities such as Nanjing, Wuxi, Changzhou, and Zhoushan, while the economic development level of Hefei, Bengbu, Yancheng, and Chuzhou is relatively weak, and most of the weaker cities are from Anhui Province.

Analysis of Economic Status Based on Principal Component Analysis

Research Ideas.
Figure 6 shows the specific steps of principal component analysis (PCA). With this method, the weights are determined objectively and reasonably, which overcomes the weight-setting defect of some other evaluation methods. The higher the correlation between the indexes, the better PCA performs [19][20][21][22]. This method linearly converts multiple indexes into a new, smaller set of indexes. They are uncorrelated with each other and retain most of the information in the original index data. Geometrically, one selects the line that minimizes the average squared distance from the points to the line and then selects the next line in the same way from the directions perpendicular to the first. Repeating this process produces an orthogonal basis in which the individual dimensions of the data are uncorrelated; these basis vectors are called principal components.

Research Technique.
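The preprocessed data matrix $X$ is shown below:

$$X = \begin{bmatrix} x_{11} & x_{12} & \cdots & x_{1p} \\ x_{21} & x_{22} & \cdots & x_{2p} \\ \vdots & \vdots & & \vdots \\ x_{n1} & x_{n2} & \cdots & x_{np} \end{bmatrix}.$$

The mean vector $\bar{x}$ and covariance matrix $S$ of the data matrix are calculated:

$$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i, \qquad S = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})(x_i - \bar{x})^{T},$$

where $x_i$ is the $i$-th row of $X$ written as a column vector. The eigenvalues $\lambda$ and variance contribution rates $k$ of the covariance matrix $S$ are calculated:

$$|S - \lambda I| = 0, \qquad k_i = \frac{\lambda_i}{\sum_{j=1}^{p} \lambda_j}.$$

The contribution rates $k$ are sorted from largest to smallest and the cumulative variance contribution rate is calculated. If the cumulative variance contribution rate of the first $m$ components is greater than 85%, the first $m$ principal components $y$ can be used to explain all indexes [23][24][25][26][27]. The unit orthogonal eigenvectors $a_1, a_2, \ldots, a_p$ corresponding to the eigenvalues $\lambda$ serve as the coefficients. The expressions of the principal components $y$ are as follows:

$$\begin{aligned} y_1 &= a_{11} x_1 + a_{12} x_2 + \cdots + a_{1p} x_p, \\ y_2 &= a_{21} x_1 + a_{22} x_2 + \cdots + a_{2p} x_p, \\ &\ \ \vdots \\ y_p &= a_{p1} x_1 + a_{p2} x_2 + \cdots + a_{pp} x_p. \end{aligned}$$

The comprehensive score $f$ is calculated by the weighted product method:

$$f = \sum_{i=1}^{m} w_i y_i,$$

where $w$ is the vector obtained by normalizing the first $m$ variance contribution rates. Table 3 shows the results calculated with a MATLAB program. It can be seen from Table 3 and Figure 7 that the three principal components $y_1$, $y_2$, and $y_3$ cover most of the variable information.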
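The scoring procedure can be sketched with NumPy as follows. This is an illustrative sketch under stated assumptions, not the MATLAB program used in this study: the function name and sample data are hypothetical, the data are centered before projection, the covariance uses the $1/(n-1)$ convention, the comprehensive score is taken as the weighted sum of the retained components, and each eigenvector is oriented so that its loadings sum to a nonnegative value, because the sign of an eigenvector is otherwise arbitrary and would flip the ranking.

```python
import numpy as np

def pca_scores(X, threshold=0.85):
    """Return comprehensive scores f, component scores Y, contribution rates, and m."""
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)                 # center each index

    S = np.cov(Xc, rowvar=False)            # covariance matrix, 1/(n-1) convention
    eigvals, eigvecs = np.linalg.eigh(S)    # eigh: S is symmetric

    # Sort eigenpairs from largest to smallest eigenvalue.
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]

    # Assumption: orient each eigenvector so that its loadings sum to >= 0.
    signs = np.sign(eigvecs.sum(axis=0))
    signs[signs == 0] = 1.0
    eigvecs = eigvecs * signs

    # Variance contribution rates and the smallest m whose cumulative
    # contribution exceeds the threshold.
    contrib = eigvals / eigvals.sum()
    cumulative = np.cumsum(contrib)
    m = int(np.searchsorted(cumulative, threshold, side="right")) + 1
    m = min(m, X.shape[1])

    Y = Xc @ eigvecs[:, :m]                 # principal component scores
    w = contrib[:m] / contrib[:m].sum()     # normalized contribution rates
    f = Y @ w                               # comprehensive score per city
    return f, Y, contrib, m

# Hypothetical data: five cities, three indexes (illustration only).
data = np.array([
    [1.0, 2.0, 0.1],
    [2.0, 4.0, 0.0],
    [3.0, 6.0, 0.1],
    [4.0, 8.0, 0.0],
    [5.0, 10.0, 0.1],
])
f, Y, contrib, m = pca_scores(data)
print(m, np.argsort(f)[::-1])   # retained components, cities ranked by score
```

In this example the first two indexes are perfectly correlated, so a single component already exceeds the 85% threshold and the ranking follows that common factor.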

Analysis of Results.
Therefore, these three principal components replace the original eight indexes, reducing the original eight dimensions to three [28][29][30][31]. Table 4 shows the coefficients of the first three principal components.
The expressions of the first three principal components $y_1$ to $y_3$ follow from these coefficients. For example, the first principal component is

$$y_1 = 0.034g + 0.147x_2 + 0.269x_3 + 0.151x_4 + 0.060x_5 + 0.684x_6 + 0.526x_7 + 0.367x_8,$$

and $y_2$ and $y_3$ are formed in the same way from the coefficients in Table 4. Table 5 shows the specific values (partial). In Figure 8, the darker the color, the higher the score. According to the comprehensive scores and rankings of economic development of each city in Table 6, Shanghai ranks first with a score far ahead of the other cities, so it can be regarded as the leader of the Yangtze River Delta urban agglomeration. Most cities in Zhejiang and Jiangsu have a good level of economic development, while most cities in Anhui Province are comparatively weak, and only Hefei, the provincial capital, is in the upper-middle range.

Conclusion
Based on the above research, we find that no matter which classification method is used, Shanghai's economy is clearly ahead of that of the other cities. Further analysis with the principal component method shows that Shanghai ranks first, Ningbo and Hangzhou rank second and third, and Chizhou, Xuancheng, Bengbu, and other cities fall behind in the economic ranking. To realize the integration of the Yangtze River Delta, it is necessary to reform these lagging cities and to improve their economic strength across the various economic indicators [32][33][34]. The novel feature of this paper is that multiple clustering and evaluation methods are used to comprehensively evaluate the level of urban development, which makes the results more objective. In research on international regional development, many factors influence the regional development level, so the index data must be selected carefully and comprehensively. At the same time, with the rationality of the method in mind, using a variety of methods to measure regional development comprehensively can help to ease international regional development imbalances and promote international cooperation. Our research provides a theoretical reference for the further construction of a community with a shared future for mankind.
Rational development planning among regions is necessary but should not be relied on blindly. Combes [35] found that urban agglomeration would hinder the economic growth of the industrial and service sectors in France, while Gardiner et al. [36] argued that different conclusions may be drawn when the relationship between spatial agglomeration and the economy is discussed without considering spatial scale. Therefore, how to use geographical advantages properly to achieve a win-win situation is worth thinking about. Based on the above research, we cautiously propose the following suggestions: Firstly, carry forward local advantages and acknowledge the differences, but adhere to overall integrated development. The gap between the regions of the Yangtze River Delta can be narrowed by strengthening the construction of social infrastructure and coordinating inter-city relations. Promote the flow of human resources to underdeveloped areas and pay attention to the siphon effect between cities, so as to change the scale disadvantage of Anhui and other regions and allow them to benefit more from the core cities.
Secondly, let local development drive overall development. Actively promote the construction of demonstration areas, especially for high-tech industries and key industries. By taking the lead in pilot projects at the local level, a new model of integrated development can be created while maintaining ecological balance. Adhere to sustainable, integrated development and pay attention to output efficiency, so as to drive the economic development of lagging areas.
Thirdly, break administrative boundaries. The government should improve the platform for regional cooperation and the policy service system and actively promote resource sharing. In terms of financial policy, the government should vigorously support emerging industries and industries with promising prospects, help enterprises connect with market resources, and encourage upstream and downstream cooperation.

Data Availability
The data used to support the findings of this study are included within the article.

Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this paper.