Design of a Cultural Tourism Passenger Flow Prediction Model in the Yangtze River Delta Based on Regression Analysis

Cultural tourism has gained much attention in the last decade and has promoted the preservation of a variety of tangible and intangible cultural assets. To accurately predict cultural tourism passenger flow in the Yangtze River Delta and improve its economic benefits, this paper designs a prediction model of cultural tourism passenger flow in the Yangtze River Delta based on regression analysis. Taking the competitiveness of passenger flow as the core, this paper selects 28 indexes from four aspects, namely cultural tourism brand resources, cultural performance and creativity, cultural tourism support and protection, and urban tourism market income, to build an evaluation index system of the factors influencing passenger flow. Principal component analysis is used to reduce the many correlated factors to a few uncorrelated factors, eliminating the multicollinearity caused by too many independent variables; on this basis, a principal component regression model is constructed, and the coefficient of determination is used to test the model fit. Taking 15 cultural tourism cities in the Yangtze River Delta as the research object, the results show that the designed model fits well, with an average error of only 0.41%, which meets the needs of predicting cultural tourism passenger flow in the Yangtze River Delta. After the prediction model is applied, the foreign exchange earnings of each cultural tourism city increase by more than 12%. The study thus shows good results.


Introduction
The improvement of living standards has driven the gradual rise of the tourism industry. Generally speaking, tourism cities have a developed commercial economy, along with many tourist attractions and classic snacks, which attract large numbers of tourists. The development and prosperity of cultural tourism in the near future will result from the dual role of industry consciousness and system design [1]. Promoting the consciousness, self-confidence, and self-improvement of regional cultural construction is not only the main theme of current cultural construction but also the fundamental driving force for the rapid development of cultural tourism [2]. Different cities may share the same mode and path of tourism development, but the connotation of each city's cultural tourism is unique and competitive. A tourism development mode dominated by cultural tourism often has more vitality and durability. The Yangtze River Delta region is an important growth pole of China's economic development and one of the regions with the highest level of urbanization in China [3,4]. As an important strategic base of China's tourism industry, the Yangtze River Delta region is not only the main force and vanguard of China's tourism development but also the most successful region of China's cross-regional tourism cooperation and a region with great potential to become a world-class tourism destination. In implementing the national strategy of regional integration in the Yangtze River Delta, cultural tourism needs to play a greater role. The Yangtze River Delta needs a complete image when representing China in global competition and cooperation. Cultural tourism is the advance force and core backbone of the region's international image publicity. At present, regional integration has become a consensus for China's future development.
Through resource sharing, complementary advantages, and market interaction, we can break regional, spatial, and institutional barriers, create barrier-free tourism areas, and realize the all-round development of the social economy. At the same time, the Yangtze River Delta region is rich in tourism resources, with a complete range of natural and cultural landscapes. It is an important concentrated distribution area of tourism resources in China and has inherent advantages for developing regional tourism integration. Therefore, it has always attracted a large number of tourists.
Tourist flow refers to the number of people and the flow pattern of tourists from the source to the destination [5]. According to the different travel time, it can be divided into four types: daily passenger flow, monthly passenger flow, quarterly passenger flow, and annual passenger flow. Tourist flow is an important indicator of the development level of tourism industry, an important part of tourism planning by national or regional tourism authorities, an effective guarantee to improve the quality of tourism products, and an important basis for the development of tourism resources and the construction of reception facilities such as hotels. Accurate prediction of tourist flow is related to the successful operation of an international and regional tourism project, which will directly affect the scientific decision making of the tourism project and is an important part of urban tourism development planning [6,7].
There are many factors that affect cultural tourism passenger flow in the Yangtze River Delta, among which economy, politics, education level, resources, and transportation all have an impact [8]. Taking transportation as an example, places with convenient transport links to the Yangtze River Delta send a large tourist flow to it. In addition, psychological factors and personal preferences also affect the tourist flow of the Yangtze River Delta. Therefore, many factors must be fully considered when analyzing the tourist flow in the Yangtze River Delta.
Scientific and reasonable prediction of tourist flow is of great guiding significance for the efficient use of tourism resources and for local economic development. It can also help the government formulate tourism development plans and tourism emergency plans, improve the quality and level of tourism services, and, in turn, improve tourist satisfaction and sense of experience [9]. Interdisciplinary research on "tourism passenger flow" and "passenger flow prediction" has developed rapidly. It has extended deep into systems science, computer science and technology, and other disciplines and has produced a number of interdisciplinary themes.
At present, the methods of predicting tourist flow are mainly divided into quantitative prediction and qualitative prediction. Qualitative prediction is generally based on qualitative analysis combined with empirical judgment [10] and has low prediction accuracy. Quantitative prediction establishes a quantitative prediction model through mathematical methods; it is widely used and has high prediction accuracy. There are three kinds of quantitative models for tourism passenger flow prediction, namely, the permeability model, the gravity model, and the GM (1, 1) model. The permeability model is an intuitive model with strong subjectivity [11]; it takes the interviewees' willingness to visit as the main data and combines it with the population base and carrying coefficient to make an intuitive inference of the passenger flow of tourist attractions. However, because willingness to visit carries a regional bias, the permeability model is only used for interval estimation of the willingness to visit and the passenger flow. The gravity model is a method commonly used internationally to predict passenger flow under normal conditions [12]. However, in actual prediction, this method considers only a single factor or a small number of factors affecting passenger volume, so it lacks comprehensiveness and yields biased passenger volume predictions. In practical applications of the GM (1, 1) model, too many factors may be considered, resulting in multicollinearity among factors [13]; the regression coefficients then cannot pass the significance test, and the signs of some regression coefficients are even inconsistent with their actual economic meaning.
In view of the problems with the abovementioned models, a prediction model of cultural tourism passenger flow in the Yangtze River Delta based on regression analysis is designed on the basis of the gravity model and the permeability model. In selecting the parameters of the multiple regression model, the model comprehensively considers the many factors that affect cultural tourism passenger flow in the Yangtze River Delta. Compared with the gravity model and the GM (1, 1) model, it therefore considers the factors affecting passenger flow more comprehensively, overcomes the influence of multicollinearity, and improves the accuracy of passenger volume prediction. The paper is organized as follows: Section 2 presents the materials and methods in detail, Section 3 presents the results, and Section 4 gives the conclusions.

Materials and Methods
The following sections briefly present the materials and methods used in this study.

Index Selection.
Compared with other traditional industries, cultural tourism has unique characteristics in its development connotation, development context, and social effects [14,15]. First, governments around the world vigorously promote the cultural tourism industry, making it an increasingly important way to cultivate national cultural identity and national identity; cultural tourism thus shoulders the function of nurturing citizens. Second, the system structure and operation mechanism of cultural tourism embody the characteristics of a binary compound system that integrates culture and tourism. A correct understanding of the concept of culture and the reasonable mining of tourism elements in the cultural field are the primary problems in transforming cultural tourism from resources into products, which provides an important reference for identifying the elements of cultural tourism competitiveness. Third, cultural tourism reflects the close interaction between host and guest. Cultural tourism space is the combination of the geographical, cultural, and social space of a cultural tourism destination. It is a field of host-guest interaction with a clear geographical extent, which also defines a clear spatial scope for clarifying the system of cultural tourism competitiveness. Therefore, the development of theory and practice shows that more scientific index screening and system construction can be carried out for the competitiveness of urban cultural tourism passenger flow in the Yangtze River Delta.
Based on the theory of cultural tourism passenger flow competitiveness in the Yangtze River Delta region, the following three principles are followed in designing the specific factor indexes:
(i) Combination of comprehensiveness and operability. This paper analyzes the connotation of cultural tourism development, the structure of tourist flow competitiveness, and the influencing factors in the Yangtze River Delta, so as to present the overall situation of the tourist flow competitiveness of cultural tourism in the Yangtze River Delta as fully as possible. At the same time, the feasibility and reliability of the index data sources are fully considered. Given the different city grades in the Yangtze River Delta, the index system quotes some necessary total indexes reflecting the scale effect but uses intensity indexes wherever possible.
(ii) Systematic and hierarchical design. The emphases of the cultural and tourism industries differ greatly across the regions of the Yangtze River Delta, so the evaluation combines the competitiveness of cultural tourism passenger flow with regional reality and considers the actual situation of each region as far as possible, so that these differences are reflected in the evaluation index system. The hierarchy is reflected in the three-level structure of the indexes: besides the general goal level and the target decomposition level, competitiveness is explained and evaluated by element indexes [16].
(iii) Consistency of evaluation objectives and methods. This paper focuses on the theme of evaluating cultural tourism passenger flow competitiveness in the Yangtze River Delta region, designs the indexes around this theme as far as possible, and keeps the direction and goal of the evaluation system clearly defined. In terms of evaluation methods, after the basic problems are solved through qualitative methods, mathematical methods are used to calculate and screen the indexes needed by the research according to the quantitative relationships among the indicators, and comprehensive measurement and evaluation are then carried out.
Based on the abovementioned competitiveness theory and index system design principles, this paper decomposes the evaluation objective of cultural tourism passenger flow competitiveness in the Yangtze River Delta region into four subobjectives: cultural tourism brand resources, cultural performance and creativity, cultural tourism support and protection, and urban tourism market income. On this basis, 28 quantitative indexes are designed to form the evaluation index system of the factors influencing cultural tourism passenger flow in the Yangtze River Delta region, which is shown in Table 1.

Construction of the Prediction Model.
Before the model is established, let the passenger volume be $y$, the sample size be $n$, and the observed value of the $i$-th index be $x_i$ ($i = 1, 2, \ldots, 27$).

Multicollinearity Diagnosis.
The dependent variable $y$ and the independent variables $x_1, x_2, \ldots, x_{27}$ are used to establish the regression model. SPSS statistical software is used to select the variables by the backward regression method, and $p$ variables enter the regression analysis model; these are denoted $x_{s1}, x_{s2}, \ldots, x_{sp}$ ($p \le n$) and are called initialization variables.
The variance inflation factor (VIF) of each variable is read from the regression results; if $\mathrm{VIF}_i \ge 10$, there is serious multicollinearity among the independent variables.
Multicollinearity strongly affects the regression coefficients [17]; it can be handled by principal component analysis.
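The VIF diagnosis described above can be sketched in a few lines. The paper obtains its VIF values from SPSS; the sketch below is not that procedure but a minimal NumPy illustration of the standard definition $\mathrm{VIF}_i = 1/(1 - R_i^2)$, where $R_i^2$ comes from regressing the $i$-th variable on the others. The function name `variance_inflation_factors` and the synthetic data are assumptions made for illustration only.

```python
import numpy as np

def variance_inflation_factors(X):
    """Return VIF_i = 1 / (1 - R_i^2) for each column of X (samples in rows)."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    vifs = []
    for i in range(p):
        target = X[:, i]
        others = np.delete(X, i, axis=1)
        # Regress column i on the remaining columns plus an intercept.
        A = np.column_stack([np.ones(n), others])
        coef, *_ = np.linalg.lstsq(A, target, rcond=None)
        resid = target - A @ coef
        ss_res = float(resid @ resid)
        ss_tot = float(((target - target.mean()) ** 2).sum())
        r2 = 1.0 - ss_res / ss_tot
        vifs.append(np.inf if r2 >= 1.0 else 1.0 / (1.0 - r2))
    return np.array(vifs)

# Hypothetical data (not from the paper): x3 is almost x1 + x2, x4 is unrelated.
rng = np.random.default_rng(0)
x1 = rng.normal(size=50)
x2 = rng.normal(size=50)
x3 = x1 + x2 + rng.normal(scale=0.01, size=50)
x4 = rng.normal(size=50)
vif = variance_inflation_factors(np.column_stack([x1, x2, x3, x4]))
serious = vif >= 10  # threshold used in the text
```

In this synthetic case the first three variables are flagged as seriously collinear while the fourth is not, which is the situation that motivates the principal component step.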

Principal Component Analysis.
(1) Data standardization: to eliminate the influence of different orders of magnitude and dimensions, the original data must be standardized. The standardization formula is as follows:
$$x_{ij}' = \frac{x_{ij} - \bar{x}_j}{s_j}, \tag{1}$$
where $x_{ij}'$ is the standardized data and $\bar{x}_j$ and $s_j$ ($j = 1, 2, \ldots, n$) represent the mean value and standard deviation of the $j$-th index sample, respectively.
(2) Calculation of the correlation coefficient matrix: after the original data are processed, the standardized data matrix $(x_{ij}')_{p \times n}$ is obtained, and the corresponding correlation coefficient matrix is calculated:
$$R = (r_{jk})_{n \times n}. \tag{2}$$
In formula (2), $R$ is a symmetric matrix of order $n$, and $r_{jk}$ is the correlation coefficient between the $j$-th and $k$-th indexes.
(3) Calculation of the eigenvalues and eigenvectors of the correlation coefficient matrix $R$: the eigenvalues $\lambda_i$ ($i = 1, 2, \ldots, n$) of $R$ and their corresponding eigenvectors $u_i$ ($i = 1, 2, \ldots, n$) are solved, with $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_n$, where $\lambda_i$ is the variance of the principal component $F_i$; the greater the variance, the greater the contribution to the total variance [18].
(4) Calculation of the contribution rate and determination of the principal components:
$$e_i = \frac{\lambda_i}{\sum_{k=1}^{n} \lambda_k}. \tag{3}$$
Formula (3) defines the contribution rate of principal component $F_i$, and $\sum_{i=1}^{m} e_i$ is the cumulative variance contribution rate of the first $m$ principal components. Generally, the first $m$ principal components whose cumulative contribution rate is greater than or equal to 85% are selected for comprehensive analysis. In this way, $n$ factors are reduced to $m$ principal components, and the main factors are selected [19].
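The four steps above (standardization, correlation matrix, eigen-decomposition, and selection by cumulative contribution rate) can be sketched as follows. This is a minimal NumPy illustration under stated assumptions, not the SPSS procedure used in the paper: the function name `principal_components`, the sample standard deviation (`ddof=1`), and the synthetic data are all hypothetical choices, and the sketch assumes no index is constant (which would give a zero standard deviation).

```python
import numpy as np

def principal_components(X, threshold=0.85):
    """PCA on the correlation matrix; keep the first m components whose
    cumulative contribution rate reaches `threshold` (85% in the text)."""
    X = np.asarray(X, dtype=float)
    # Step 1: standardize each index (column) to zero mean and unit variance.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # Step 2: correlation coefficient matrix R.
    R = np.corrcoef(Z, rowvar=False)
    # Step 3: eigenvalues/eigenvectors of the symmetric matrix R, sorted descending.
    eigvals, eigvecs = np.linalg.eigh(R)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Step 4: contribution rates e_i and the smallest m reaching the threshold.
    contrib = eigvals / eigvals.sum()
    cumulative = np.cumsum(contrib)
    m = min(int(np.searchsorted(cumulative, threshold)) + 1, len(contrib))
    scores = Z @ eigvecs[:, :m]  # principal component values F_1..F_m
    return scores, eigvecs[:, :m], contrib, m

# Hypothetical data: five indexes driven by two latent factors.
rng = np.random.default_rng(0)
f1, f2 = rng.normal(size=(2, 60))
X_demo = np.column_stack([
    f1,
    0.9 * f1 + 0.1 * rng.normal(size=60),
    f2,
    0.8 * f2 + 0.2 * rng.normal(size=60),
    f1 + f2,
])
scores, loadings, contrib, m = principal_components(X_demo)
```

Because the eigenvectors of $R$ are orthogonal, the retained component scores are mutually uncorrelated, which is what removes the multicollinearity before regression.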

Principal Component Regression Model.
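Multiple regression analysis is conducted between the principal components $F_r$ ($r = 1, 2, \ldots, m$) and the dependent variable $Y$ (the standardized value of $y$) to obtain the standardized regression equation:
$$\hat{Y} = \sum_{r=1}^{m} B_r F_r. \tag{4}$$
In formula (4), $B_r$ is the standardized partial regression coefficient of the $r$-th principal component $F_r$. Substituting the expression of each principal component in terms of the standardized variables $x_{sj}'$ into formula (4) gives
$$\hat{Y} = \sum_{j=1}^{p} b_j' x_{sj}'. \tag{5}$$
In formula (5), $b_j'$ is the $j$-th standardized partial regression coefficient of the standardized regression equation.
Through the abovementioned analysis, we obtain the principal component regression model as follows:
$$\hat{y} = b_0 + \sum_{j=1}^{p} b_j x_{sj}, \tag{6}$$
$$b_j = b_j' \sqrt{\frac{L_{yy}}{L_{x_j x_j}}}, \tag{7}$$
$$b_0 = \bar{y} - \sum_{j=1}^{p} b_j \bar{x}_{sj}. \tag{8}$$
In the abovementioned formulas, $b_j$ is the $j$-th partial regression coefficient of the general linear regression equation; $L_{yy}$ is the sum of squares of deviations of $y$; $L_{x_j x_j}$ is the sum of squares of deviations of $x_{sj}$; $\bar{y}$ is the mean value of $y$; $\bar{x}_{sj}$ is the mean value of $x_{sj}$; and $b_0$ is the constant of the general linear regression equation.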
By substituting the observed values of each index in the forecast year into formula (6), the cultural tourism passenger flow of the Yangtze River Delta in the forecast year can be obtained.
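The principal component regression procedure of this subsection can be illustrated with a short NumPy sketch. It is a minimal version under stated assumptions rather than the paper's SPSS workflow: the function name `pcr_fit`, the choice of `m`, and the synthetic data are hypothetical. The back-transformation uses the fact that $\sqrt{L_{yy}/L_{x_j x_j}}$ equals the ratio of the sample standard deviations of $y$ and $x_{sj}$.

```python
import numpy as np

def pcr_fit(X, y, m):
    """Principal component regression: regress standardized y on the first m
    principal components, then map back to coefficients on the raw variables."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean, x_std = X.mean(axis=0), X.std(axis=0, ddof=1)
    y_mean, y_std = y.mean(), y.std(ddof=1)
    Z = (X - x_mean) / x_std          # standardized independent variables
    Y = (y - y_mean) / y_std          # standardized dependent variable
    eigvals, eigvecs = np.linalg.eigh(np.corrcoef(Z, rowvar=False))
    U = eigvecs[:, np.argsort(eigvals)[::-1][:m]]  # loadings of the first m components
    F = Z @ U                         # principal component values
    # Standardized regression of Y on F (no intercept: both are centered).
    B, *_ = np.linalg.lstsq(F, Y, rcond=None)
    b_std = U @ B                     # standardized coefficients b'_j
    b = b_std * y_std / x_std         # b_j = b'_j * sqrt(L_yy / L_xjxj)
    b0 = y_mean - b @ x_mean          # constant term
    return b0, b

# Hypothetical data: x2 is nearly collinear with x1.
rng = np.random.default_rng(0)
x1 = rng.normal(size=40)
x2 = x1 + rng.normal(scale=0.05, size=40)
x3 = rng.normal(size=40)
X_demo = np.column_stack([x1, x2, x3])
y_demo = 3.0 + x1 + x2 + 0.5 * x3 + rng.normal(scale=0.1, size=40)
b0_demo, b_demo = pcr_fit(X_demo, y_demo, m=2)
y_hat = b0_demo + X_demo @ b_demo     # prediction from raw index values
```

A useful sanity check on such a sketch is that keeping all components (`m` equal to the number of variables) reproduces ordinary least squares; dropping the low-variance components is what stabilizes the coefficients when the variables are collinear.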

Model Test.
The coefficient of determination $R^2$ is used to test the model fit. It is calculated as follows:
$$R^2 = 1 - \frac{\sum_{t} e_t^2}{\sum_{t} (y_t - \bar{y})^2}. \tag{9}$$
In formula (9), $\sum_{t} e_t^2$ is the sum of squares of residuals. In multiple linear regression, the number of variables may differ from one regression model to another [20], so the size of $R^2$ alone is not an appropriate measure of fitting quality. Therefore, the coefficient of determination adjusted for degrees of freedom, $\bar{R}^2$, is often used. The calculation formula is as follows:
$$\bar{R}^2 = 1 - \frac{n - 1}{n - p}\left(1 - R^2\right). \tag{10}$$
In formula (10), $n$ is the sample size, and $p$ is the number of regression coefficients (including the constant).
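The two fit measures can be computed directly, as in the minimal sketch below. It assumes the usual degrees-of-freedom correction written in terms of `k`, the number of explanatory variables excluding the constant (so the number of regression coefficients including the constant is `k + 1`); the function names and the example numbers are hypothetical and are not the paper's data.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((a - f) ** 2 for a, f in zip(y_true, y_pred))
    ss_tot = sum((a - mean_y) ** 2 for a in y_true)
    return 1.0 - ss_res / ss_tot

def adjusted_r_squared(r2, n, k):
    """Adjusted R^2 for n samples and k explanatory variables (constant excluded)."""
    if n - k - 1 <= 0:
        raise ValueError("need n > k + 1 to adjust for degrees of freedom")
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)

# Hypothetical example: 11 observations, 2 explanatory variables.
example_adj = adjusted_r_squared(0.9, n=11, k=2)
```

The adjustment penalizes extra regressors: for a fixed $R^2$, the adjusted value falls as `k` grows, which is why it is preferred when comparing models with different numbers of variables.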

Case Profile and Data Sources.
The Yangtze River Delta is an urban agglomeration with a high level of tourism development in China. It has rich cultural resources and a profound cultural heritage. Because definitions of the region's coverage differ, this paper takes as its research object the core area of the Yangtze River Delta defined in the reply of the State Council on the regional planning of the Yangtze River Delta (Guo Han (2010), No. 38), that is, Shanghai as the leader, with Nanjing, Suzhou, Wuxi, Changzhou, Zhenjiang, Nantong, Yangzhou, and Taizhou in Jiangsu Province and Hangzhou, Ningbo, Huzhou, Jiaxing, Shaoxing, Zhoushan, and Taizhou in Zhejiang Province as the two wings, a total of 16 cities. This paper uses the 10-year statistical data from 2008 to 2018 as an example. The data come from the statistical yearbooks and statistical bulletins of various cities, including the China Urban Statistical Yearbook, China Regional Economic Statistical Yearbook, China Tourism Statistical Yearbook, China Tourism Yearbook, List of Top 100 Travel Agencies, and List of National Star Hotels. Data collection adheres to the principle of combining scientific rigor, authority, standardization, and data availability. When data from different channels disagree, the figures from the higher-level government department prevail.

Data Processing.
The collected data come from the statistical yearbooks and statistical bulletins of various cities and span 2008 to 2018. Because of the snow disaster, earthquake, financial crisis, and other major events in 2008, cultural tourism reception in the Yangtze River Delta region was seriously affected, and the number of tourists decreased significantly. To improve the accuracy of the prediction model, the data of cultural tourism passenger flow in the Yangtze River Delta for 2008 and 2009 are revised.
Linear interpolation is used to correct the data. First, the starting year $a_1$ and the ending year $a_2$ suitable for linear interpolation are selected, and the passenger flows in the starting and ending years are denoted by $y_1$ and $y_2$, respectively. The common difference $d$ is determined by formula (11):
$$d = \frac{y_2 - y_1}{a_2 - a_1}. \tag{11}$$
The correction value is calculated by the interpolation equation:
$$y_n = y_1 + (n - a_1)\,d. \tag{12}$$
In formula (12), $n$ is the year to be corrected and $y_n$ is the correction value for year $n$.
According to the abovementioned method, the starting year of linear interpolation is 2010, and the ending year is 2013. By substituting the data of these two years into formula (11) and formula (12), we can get the revised value of cultural tourism passenger flow in the Yangtze River Delta in 2008 and 2009. Table 2 shows the revised tourist flow of cultural tourism in the Yangtze River Delta from 2008 to 2018.
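The correction step can be illustrated with a short sketch. It assumes the linear form described above, a constant yearly difference between the starting and ending years that is extended to the years being corrected; the function names and the passenger flow figures are hypothetical and are not the values in Table 2.

```python
def common_difference(a1, y1, a2, y2):
    """Yearly change d implied by the starting year (a1, y1) and ending year (a2, y2)."""
    return (y2 - y1) / (a2 - a1)

def corrected_value(year, a1, y1, d):
    """Value on the straight line through (a1, y1) with slope d at the given year."""
    return y1 + (year - a1) * d

# Hypothetical passenger flows (arbitrary units) for the starting and ending years.
a1, y1 = 2010, 100.0
a2, y2 = 2013, 130.0
d = common_difference(a1, y1, a2, y2)
revised = {year: corrected_value(year, a1, y1, d) for year in (2008, 2009)}
```

Note that, with 2010 and 2013 as the reference years, the corrected years 2008 and 2009 lie outside the reference interval, so the same straight line is being extended backward rather than interpolated between the two points.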

Multicollinearity Diagnosis.
Taking the statistical data of the Yangtze River Delta from 2008 to 2018 as an example, this paper performs regression analysis on the cultural tourism passenger flow and the 28 influencing factors in the Yangtze River Delta and uses SPSS statistical software to select variables by the backward regression method. Finally, 9 variables, $X_7$, $X_8$, $X_{15}$, $X_{16}$, $X_{17}$, $X_{18}$, $X_{20}$, $X_{22}$, and $X_{23}$, enter the regression model, and their VIF values are greater than 10, which indicates that serious collinearity remains among the variables. Therefore, it is necessary to use principal component regression to simplify the analysis. On the premise of retaining all or most of the original information, the abovementioned interrelated variables are transformed into a few independent or unrelated variables, and these variables are then integrated to establish a regression model.
Principal component analysis is performed on the variables $X_7$, $X_8$, $X_{15}$, $X_{16}$, $X_{17}$, $X_{18}$, $X_{20}$, $X_{22}$, and $X_{23}$. Using SPSS statistical software, the eigenvalues and eigenvectors are obtained. The cumulative variance contribution rate of the first two eigenvalues reaches 91.041%. It is generally believed that the effective information is retained when the cumulative contribution rate of the principal components reaches 85%. Therefore, this paper only needs to take the first two principal components to reflect most of the information in all the indexes. The standardized regression equation is obtained by multiple regression analysis between the evaluation values of $F_1$ and $F_2$ and the dependent variable $Y$.

Model Validation.
Through the test formulas (9) and (10), we obtain $R^2 = 0.984$ and $\bar{R}^2 = 0.859$, showing that formula (17) has a high degree of fit and can make a reasonable forecast of passenger volume. By substituting the values of $X_7$, $X_8$, $X_{15}$, $X_{16}$, $X_{17}$, $X_{18}$, $X_{20}$, $X_{22}$, and $X_{23}$ from 2008 to 2018 into formula (17), the predicted passenger volume of each year can be obtained. The comparison between the predicted values and the actual values (Table 2) is shown in Figure 1. It can be seen from Figure 1 that the predicted values of the principal component regression model fit the actual values well, with a highest error of 1.23%, a lowest error of 0.01%, and an average error over the 10 years of only 0.41%, which basically meets the needs of predicting cultural tourism passenger flow in the Yangtze River Delta.

Model Performance Analysis.
To analyze the application performance of the model, this paper uses the method to predict the cultural tourism passenger flow in the Yangtze River Delta in 2020 (Shanghai as the leader, with Nanjing, Suzhou, Wuxi, Changzhou, Zhenjiang, Yangzhou, and Taizhou in Jiangsu Province and Hangzhou, Ningbo, Huzhou, Jiaxing, Shaoxing, Zhoushan, and Taizhou in Zhejiang Province as the two wings, a total of 15 cities). MAPE (mean absolute percentage error, which directly reflects the quality of the prediction), MAE (mean absolute error; the smaller the value, the smaller the error), RMSE (root mean square error; the smaller the value, the smaller the error), and EC (equalization coefficient; the higher the value, the better the fit) are taken as evaluation indexes to verify the performance of the model. The calculation formula of each evaluation index is as follows:
$$\mathrm{MAPE} = \frac{1}{V} \sum_{t=1}^{V} \frac{\left| C_p(t) - C_r(t) \right|}{C_r(t)} \times 100\%,$$
$$\mathrm{MAE} = \frac{1}{V} \sum_{t=1}^{V} \left| C_p(t) - C_r(t) \right|,$$
$$\mathrm{RMSE} = \sqrt{\frac{1}{V} \sum_{t=1}^{V} \left( C_p(t) - C_r(t) \right)^2},$$
$$\mathrm{EC} = 1 - \frac{\sqrt{\sum_{t=1}^{V} \left( C_p(t) - C_r(t) \right)^2}}{\sqrt{\sum_{t=1}^{V} C_p(t)^2} + \sqrt{\sum_{t=1}^{V} C_r(t)^2}}.$$
In the abovementioned formulas, $C_p(t)$, $C_r(t)$, and $V$ are the predicted output value, the measured value of the cultural tourism passenger flow of each city in 2020, and the number of predicted samples, respectively. Using these indexes, the prediction accuracy of the proposed model is compared as fairly as possible with that of the comparison models (the prediction model based on the gravity model and the prediction model based on GM (1, 1)). The prediction results of the proposed model are shown in Table 3. The results in Table 3 are compared with those obtained by the two comparison models, and the comparison is shown in Table 4.
Analysis of the prediction performance of the three models in Table 4 shows that the evaluation indexes of the proposed model are significantly better than those of the two comparison models, and the MAPE is reduced by more than 50 compared with the comparison models, which indicates that the proposed model has higher prediction accuracy. The running time (the larger the value, the higher the complexity of the method) is also used as an evaluation index of prediction performance. The running times of the proposed model and the two comparison models in the evaluation process are compared, and the results are shown in Table 5.
Analysis of Table 5 shows that the running times of the proposed model and the GM (1, 1) model are significantly shorter than that of the gravity model. The average running time of the proposed model is 0.71 s, and the average running time of the GM (1, 1) model is 0.74 s; there is no significant difference between the two. Combined with the data in Table 4, it can be concluded that the proposed model has a significant performance advantage over the comparison models.
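The four evaluation indexes used in this subsection can be computed as in the sketch below. MAPE, MAE, and RMSE follow their standard definitions; the EC expression is the form commonly used for the equalization coefficient in passenger flow prediction studies and is an assumption here, since it is not confirmed by the source text. The function names and sample values are hypothetical.

```python
import math

def mape(pred, real):
    """Mean absolute percentage error, in percent."""
    return 100.0 * sum(abs(p - r) / r for p, r in zip(pred, real)) / len(real)

def mae(pred, real):
    """Mean absolute error."""
    return sum(abs(p - r) for p, r in zip(pred, real)) / len(real)

def rmse(pred, real):
    """Root mean square error."""
    return math.sqrt(sum((p - r) ** 2 for p, r in zip(pred, real)) / len(real))

def ec(pred, real):
    """Equalization coefficient (assumed form): 1 means a perfect fit."""
    num = math.sqrt(sum((p - r) ** 2 for p, r in zip(pred, real)))
    den = math.sqrt(sum(p * p for p in pred)) + math.sqrt(sum(r * r for r in real))
    return 1.0 - num / den

# Hypothetical predicted and measured passenger flows for two cities.
predicted = [110.0, 190.0]
measured = [100.0, 200.0]
scores = {
    "MAPE": mape(predicted, measured),
    "MAE": mae(predicted, measured),
    "RMSE": rmse(predicted, measured),
    "EC": ec(predicted, measured),
}
```

MAPE is scale-free and therefore comparable across cities of different size, whereas MAE and RMSE are in the units of passenger flow, with RMSE weighting large errors more heavily.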

Application Test.
This paper compares the foreign exchange earned by cultural tourism cities in the Yangtze River Delta after using this model to predict the passenger flow with the foreign exchange earned before using this model (the average of the previous three years is taken as the standard value). The results are shown in Figure 2, in which the numbers 1-15 represent the cultural tourism cities in the Yangtze River Delta.
According to the analysis of Figure 2, after the proposed model is used to predict the passenger flow, the foreign exchange earnings of every city increase to varying degrees, and the rate of increase stays above 12%, which shows that the economic benefits of cultural tourism in the Yangtze River Delta can be significantly improved by using this model.

Conclusions
Cultural tourism has gained much attention in the last decade and has promoted the preservation of a variety of tangible and intangible cultural assets. The use of cultural assets for tourism development has generated various debates, such as whether the intangible values of cultural assets, including their educational, aesthetic, and historical values, can be properly conveyed in order to attract tourists. To accurately predict cultural tourism passenger flow in the Yangtze River Delta and improve its economic benefits, this paper considers the many factors that affect cultural tourism passenger flow and designs a prediction model of cultural tourism passenger flow in the Yangtze River Delta based on regression analysis. The experimental analysis shows that, in practical application, this model can accurately predict cultural tourism passenger flow in the Yangtze River Delta and significantly increase the region's foreign exchange earnings from cultural tourism.

Data Availability
The data used to support the findings of this study are included within the article.

Conflicts of Interest
The author declares that there are no conflicts of interest regarding the publication of this paper.