Analysis of Rural Tourism Demand Characteristics and Experience Differences Based on Association Rule Mining

Whether the characteristics of rural tourism have changed provides the scale and basis for judging whether the rural tourism landscape has changed, but it cannot by itself indicate the impact of that change. The impact is relative to the rural tourism landscape objective. Determining rural tourism landscape objectives provides a baseline for judging the direction and impact of changes in rural tourism characteristics and is a prerequisite for rural tourism landscape actions. The quality objective of the rural tourism landscape is mainly determined by the internal processes of, and the external demand on, the rural tourism landscape. The frequent pattern growth algorithm FP-Growth can find frequent item sets without generating candidate item sets. The core of the algorithm is the frequent pattern tree FP-tree, which can efficiently compress the transaction database. Building on the advantages of the FP-tree, this paper proposes an improved algorithm, FP_Apriori, based on frequent pattern trees. The algorithm projects the entire transaction database onto the FP-tree, avoiding a lot of I/O overhead. In addition, we propose a more directional and targeted search strategy for the FP-tree, which reduces the running time of the algorithm, and we use the principle of the Mapping_Apriori algorithm to prethin the frequent item sets. This article uses text analysis of network data to uncover the characteristics and internal structure of rural tourism demand. The rural tourism market has a wide range of needs at multiple levels, and traditional research methods such as questionnaires are limited in sample size and sample structure. With network data, text mining, and other statistical analysis methods, in-depth empirical research on the characteristics and spatial structure of rural tourism in a certain region can cover more research groups.
The research confirms that the results of using text analysis and questionnaire analysis on the perception of destination image are relatively consistent. Therefore, the network text analysis method is an effective tool to study the demand structure of the rural tourism market.


Introduction
Data mining extracts hidden, previously unknown relationships, patterns, or trends that have potential value to decision makers from large-scale heterogeneous or multisource data and uses this knowledge to establish decision-support or prediction models [1]. It is the process of using various analysis tools to discover models or potential relationships in massive amounts of data. Because data mining can find useful information in data and make enterprises more profitable, its application is becoming more and more common. The commonly used data mining methods include classification, regression analysis, clustering, association rules, time series patterns, and deviation analysis, which mine data from different perspectives [2]. Among them, association rule mining is a classic method that can discover potential interrelationships in data, and these relationships do not need to be directly expressed in the database [3]. The task of association analysis is to discover the degree of association, or the association rules, between things [4].
In recent years, some places in China have recognized the development law of local rural tourism [5]. After fully understanding the spatial laws of their rural tourism, cities have replanned, adjusted, and optimized the spatial pattern of rural tourism and proposed strategies or countermeasures for the development of rural tourism space according to local conditions [6]. Studying the spatial change of rural tourism helps to reveal the process of agglomeration and diffusion of tourism economic activities in a region; to find the core nodes, development axes, or development zones; and to establish a regional tourism space that integrates social, cultural, economic, and environmental resources. By constructing rural tourism centers of different levels based on the centrality of the rural tourism spatial network and formulating corresponding development strategies, it is possible to coordinate the rational development of regional rural tourism and to integrate tourism, leisure, specialty food production and consumption, and e-commerce [7]. Studying the spatial changes of rural tourism also helps in building new tourism infrastructure and selectively retaining existing facilities, so as to realize scientific rural tourism spatial planning [8]. Whether the task is the transformation and construction of rural tourism consulting centers, tourism distribution centers, or tourism villages with rural characteristics, the spatial layout and planning of rural tourism always constrain the construction of these centers and characteristic villages and towns and shape how effectively the different types of centers are used [9].
To improve the Apriori algorithm, this paper proposes the FP_Apriori algorithm based on the frequent pattern tree. The improvements are mainly reflected in three aspects. First, because traditional algorithms scan the transaction database multiple times, the frequent pattern tree is used to store transactions, so that only one traversal scan of the transaction database is required. Second, we reduce the complexity of the frequency calculation for candidate item sets to improve the running speed and mining efficiency of the algorithm. Third, we take advantage of the properties of frequent item sets to slim down $L_{k-1}$ and remove some useless candidate item sets in advance. In addition, this paper analyzes the improvement ideas of the FP_Apriori algorithm in detail, including the optimization strategies for the FP-tree and for the frequency calculation of candidate sets, and the core steps of the algorithm. Using more than 20,000 online reviews of 50 major rural tourist attractions in a certain region, collected from 5 online platforms, this paper explores the seasonal cycle, demand attributes, supply characteristics, and supply-demand gap of the rural tourism market in that region. The empirical research reveals that demand in the rural tourism market of the region shows an obvious dual-peak pattern, that demand upgrading is insufficient, and that tourism supply is homogenized and simplified. The overall satisfaction of the market is relatively good, but the trend of demand loss is obvious.

Related Work
A key problem in association rule mining is to find all frequent item sets in the transaction set that meet the minimum support given by the user [10]. This step concentrates almost all of the computation. The original algorithm for the association rule problem is the AIS algorithm. To improve the AIS algorithm, related scholars proposed the OCD algorithm [11], which uses the combined information of the previous search to reduce the number of candidate item sets produced in the current one. Later, researchers proposed the most famous algorithm in association rule mining, Apriori, and its variants AprioriTid and AprioriHybrid to discover frequent item sets [12]. Since then, many scholars have proposed algorithms for discovering frequent item sets, but most of them are variants or improvements of the Apriori algorithm [13,14]. Because Apriori is a multipass search algorithm, for a large data collection the external memory must be read once for each pass, and the I/O overhead is very high [15]. Therefore, most of the improved algorithms focus on how to reduce the number of passes. In fact, what really affects the efficiency of Apriori-based frequent item set discovery is the calculation of the item sets and their support. If the transaction data set contains $n$ different items, an Apriori-based frequent item set discovery algorithm may in the worst case have to consider $2^n$ item sets. When $n$ is relatively large, a combinatorial explosion occurs. In fact, this will be an NP-hard problem.
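The growth of the search space can be made concrete with a short sketch. The function name `all_item_sets` and the item names in the test are illustrative assumptions, not part of the original study; the code simply enumerates every subset of a small item collection and prints how quickly $2^n$ grows.

```python
from itertools import combinations

def all_item_sets(items):
    """Return every subset of `items` (including the empty set)."""
    items = sorted(items)
    return [frozenset(c)
            for k in range(len(items) + 1)
            for c in combinations(items, k)]

# The number of item sets doubles with each additional item.
for n in (3, 10, 20):
    print(n, "items ->", 2 ** n, "item sets")
```

Even a modest vocabulary of review keywords therefore rules out exhaustive enumeration, which is why the pruning strategies discussed in this paper matter.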
The sampling algorithm randomly draws a sample from the original data set and mines association rules from the sample to reduce the number of passes. However, because the data are often unevenly distributed, random sampling cannot guarantee that every relevant pattern is captured by the sample. Although the Partition algorithm reduces the I/O burden by mining the partitions of the data collection separately and then merging the results, it actually increases the burden on the CPU. The DIC algorithm uses a dynamic counting strategy to reduce the number of passes, but its idea does not differ fundamentally from that of the Apriori algorithm, and it is also a multipass search algorithm. These algorithms generate candidate item sets while reading in transaction data and produce many unnecessary candidates, which makes them computationally intensive. Especially for massive data sets, the above algorithms achieve acceptable mining efficiency only under a high minimum support and minimum confidence or after other constraints are added. Otherwise, a combinatorial explosion of frequent item sets occurs, and the demand may even exceed the storage and computing power of the machine. Because any algorithm must calculate the item sets and their support, this calculation is what really affects efficiency. Each calculation not only takes a lot of CPU time but also involves I/O requests. Therefore, only by fundamentally reducing the generation of unnecessary item sets and the time needed to calculate their support can the number of generated item sets and the mining time be greatly reduced.
The academic community mainly studies the spatial distribution characteristics of rural tourism from the aspects of integrated destination management, location advantages, and resource environment [16]. Regarding the integrated management of tourist destinations and the optimal allocation of resources, related scholars have studied the central structure of Korean rural space and its corresponding development strategies based on the resources of rural convenience facilities [17]. Advantageous locations or traffic arteries have led to changes in the spatial pattern of rural tourism. Researchers who reviewed tourism development studies found that rural areas on the fringe of cities strongly attract day-trip tourists, and the surrounding areas attract low-level tourists [18]. Relevant scholars have studied the relationship between high-speed rail and tourist attraction between Perpignan, France, and Barcelona, Spain, and found that high-speed rail reduces transportation costs, improves the accessibility of destinations, and intensifies the spatial competition between destinations [19]. It promotes the concentration of tourism activities in Barcelona but is detrimental to Perpignan. In addition, some scholars have studied the spatial relationships of the amenity resources of cultural facilities in Korea, of the hot tub in the Appalachian Mountains, Ohio, and of the suitability of forest tourism land, and examined how superior tourism resources or a superior environment exert influence in a region [20]. The tourism policies and development planning strategies of development departments also affect the spatial layout of rural tourism.
Wireless Communications and Mobile Computing
Foreign scholars' research on spatial distribution patterns does not start directly from the spatial distribution or structure of rural tourism resources or scenic spots; instead, it approaches them indirectly through the hierarchical system, attraction conditions, development conditions, and population distribution of those resources or scenic spots [21]. Changes in rural tourism have an impact on the development positioning, industrial structure, and employment of the entire tourism industry. Research holds that tourism is a policy choice of local governments and, more generally, part of the response of rural areas to the ever-changing continental and global economy [22]. As a new economic activity that replaces agriculture, tourism has changed the agricultural structure, restructured the countryside, and changed government decisions and long-term family decisions, directly affecting the organizational structure of rural communities, environmental quality, and inter-city relations. In fact, the integration effect of rural space is very obvious. Rural space is no longer purely an area related to the production of agricultural products but is regarded as a place to stimulate new social and economic activities. It usually integrates tourism, leisure, and specialty food. The large-scale development of tourism on the periphery of remote villages can promote the transfer of the remote rural labor force and create tourism economic links with poor families. However, it is worth noting that tourism is a low-risk, low-return livelihood strategy. It can be seen that the development of rural tourism is not only a policy choice for governments facing global restructuring and local response but also a way to stimulate economic activity and address employment problems.

The Characteristics of Rural Tourism as a Spatial Analysis to Manage the Change of Rural Tourism Landscape Experience
3.1. The Overall Framework of the Classification of Rural Tourism Characteristics. In order to continuously promote the understanding and management of the rural tourism landscape, it is very important to establish a frame of reference for communicating about it. However, classifying the "rural tourism landscape" is very complicated, because this object contains multiple dimensions, as well as human perception and physical reality. The rural tourism landscape is composed of components that appear in the "view field," including landforms, water bodies, vegetation, and infrastructure. These are often referred to as rural tourism landscape layers. Although it is very important to decompose the rural tourism landscape into different layers, the landscape is integral and complex, and it is the interaction of these layers that forms it. The rural tourism landscape as a whole operates at nested scales and needs to be linked to the administrative management level. The DPSIR framework of rural tourism landscape changes is shown in Figure 1.
The classification system of rural tourism characteristics is a parallel structural relationship, which is a flexible framework that can be continuously expanded. The dimensional division is the choice of perspective for observing, understanding, and regulating the object of the rural tourism landscape. The selection of dimensionality is closely related to the goal, level, and scale. The more dimensions are divided, the more thorough and comprehensive the understanding of rural tourism landscape. Therefore, the classification system of rural tourism characteristics can be applied to different spatial structure levels and administrative management systems. Whether it is a transnational scale, a national or regional scale, or a local scale, a hierarchical nesting relationship is formed between different scales.
From the perspective of linguistics, and corresponding to the planning process, the classification of rural tourism characteristics as a specific planning language has three purposes or modes. The first is the instruction mode, which describes the observable phenomena of rural tourism characteristics; the second is the evaluation mode, which expresses subjective value judgments about the value and quality of rural tourism characteristics; the third is the prescribed mode, which requires others to act by regulating the goals and methods for the future characteristics of rural tourism. The three modes are progressive, but they should be based on common problems or organized around common goals, such as the alienation and decline of rural areas and rural tourism landscapes. First, the characteristics of rural areas and rural tourism need to be defined, analyzed, and described. Then, the quality and value of rural tourism characteristics are evaluated from the perspective of human use and environmental impact. Finally, based on the goals of sustainable development, the quality objectives of rural tourism landscapes are stipulated, so as to form corresponding guiding strategies (protection, management, or planning) to protect, strengthen, or restore the characteristics of rural tourism. The basic purpose of the classification of rural tourism characteristics is to serve the planning and management of the rural tourism landscape.

3.2. Evaluation and Decision-Making of Rural Tourism Characteristics. The evaluation stage of rural tourism characteristics is a relatively subjective process. It evaluates the quality or value of rural tourism landscapes on the basis of human goals and values, and it comprises evaluation methods and evaluation results that are used to assist decision-making, such as whether a development project is allowed. The evaluation of the rural tourism landscape focuses on the subject of evaluation. This is obviously different from the purpose of classifying and describing rural tourism characteristics. The core of the evaluation mode is that values differ greatly across development stages and social backgrounds.
The evaluation of the rural tourism landscape not only includes the characteristics of rural tourism but also involves many other aspects, such as the environmental capacity, value, sensitivity, and quality of the rural tourism landscape. In specific research, the content and methods of evaluation vary with the specific problems, the goals that must be met, and the intended users. The purpose of the evaluation is to ensure that changes in land use and the planning and design of development projects achieve a harmonious relationship with the surrounding environment, or to further strengthen or shape new rural tourism landscapes. The evaluation method should be as objective as possible, follow a clear logic, and be transparent. Unavoidably subjective evaluation (such as of the value of the rural tourism landscape) requires the participation of professionals and should be combined with the historical characteristics of the rural tourism landscape. This process should ensure the participation of stakeholders as much as possible, and the extent and nature of their participation should be clarified. The process of rural tourism feature evaluation is shown in Figure 2.
Based on the analysis and evaluation of rural tourism characteristics, the evaluation conclusions can be drawn to assist decision-making, including the determination of rural tourism landscape quality objectives for each rural tourism landscape type or region and corresponding rural tourism landscape actions. The type of evaluation conclusion is also closely related to the specific user population.
In the decision-making stage of the evaluation of rural tourism characteristics, the subjective judgment of the researcher, the application method, and the opinions of stakeholders are added to the results of investigation, analysis, and evaluation; that is, who implements the decision determines how the decision-making stage develops. This process inevitably involves subjectivity, so a clear process and clear causality are very important. In forming the evaluation and using its results, it is necessary to understand which elements are relatively objective and noncontroversial and which elements are prone to produce different opinions. Therefore, the participation of stakeholders and local people throughout the process is both necessary and very important.

3.3. Goals and Decision-Making Forms of Rural Tourism Landscape Governance. The protection, restoration, and strengthening of rural tourism characteristics are not intended to be an obstacle to the creation of new rural tourism landscapes. When evaluation and analysis conclude that certain types or areas of rural tourism characteristics are suitable for strengthening or renewal strategies, this means that there is room for large changes in such rural tourism landscapes. Similarly, many degraded and declining rural tourism landscapes urgently need active reconstruction to improve environmental quality and people's quality of life, such as old industrial areas, wetlands, and swamp areas that need to be rebuilt and renewed. The analysis of rural tourism characteristics plays a key role in identifying the rural tourism landscape areas with potential for renewal and promotion. It can assist the restoration of valuable rural tourism features that have been lost and the investigation of the creation of new rural tourism landscapes.
The method of making judgments based on characteristics will vary with the matter to be decided and the specific situation in which it must be implemented. There must be a clear logic behind the method of goal judgment. It is necessary to carefully consider the overall characteristics and key characteristics of the rural tourism landscape, its history and origin, recent changes, changing trends, and future driving forces, as well as the possible creation of new rural tourism landscapes. This helps in making rational judgments about future changes to existing features.
Development control also needs to integrate the characteristics of rural tourism. The purpose is to ensure that decision-making takes the characteristics of rural tourism into account as far as possible. The feasibility of a development project mainly depends on whether the development will have a negative impact on the rural tourism landscape in the area. When relevant agencies adopt this method, they must formulate effective evaluation criteria (based on the key rural tourism characteristics and an impact analysis of the development-sensitive rural tourism characteristics) to judge it. Development must meet the rural tourism characteristic policy objectives of the region; the developer should know how to obtain the permit, and the personnel of the relevant agencies should know how to review the development plan, so that high-quality development and construction finally result.
Figure 2: Process diagram of rural tourism feature evaluation.
The rural tourism landscape strategy method is mainly concerned with how the rural tourism landscape changes, using the assessment of rural tourism characteristics as the spatial framework for determining the development strategies (strengthening, maintaining, restoring, etc.) of different regions. The coordination and judgment processes that support the formation of these strategies need to be clear and transparent; they usually involve subjects with different values and different rural tourism landscape objectives and are closely related to the rural tourism landscape quality objectives. For example, in LEED's rural tourism landscape strategy, different development strategies, such as protection, restoration, or strengthening, are determined for the characteristic areas of rural tourism. Unlike the single protection method of the past, the document provides more flexible strategies to guide the development and change of different regions. In the rural tourism landscape policy division of Staffordshire, the development strategy is further upgraded to a policy division. For areas characterized by rural tourism, the policy objectives include five categories: innovative renewal, restoration, strengthening, maintenance, and active protection. The smallest unit, the "Land Description Unit" (LDU), serves as the spatial framework for applying the policy objectives to specific spatial extents, as the basis for future judgment and decision-making.
Environmental impact assessment or strategic environmental assessment for different types of development is an important part of the development and construction assessment of the European Union and England. Rural tourism characteristics and visual impact assessment are the main components of the research. The rural tourism landscape department and the environmental assessment department finally form an independent report. The general principles and core of the method are similar in each case. The assessment of rural tourism landscape impact should define the type of rural tourism landscape in the area where the development project is located. It analyzes how the various elements that make up the rural tourism landscape, the aesthetic and perception aspects, the characteristics, and the key characteristics that form the characteristics are affected. It assesses the consequences (profits and losses) of not carrying out development and construction and assesses the measures and scope of mitigating the impact. Rural tourism feature assessment can guide all stages of development and construction under the framework of environmental impact assessment, so as to guide construction projects to respond better to the background of rural tourism landscape.

FP_Apriori Algorithm Based on Frequent Pattern Tree FP-Tree
4.1. The Idea of FP_Apriori Algorithm. Analysis of the FP-Growth algorithm shows that a large transaction library can be efficiently compressed using the frequent pattern tree FP-tree. This avoids the huge overhead of repeatedly scanning the transaction library and directly addresses the heavy reliance of the classic Apriori algorithm on system I/O. In the traditional Apriori algorithm, the self-connection of $L_{k-1}$ yields the candidate item set $C_k$, and the database is then scanned to obtain the frequent item set $L_k$. This process has two major shortcomings: first, the candidate item set $C_k$ is too bloated and contains a particularly large number of useless candidates; second, determining the frequency of the candidates requires the entire database to be traversed repeatedly, so the operating efficiency is very poor. To address both, we can thin the frequent item set $L_{k-1}$ in advance according to the pruning strategy and optimize the frequency calculation of the generated candidate item sets. The specific optimization strategy is studied in detail below.
Combining the above improvement ideas can effectively solve and overcome the shortcomings of the traditional Apriori algorithm, thereby further improving the execution efficiency and applicability of the algorithm, and can also serve as a reference for the improvement of other association rule algorithms.
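For reference, the self-connection and pruning steps of the traditional Apriori algorithm that FP_Apriori starts from can be sketched as follows. This is a minimal illustration of the textbook procedure, not the authors' implementation; the function name `generate_candidates` is an assumption made for the example.

```python
from itertools import combinations

def generate_candidates(prev_frequent, k):
    """Build C_k from L_{k-1}: self-join, then prune by the Apriori property."""
    prev = {frozenset(s) for s in prev_frequent}
    ordered = sorted(tuple(sorted(s)) for s in prev)
    candidates = set()
    for i, a in enumerate(ordered):
        for b in ordered[i + 1:]:
            # Join only item sets that agree on their first k-2 items.
            if a[:k - 2] != b[:k - 2]:
                continue
            cand = frozenset(a) | frozenset(b)
            # Prune: every (k-1)-subset of a frequent k-item set must be frequent.
            if all(frozenset(sub) in prev for sub in combinations(cand, k - 1)):
                candidates.add(cand)
    return candidates
```

Each surviving candidate still needs its support counted against the transactions, which is the step that the FP-tree and the optimized frequency calculation are meant to speed up.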

4.2. Frequent Pattern Tree FP-Tree. First, we give the relevant definitions of the frequent pattern tree FP-tree.
Definition 1. The frequent pattern tree FP-tree is composed of a root node (null), a group of item prefix subtrees that are children of the root node, and an item header table.
Definition 2. Every node of an item prefix subtree is composed of three parts: item name, node count, and node chain. The count of a node represents the number of transactions in the database that contain the node, and the node chain points to the next node with the same item name in the frequent pattern tree; if there is no such node, it is empty. In particular, the FP_Apriori algorithm requires one modification to the FP-tree: every node of an item prefix subtree also records the names of the transactions that contain the node.
Definition 3. The item header table mainly contains two attributes: the item name and the node chain head. The node chain head points to the first node with that item name in the FP-tree.
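Definitions 1 to 3 map naturally onto a small data structure. The sketch below follows the standard FP-tree construction (count item supports, then insert each transaction with its frequent items in descending support order) and adds the per-node transaction record described in Definition 2. The names `FPNode`, `build_fp_tree`, `tids`, and the sample transactions in the test are illustrative assumptions rather than the paper's code.

```python
from collections import Counter

class FPNode:
    def __init__(self, item, parent=None):
        self.item = item        # item name (None for the root)
        self.count = 0          # transactions whose path passes through this node
        self.parent = parent
        self.children = {}      # item name -> child FPNode
        self.link = None        # node chain: next node with the same item name
        self.tids = set()       # FP_Apriori extension: ids of those transactions

def build_fp_tree(transactions, min_support):
    """transactions: dict mapping transaction id -> iterable of items."""
    support = Counter(i for items in transactions.values() for i in set(items))
    frequent = {i: c for i, c in support.items() if c >= min_support}
    root = FPNode(None)
    header = {}                 # item header table: item name -> node chain head
    tails = {}                  # last node of each chain, for constant-time append
    for tid, items in transactions.items():
        # Keep frequent items, ordered by descending support (ties by name).
        ordered = sorted((i for i in set(items) if i in frequent),
                         key=lambda i: (-frequent[i], i))
        node = root
        for item in ordered:
            child = node.children.get(item)
            if child is None:
                child = FPNode(item, parent=node)
                node.children[item] = child
                if item in tails:
                    tails[item].link = child
                else:
                    header[item] = child
                tails[item] = child
            child.count += 1
            child.tids.add(tid)
            node = child
    return root, header, frequent
```

Following a node chain from the header table visits every branch that contains a given item, which is what allows a search to be restricted to specific branches instead of the whole database.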
We generate a bit matrix containing only 0 and 1 by scanning the transaction library. In this matrix, each row represents a transaction $T_i$, each column represents an item $I_j$, and the entries 1 and 0 indicate whether or not the item $I_j$ appears in the transaction $T_i$. Suppose the transaction set is $T = \{T_1, T_2, \ldots, T_m\}$ and the collection of all items is $I = \{I_1, I_2, \ldots, I_n\}$. The matrix is then $M = (T_{ij})_{m \times n}$, where $T_{ij} = 1$ if $I_j \in T_i$ and $T_{ij} = 0$ otherwise. We insert a column into the matrix $M$ to record the number of transactions with the same item set, denoted RC, so that $RC_i$ is the number of transactions in the database whose items are exactly those of row $i$; if there are no duplicate transactions in the transaction database, this value is 1. Merging duplicate rows in this way reduces the order of the matrix very effectively (after merging, $m$ is the number of distinct rows). The support of each 1st-order item set is the sum of the nonzero elements in its column, with each row weighted by its repeat count: $\mathrm{sup}(I_j) = \sum_{i=1}^{m} RC_i \, T_{ij}$. The support of a $k$-th order item set can likewise be calculated from RC and $T_{ij}$: $\mathrm{sup}(\{I_{j_1}, \ldots, I_{j_k}\}) = \sum_{i=1}^{m} RC_i \prod_{l=1}^{k} T_{i j_l}$. According to the above definitions, the specific algorithm for constructing the FP-tree is shown in Figure 3.
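The bit matrix with its repeat-count column can be sketched in a few lines. This is an illustration under the assumption that support is the RC-weighted count of rows containing every item of the item set; the names `build_bit_matrix` and `support` and the sample transactions in the test are hypothetical.

```python
from collections import Counter

def build_bit_matrix(transactions, items):
    """Return (rows, rc): the distinct 0/1 rows and their repeat counts RC."""
    counts = Counter()
    for t in transactions:
        present = set(t)
        counts[tuple(int(i in present) for i in items)] += 1
    rows = list(counts)
    rc = [counts[r] for r in rows]
    return rows, rc

def support(rows, rc, items, item_set):
    """Sum RC over the rows in which every item of `item_set` is set to 1."""
    cols = [items.index(i) for i in item_set]
    return sum(c for row, c in zip(rows, rc) if all(row[j] for j in cols))
```

Because identical transactions collapse into a single row, the cost of each support calculation depends on the number of distinct rows rather than on the size of the raw database.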

4.3. Optimization Strategy for Frequency Calculation of Candidate Item Sets. In the improved algorithm FP_Apriori in this article, we first scan the entire transaction database to get the first-order frequent item set $L_1$ together with its support and transaction identifiers (TID), and then iterate from $L_1$ to generate $L_2, L_3, \ldots, L_k$.
After the self-connection of $L_1$, the second-order candidate set $C_2(x, y)$ is obtained, where $x$ and $y$ are the items of a second-order candidate. Before searching the frequent pattern tree FP-tree to calculate the candidate frequency, we use $L_1$ to determine which of $x$ and $y$ has the smaller support. Calculating the candidate frequency then requires scanning only the transactions that contain this item rather than the entire database; in other words, we only need to search specific branches of the FP-tree. Similarly, when $C_3(x, y, z)$ is generated through the self-connection of $L_2$, we first find the item with the least support among $x$, $y$, and $z$ and then search only the corresponding branches of the FP-tree to calculate the support of $C_3(x, y, z)$.
With this optimization strategy, the support of a candidate item set can be calculated faster and more directly, which improves the execution efficiency of the algorithm.
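The effect of pivoting on the least-supported item can be illustrated without a full FP-tree. In the sketch below, plain dictionaries stand in for the tree branches: `tid_lists` plays the role of the transaction identifiers recorded with $L_1$, and only the transactions of the rarest item are inspected. The function name and the data layout are assumptions made for the example.

```python
def candidate_support(candidate, tid_lists, transactions):
    """Count the transactions that contain every item of `candidate`.

    tid_lists: item -> set of ids of the transactions containing that item.
    transactions: transaction id -> set of items.
    """
    # Pivot on the item with the smallest support so the fewest transactions
    # have to be visited.
    pivot = min(candidate, key=lambda item: len(tid_lists[item]))
    needed = set(candidate)
    return sum(1 for tid in tid_lists[pivot] if needed <= transactions[tid])
```

The number of transactions examined is bounded by the support of the rarest item in the candidate, which is never larger than the database and is often far smaller.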

4.4. FP_Apriori Algorithm Analysis. The FP_Apriori algorithm combines the traditional Apriori algorithm with the frequent pattern tree FP-tree. Transplanting the FP-tree into the Apriori algorithm can effectively improve the execution efficiency of the algorithm. The specific improvements of FP_Apriori are mainly as follows:
(1) The FP_Apriori algorithm uses the FP-tree to compress the transaction database, which addresses the problems of the traditional algorithm: too many database scans, too heavy an I/O burden, and too low an efficiency.
(2) It adopts the strategy of prethinning $L_{k-1}$ from the Mapping_Apriori algorithm. This strategy relies on a necessary condition for the item sets in the $C_k$ generated by the self-connection of $L_{k-1}$ to be frequent: the frequency of the first $i$ items in any item set of $L_{k-1}$ is at least $k - i$. In the same way, the FP_Apriori algorithm can also thin $L_1$ in advance to avoid the generation of useless candidate items.
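One well-known necessary condition of this kind is easy to sketch: every item of a frequent $k$-item set must occur in at least $k - 1$ members of $L_{k-1}$, because each of its $(k-1)$-subsets is frequent. The code below applies that standard condition; it is offered as an assumption-laden illustration and is not necessarily the exact rule used by Mapping_Apriori or FP_Apriori. The name `prethin` is hypothetical.

```python
from collections import Counter

def prethin(prev_frequent, k):
    """Drop (k-1)-item sets that cannot be part of any frequent k-item set."""
    occurrences = Counter(item for s in prev_frequent for item in s)
    return {frozenset(s) for s in prev_frequent
            if all(occurrences[item] >= k - 1 for item in s)}
```

Thinning before the self-connection shrinks the input to the join, so fewer useless candidates are generated in the first place.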

Experiment and Analysis
5.1. Data Source. This paper collects online comment data for 50 rural tourist spots as its basic research data. Because rural tourism demand comes mainly from the local area, and O2O platforms for local life are an important channel through which residents purchase and evaluate leisure experience products, this study added Dianping.com as a data source website. Dianping.com includes small and medium-sized rural tourist spots that are not covered by other large-scale tourism websites. Each website emphasizes different aspects of word-of-mouth content, which helps ensure the richness and completeness of the information sources. The information comparison of the five text source websites is shown in Figure 4.

Data Characteristics. I read through all the samples, screen them one by one, and delete duplicate and irrelevant information. A word-of-mouth evaluation must include a comprehensive evaluation of the project experience; that is, it must cover the tourist's itinerary, travel companions, service quality, environmental quality, accommodation conditions, transportation accessibility, product evaluation, willingness to revisit, and so on. After screening and cleaning, 19,916 reviews remain. Flower port and urban vegetable garden in a certain area are national 4A-level scenic spots. The distribution of online text data among scenic spots indirectly reflects the market concentration of the rural tourism industry, that is, the high market concentration of farm products. Although there are a large number of rural tourism projects in a certain area, the well-known farms (villas) are the main force in receiving rural tourists. The number of reviews in each month is shown in Figure 5.

Figure 3: Algorithm for generating frequent pattern tree FP-tree (flowchart steps: input transaction library; set minimum support; the first traversal scans the transaction database D).
Collecting online travel word-of-mouth from multiple destinations for text analysis and cross-comparative analysis can not only refine the market demand characteristics of tourists but also reveal the supply differences between destinations and the gap between destination supply and demand. This research uses about 20,000 online reviews of 50 major rural tourist spots in a certain area, collected from 5 major online platforms, and applies text analysis, statistical testing, cluster analysis, common network analysis, and other statistical methods. We combine the research on the characteristics of rural tourism demand in a certain area with the characteristics of tourism supply of rural tourism projects to find the direction of the [...]
Although there is a time lag between the travel time and the review time of tourists, the time distribution of the online review data of scenic spots basically reflects the travel situation of the market in different time periods. The number of online reviews of major rural tourism projects in a certain area has two obvious peak periods, and there are differences in geographical space, reflecting the seasonal cycle of the industry. As shown in Figure 6, September is a peak period for online reviews. The monthly average number of reviews across scenic spots is 1,660, and the coefficient of variation is 0.204. The number of reviews in September greatly exceeds the average. The specific comments indicate that the peaks correspond to concentrated periods of outings by residents of a certain area, flower viewing during festivals, and citizens going to the suburbs to sweep graves.
The number of reviews in the remaining months is more evenly distributed, which is the result of multiple factors such as the repetitiveness of rural tourism demand, the convenience of urban transportation, and the convenience of online platforms. A stable passenger flow helps to maintain the balance of rural tourism supply.
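As a rough consistency check on the monthly figures, assume that the average is taken over 12 calendar months and that the coefficient of variation is the standard deviation $s$ divided by the mean $\bar{n}$ (the standard definition; the text does not state its formula). The 19,916 cleaned reviews then fit the reported values as follows:

```latex
\bar{n} = \frac{19916}{12} \approx 1660, \qquad
CV = \frac{s}{\bar{n}} = 0.204
\;\Rightarrow\;
s \approx 0.204 \times 1660 \approx 339
```

Under these assumptions, monthly review counts typically deviate from the average by about 339 reviews.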
As branded rural tourism projects have a high degree of popularity, I will further examine the relationship between the popularity of reviews and whether scenic spots have received relevant titles, that is, the relationship between the popularity of scenic spots and whether they are scenic spots above 3A or demonstration sites for leisure agriculture and rural tourism. I mark the scenic spots that belong to at least one of these types as "1," and mark them as "0" if they are neither, and perform an independent sample t test. The test results are shown in Table 1.
Levene's test confirms that the variances are homogeneous, so the "equal variances assumed" form of the test is used. The two-tailed P value of the independent sample t test is 0.094, which is significant at the 10% level. The test results show that the average monthly number of online reviews for scenic spots above 3A and for demonstration sites for leisure agriculture and rural tourism is higher than that for rural tourism projects that have not obtained these titles. These projects have a relatively high level of reception, which also confirms the market utility and brand utility of the various standardized demonstration sites.
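For reference, the equal-variance (pooled) t statistic used in such a test can be sketched in plain Python. The function name `pooled_t` and the sample numbers are hypothetical and are not the paper's data; the P value additionally requires the t distribution, which statistical packages supply (for example `scipy.stats.ttest_ind` with `equal_var=True`, alongside `scipy.stats.levene` for the variance check).

```python
from math import sqrt
from statistics import mean, variance


def pooled_t(group_a, group_b):
    """Return the equal-variance two-sample t statistic and its degrees
    of freedom."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    # Pooled variance: sample variances weighted by their degrees of freedom.
    pooled_var = ((n_a - 1) * variance(group_a)
                  + (n_b - 1) * variance(group_b)) / df
    t = (mean(group_a) - mean(group_b)) / sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return t, df


# Hypothetical average monthly review counts, for illustration only:
# spots marked "1" (titled) versus spots marked "0" (untitled).
titled = [52.0, 61.0, 48.0, 70.0]
untitled = [40.0, 35.0, 44.0, 38.0, 41.0]
t_stat, dof = pooled_t(titled, untitled)
```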

Demand Experience Feature Extraction Experiment.
The demand attributes of the rural tourism industry are analyzed by extracting the commonalities in tourism demand and experience across projects. The word-of-mouth documents of each scenic spot are processed one by one, and software is used to extract the high-frequency feature words of each spot's text documents. To ensure that the summary of the travel experience is complete and its description accurate, words are classified as nouns, verbs, or adjectives, and the top ten words of each class are extracted, for a total of 30 words. I check the high-frequency feature words extracted for each scenic spot one by one, remove words that carry no tourism-experience or other tourism-related meaning as well as words naming the scenic spot or the area, and fill the gaps with the next-ranked words. From these 30 words, words are then taken in descending order of frequency until 20 high-frequency feature words have been extracted for each scenic spot. Four scenic spots were removed because they had too few comments and their high-frequency words could not be extracted accurately.
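The extraction procedure can be outlined with a short Python sketch. It assumes the reviews have already been segmented and part-of-speech tagged by whatever software is used; the function name `feature_words`, the tag labels "n", "v", and "a", and the sample tokens are hypothetical.

```python
from collections import Counter


def feature_words(tagged_tokens, excluded, per_class=10, keep=20):
    """Pick high-frequency feature words from (word, pos) pairs.

    Per word class, take the `per_class` most frequent words that are not
    excluded (so removed words are replaced by the next-ranked ones), then
    keep the `keep` most frequent words overall.
    """
    by_class = {"n": Counter(), "v": Counter(), "a": Counter()}
    for word, pos in tagged_tokens:
        if pos in by_class and word not in excluded:
            by_class[pos][word] += 1
    candidates = []
    for counter in by_class.values():
        candidates.extend(counter.most_common(per_class))
    # Descending frequency, ties broken alphabetically.
    candidates.sort(key=lambda pair: (-pair[1], pair[0]))
    selected, seen = [], set()
    for word, _ in candidates:
        if word not in seen:  # a word tagged in two classes is kept once
            seen.add(word)
            selected.append(word)
    return selected[:keep]


# Hypothetical tagged tokens for one scenic spot, for illustration only.
tokens = ([("villagename", "n")] * 5 + [("scenery", "n")] * 3
          + [("farm", "n")] * 2 + [("pick", "v")] * 2 + [("fresh", "a")])
words = feature_words(tokens, excluded={"villagename"}, per_class=1, keep=2)
print(words)  # ['scenery', 'pick']
```

A spot whose result holds fewer than `keep` words would correspond to the cases removed for having too few comments.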
I collate the high-frequency feature words from the online review texts of the scenic spots to extract the common characteristics of tourist demand. There are 99 common characteristic words in the rural tourism experience, as shown in Figure 7. The characteristics of the rural tourism experience in a certain area, and the overall characteristics of tourist demand they reveal, are basically consistent with the characteristics of rural tourism demand summarized in the literature. This also indirectly indicates that the text analysis method used to evaluate the demand attributes of rural tourism is accurate and reliable.

Market Satisfaction and Overall Evaluation. I analyze tourists' overall satisfaction with the scenic spots based on the proportion of positive emotions in the text documents, and use it to further understand the structural gap between industry supply and demand. To weigh the overall satisfaction of tourists with rural tourism products in a certain area, sentiment analysis is first used to calculate the proportions of positive and negative emotions in the online text data of the scenic spots. In most scenic spots, positive emotions account for more than 70% of the online review text, indicating that tourists' overall satisfaction with the rural tourism experience is relatively high. Given that a single online review contains few words, a negative-sentiment proportion of about 30% nevertheless indicates that there is still room for improvement in the product quality and service quality of rural tourism in a certain area. Because the amount of online data differs greatly between scenic spots, the proportions of positive emotions in Internet word-of-mouth are compared across feature words in order to reflect the differences between scenic spots more objectively. As shown in Figure 8, the FP_Apriori algorithm performs better than the Apriori algorithm in the proportion of positive emotions.

Conclusion
This paper proposes an improved Apriori algorithm based on the frequent pattern tree FP-tree. The idea is to combine the traditional Apriori algorithm with the FP-tree, which compresses the transactions efficiently and avoids a large number of repeated traversals and scans of the transaction database. A new scanning strategy is also proposed that scans only specific parts of the database, making the scan more targeted and avoiding many useless operations.
The empirical research concludes that basic tourism consumption such as catering dominates as a motive for rural tourism demand, while motives such as leisure vacation and in-depth experience are insufficient. Tourists describe the content of the experience more than their spiritual perception, reflecting the slow upgrading of demand. The overall satisfaction of rural tourists is relatively high, but they are comparatively dissatisfied with the level of service, the level of reception facilities, and tourism information services. Generally speaking, most rural tourist destinations in a certain area still have considerable problems with visit rate, popularity, and reputation. In addition, the text analysis method extracts rural tourism needs and experience characteristics with high accuracy: the travel motives, consumption characteristics, and organizational characteristics of rural tourism demand in a certain area that it extracts are basically consistent with the demand characteristics summarized in the literature review. The empirical analysis reveals the structural contradictions within and between the two sides of rural tourism, supply and demand. Rural tourism supply urgently needs to innovate in products and services so as to provide tourists with experience products whose quality matches their price and which can lead the trend. Optimizing the rural tourism market structure also requires a benign interaction mechanism between industrial supply and industrial demand, in which supply innovation leads demand upgrading and demand upgrading promotes supply renewal, forming a continuous driving force for the overall progress of the industry.

Data Availability
The data used to support the findings of this study are included within the article.

Conflicts of Interest
The author does not have any possible conflicts of interest.