Quantitative Evaluation of Big Data Development Policy: Text Data Analysis Based on Coword Network and Policy Tools

In the context of the continuous promotion of China’s big data development strategy, this paper quantitatively analyses China’s existing national-level big data policies from the perspective of policy instruments and coword networks, discusses the rationality of existing policies, explores ways to improve policies, and provides a reference for the innovation of China’s big data policies. This paper carries out a quantitative textual analysis of China’s national big data policy from the perspective of policy instruments, using word frequency analysis to obtain a keyword coword matrix and visualization analysis tools to obtain a coword network. This paper further studies the network node characteristics and structure using social network analysis methods, including degree centrality, clustering analysis, and multidimensional scaling analysis, to identify the policy structure and characteristics. Improving big data policy requires improvements in policy instruments on the supply side, resolving existing policy gaps, and strengthening coordination with other policies.


Introduction
With the development of cloud computing, the Internet of Things (IoT), mobile Internet, and other technologies, big data can uncover new knowledge from massive datasets and thus create new value in the information age [1]. Big data has a significant impact on economic development, scientific and technological innovation, national security, lifestyles, and so on [2]. The United States, Japan, France, South Korea, Australia, and other countries have issued a series of policies to promote the development of big data [3]. The Chinese government has also started to formulate relevant policies related to data centres, cloud computing, and other aspects. For example, it has promulgated policies such as the "Decision on Accelerating the Cultivation and Development of Strategic Emerging Industries" and the "Platform for Action to Promote the Development of Big Data." These policies position cloud computing as a key technology and key development direction for building national information infrastructure and realizing integration and innovation [4]. They propose supporting the research, development, and industrialization of mass data storage and processing technologies to promote the development of cloud computing and Internet of Things technologies. The development of big data depends on the promotion of national policies [5]. A perfect big data policy system helps to correct market failure, prevent government failure, compensate for system failure, and rectify ethical anomie. Despite the development of big data in China, the construction of laws and regulations continues to lag. Therefore, it is increasingly important to improve big data policies to ensure that China can seize the opportunity of big data development to upgrade its industrial structure and modernize government governance.
Policy documents are the material carriers of policy, and analysing policy documents helps scholars observe policy content, processes, and tools [6]. These policy documents include planning, notices, bulletins, announcements, resolutions, speeches, and research reports. Quantitative analysis has become an important means of studying policy documents, and existing studies have been carried out based on external attributes, language elements, semantic analysis, network analysis, and other aspects. The textual elements of a policy document include the policy title, issuing institution, policy number, year of publication, and so on. Ritter and Lancaster [7] systematically searched references to IDR and EDR in policy documents to analyse policy impact. Chowdhury and Koya [8] conducted thematic analysis on key United Nations policy documents. Lee et al. [9] analysed Korea's data privacy system in the context of medical big data and performed a comparative analysis of the legal and regulatory environment for managing health and medical big data. El-Taliawi et al. [10] used bibliometric analysis and subject modelling to analyse the popularity of big data in public policies. Governments around the world are trying to accelerate social and technological change and achieve sustainable development by introducing a policy combination of technology promotion and demand pulling tools [11]. Benchimol et al. [12] reviewed various text analysis methods and introduced several text mining applications for analysing texts. Potter et al. [13] investigated policy documents related to ICT/digital communication and conducted thematic analysis and quantitative research on policy documents.
By consulting existing studies, the particularity of policy documents can be considered. For example, the relationship between the upper and lower levels of policy-issuing agencies and the standardization of policy texts are taken into account. Policy instruments are used to achieve policy objectives. Policy instrument analysis is an important analysis method in big data policy research. Most research on China's big data policy is purely qualitative, content-based, or simply quantitative. There is a lack of research on further mining text characteristics using coword analysis under the framework of policy objectives and policy tools. What policies has the Chinese government adopted to develop big data at the national level? What are the basic characteristics of these policies? What policy tools are used in China's national big data policy? How can scientific policy analysis guide the optimization and improvement of big data development policy? From the perspective of policy tools, this paper uses content analysis and coword network analysis to quantify the text of China's national-level big data policy to discuss the rationality of this policy and ways to improve it.
First, this paper focuses on China's national big data development policy and uses quantitative textual analysis to systematically analyse the main content and basic characteristics of the policy. Second, this paper adopts the policy target-policy instrument framework to investigate the limitations of China's national big data policies and propose suggestions and directions for further optimization, which has certain significance as a reference for guiding the Chinese government's policies on big data development. Finally, this paper constructs a policy analysis research framework to provide research ideas for this research field and to aid in further qualitative and quantitative comprehensive research on big data policy.

Literature Review and Theoretical Analysis Framework

Literature Review.
Policy documents are the basis of big data policy analysis and the material carriers of policies, thus providing planning, governance layout, and codes of conduct [14]. Scholars have conducted several studies on the big data policy environment, policy description and analysis, policy frameworks, policy comparative analysis, and references. Zuiderwijk and Janssen [15] constructed a comparative research framework for government data opening policies, including policy environment factors, policy content, performance indicators, and public value, and analysed the similarities and differences in the Dutch government's open data strategies at different levels of government under this framework. Bertot et al. [16] analysed the challenges of the US information policy framework in terms of big data access and dissemination, digital asset management, archiving and preservation, and privacy and security and proposed suggestions for revising the policy framework. In a country comparison study, Nugroho et al. proposed a transnational comparison framework of government data opening policies to provide a country comparison research framework for the development of open data policy [17]. Ma and Wang [18] compared the main characteristics, obstacles, progress, and effects of big data opening strategies in five countries, namely, Australia, Denmark, Spain, Britain, and the United States, and identified the obstacles and driving factors concerning the implementation of data opening policies.
In examining China's big data policy, scholars have focused on the content and learning experience of foreign governments' big data policies, and based on the results, research has gradually deepened to incorporate the specific situation of China's big data policy. The first category of these studies is policy environment analysis [18]. The second category is policy synergy analysis [19]. The third category is econometrics and the analysis of policy texts [20]. The fourth category is policy comparative analysis and policy system construction [21]. Measurement and analysis based on the policy text will help to clarify the evolution mechanism of cloud manufacturing and provide guidance on the implementation and application of cloud manufacturing [22].
In recent years, scholars have made new progress on this issue, especially with respect to research methods. Existing research on big data policy mainly adopts the perspective of policy instruments. The policy target-policy instrument research framework has been widely applied in the textual analysis of policy documents. For example, Huang et al. used the policy target-policy instrument framework to examine China's nuclear energy policies [6]. Big data policy tools are the means to achieve big data policy goals. Policy tool analysis is based on policy structural development. Scholars classify policy tools into information, authoritative, organizational, and fiscal tools [23]. Scholars in China have adopted the classification method of Roy Rothwell and Walter Zegveld and classified policy tools into the supply side, demand side, and environment side [24]. In terms of big data policy analysis methods, more achievements have been made [25]. First, research on big data policy using quantitative textual analysis methods, such as Zhou et al. [26], has adopted content analysis methods to analyse policy instruments and discuss the rationality of big data policy in China to explore the path to a perfect policy. Second, the case study method has been adopted to conduct comparative studies on big data policies. For example, Nugroho et al. adopted the case study method to compare the open data policies of the UK, the US, the Netherlands, Kenya, and Indonesia. Third, text mining technology and social network analysis have been introduced to study big data development policies from multiple perspectives, thereby extending quantitative policy analysis [27]. Coword analysis is an important method for social network analysis [28]. Leng and Han [29] took China's medical big data policy as the research object and used the social network analysis method and complex network analysis software to measure the node attributes of major policymakers.
Zhang and Liu [30] evaluated the big data industrial policy, adopted a quantitative method to examine policy content, and used the PMC index model to conduct a regional quantitative evaluation of China's provincial big data industrial policy. Wang et al. [31] found that among all policy tools, demand-based policy tools have the most significant impact on innovation efficiency. Yang and Huang [32] described the development and evolution of China's AI policy based on the research framework of bibliometrics. Cao [33] built a two-dimensional analysis framework of policy text from the dimensions of policy tools and the value chain. Zhou et al. [34] conducted policy research from three perspectives: policy intensity, policy tools, and policy objectives.
Text mining techniques also contribute to quantitative research on policy documents [35]. Luhn [36] first proposed automatic classification by word frequency statistics. Maron and Kuhns [37] studied automatic classification. Hotho et al. [38] mapped words into concepts for text mining and proposed a variety of concept mapping strategies. Zhang et al. [39] collected big data from policy texts and conducted network analysis and core-periphery analysis based on the keyword dataset. El Haddadi et al. [40] used text mining techniques to analyse and evaluate texts. In recent years, the application of text mining in management has increased exponentially [41]. Topic modelling is the frontier of text mining. Machine learning enables machines to learn rules from a large amount of data input through algorithms to conduct identification and judgement [42]. Weiss and Nemeczek [43] proposed an automatic text analysis method based on unsupervised latent Dirichlet allocation (LDA) topic modelling and dictionary-based emotion analysis. Xing et al. [44] used social network analysis (SNA), machine learning, text clustering, content analysis, and other methods for text mining. Gyódi et al. [45] proposed an innovative text mining method to support policy analysts in identifying, defining, and selecting problems.
Research on coword networks can be divided into two main perspectives. One stream of research adopts the level of a single policy document to build a text graph representation model to facilitate text mining, such as extracting keywords and text structure features. The other stream focuses on the set of policy documents to explore text features and themes of the document set. As China gradually proceeds with the development of big data, relevant policies are being constantly created. It is increasingly difficult to analyse the text in policy documents using traditional methods only. Therefore, to compensate for the deficiencies of traditional analysis methods, this paper applies text mining technology combined with social network methods to perform big data policy analysis. With the gradual development of research on text knowledge mining, many scholars have proposed coword analysis as a method of text knowledge mining and have carried out relevant research. However, further study of the coword networks of big data policy is needed.
In summary, scholars have invested substantial effort in constructing a research framework for big data policy, analysed existing big data policies, and gradually expanded the scope of research methods for studying big data policy from policy content to quantitative research, thus providing a reference for further analysis and improvement. However, a systematic analysis of China's big data policy and the construction of a holistic theoretical research framework are lacking. Existing research has focused on the big data policies of a single country or region, and broader study is warranted. With the promotion of China's national big data strategy, China's national big data policy plays an important role in guiding and exemplifying the big data policies of local governments, and it needs to be further studied by academic researchers. In terms of research methods, previous studies have focused mainly on the description and analysis of big data policies. Most studies of big data policy have focused on qualitative normative discussions, with quantitative research being used to provide only a simple summary of the nature of the policy text, without in-depth analysis of its content and structure. With the development of social network analysis methods, big data policy documents can be effectively mined, and the use of social network analysis methods to evaluate big data policy needs to be expanded.
To address the above problems, this article performs textual quantitative analysis of China's national big data policy from the perspective of policy instruments, employs a textual analysis method and coword network analysis method by using word frequency analysis of keywords, and constructs a big data policy coword network to discuss the rationality of and ways to perfect China's big data policy.

Policy Tool Analysis Framework and Policy Target-Policy Instrument Matching Model.
Based on the research of Fan and Tan [20], this paper introduces technology promotion and demand pull to establish an interpretive model based on the policy target-policy instrument matching model (PTPTMM), as shown in Figure 1. In this model, policy objectives are the starting point of a policy. In the context of policy-driven innovation, the policy objectives include technological innovation and application innovation.
The establishment of policy objectives reflects the degree of policymakers' preference for technological innovation and applied innovation. When the policy goal focuses on technological development, the policy goal is technological innovation [46]. If the focus is the application of technology in products and markets, the policy goal is application innovation. After the goal is set, the policymaker will develop a policy plan and choose the corresponding policy tool. Technology-driven policies adopt mostly supply-side policy tools, while demand-pull policies are dominated by demand-side policy tools.
From the perspective of the PTPTMM, supply-side policy tools act mainly on technological innovation, demand-side policy tools act mainly on application innovation, and environment-side policy tools indirectly affect both technological innovation and application innovation [47]. The intensity of the effect of environment-side policy tools is weaker than that of supply-side and demand-side policy tools, and they are represented by dotted lines in the model, as shown in Figure 1.

Quantitative Analysis Framework for Big Data Policy Documents.
Social network analysis can help provide an understanding of the overall network architecture and reveal the importance of nodes in the network through the study of degree centrality and network density. Clustering analysis is helpful to describe the clustering effect of nodes in the network. Multidimensional scaling analysis is helpful to analyse the attributes of nodes.
Coword analysis usually uses keyword coword analysis to analyse big data policy topics. The appearance of two keywords expressing the policy theme in the same policy document shows that there is a certain internal relationship between them. The more often these keywords appear in the same policy document, the closer the distance and the closer the relationship between them. Then, based on the relationship between keywords, the structural changes to the big data policy theme are analysed. First, the number of times keywords appear together in the same policy in the big data policy set is counted, and a cooccurrence matrix is constructed. Next, a similarity index is used to calculate the correlation between keywords and generate a correlation matrix. Finally, we use multivariate statistical methods to visualize the results of coword analysis, classify cooccurring keywords, and summarize and refine the theme of big data policy to reveal the theme structure, hot topics, and development trends of the big data policy samples.
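The counting step described above can be sketched in a few lines: each policy document is reduced to its set of extracted keywords, and every keyword pair is counted once per document in which both appear. The keyword sets below are illustrative, not the paper's actual data.

```python
from itertools import combinations
from collections import Counter

def cooccurrence_counts(doc_keywords):
    """Count how many policy documents each keyword pair appears in together."""
    counts = Counter()
    for keywords in doc_keywords:
        # sorted() gives each unordered pair a single canonical key
        for pair in combinations(sorted(set(keywords)), 2):
            counts[pair] += 1
    return counts

# Illustrative keyword sets, one per policy document
docs = [{"big data", "security", "innovation"},
        {"big data", "security"},
        {"big data", "innovation"}]
counts = cooccurrence_counts(docs)
# ("big data", "security") cooccur in 2 documents
```

The resulting pair counts are exactly the entries of the cooccurrence matrix; the similarity step then normalizes them.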
Therefore, on the basis of clustering analysis, multidimensional scaling analysis, and social network analysis, this paper constructs a quantitative textual analysis framework for big data policy, as shown in Figure 2. This comprehensive analysis framework overcomes the shortcomings of the individual methods: clustering analysis cannot reflect the status of nodes in the network, multidimensional scaling analysis cannot describe the relationships between nodes, social network analysis cannot describe the clustering effect of nodes in the network, and none of these methods alone can analyse the attribute characteristics of nodes in depth.

Quantitative Analysis of Policy Documents.
This paper examines the degree of matching between policy instruments and policy targets through coding and frequency statistics of policy instruments and policy targets for the big data development policy documents of China's central government and conducts coword analysis and network analysis of sample policy documents.
Big data policy documents reflect the thinking and behaviour of government departments in developing big data. Scholars have long considered how to scientifically analyse big data policy documents. Quantitative research methods for policy documents include content analysis, bibliometrics, statistics, and methods from other disciplines. Quantitative analysis is carried out considering the content and external structural elements of policy documents to reveal public policy issues such as changes in policy theme, the choice and combination of policy tools, and the main cooperation network of the policy process. Specific analysis techniques of policy literature quantification include content analysis, coword analysis, and network analysis.
This new policy research method and paradigm can be used for descriptive analysis of many policies. Academics could also use this method to carry out research in different fields.
In this paper, BIBEXCEL and UCINET are used for quantitative analysis [48]. BIBEXCEL is used to analyse the frequency of keywords and for coword analysis. Coword analysis originated in the mid-to-late 1970s and is a type of content analysis. The coword analysis method counts the number of times two words appear in the same text at the same time, uses this cooccurrence count to reflect the degree of correlation between the words, and then analyses the topic structure with the clustering method. The higher the coword intensity (the number of times two words appear in the same text at the same time), the closer the relationship between the two words. The cooccurrence counts between keywords form a coword matrix. Then, we import the word frequency cooccurrence matrix into UCINET for drawing. It visually presents the results of cooccurrence analysis and divides the cooccurring keywords into groups. Using this method, we can summarize, refine, and abstract the big data policy topics to reveal the theme structure, hot topics, and development trends of the big data policy samples.
Specifically, we repeatedly read the research samples and the corresponding policy interpretation documents and extract high-frequency keywords related to "big data" from the policy text. The sentences in the policy text are processed according to the policy theme-policy purpose-policy tool framework introduced in Section 2. The policies of the research sample are open coded and axially coded. Then, the keyword frequencies are determined, the core keywords are selected from each policy text, and after manual processing and judgement, keywords with broad and unclear meanings are deleted to obtain the core keywords of the overall policy text. A coword matrix is generated according to the frequency of cooccurrence. According to the concept of the Ochiai coefficient, the correlation coefficient between any two keywords can be calculated to measure the similarity between two keywords. Degree centrality is used to measure the importance of nodes in the network.
Finally, cluster analysis is used to set the partition distance and obtain the keyword aggregation sets. Multidimensional scaling analysis uses the idea of dimension reduction, combined with subjective re-examination and classification of subject categories. The research samples are classified and analysed to determine the centres and themes.
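The partition-distance step can be sketched as single-linkage grouping under a distance cutoff: keywords whose pairwise distance (e.g. 1 − similarity) falls below the cutoff merge into one aggregation set. The paper performs this step in analysis software; the pure-Python union-find version below, with illustrative items and distances, is only a minimal sketch of the idea.

```python
def threshold_clusters(items, distance, cutoff):
    """Group items whose pairwise distance is at most the cutoff
    (single linkage, implemented with union-find)."""
    parent = {item: item for item in items}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    # merge every pair closer than the partition distance
    for i, a in enumerate(items):
        for b in items[i + 1:]:
            if distance[(a, b)] <= cutoff:
                parent[find(a)] = find(b)

    clusters = {}
    for item in items:
        clusters.setdefault(find(item), set()).add(item)
    return list(clusters.values())

# Illustrative distances (e.g. 1 - Ochiai similarity), not the paper's data
items = ["security", "network", "industry", "market"]
distance = {("security", "network"): 0.2, ("security", "industry"): 0.8,
            ("security", "market"): 0.9, ("network", "industry"): 0.7,
            ("network", "market"): 0.85, ("industry", "market"): 0.1}
clusters = threshold_clusters(items, distance, cutoff=0.5)
# two theme clusters: {"security", "network"} and {"industry", "market"}
```

Raising the cutoff merges clusters into broader themes; lowering it splits them, which is the partition-distance choice described above.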

Data Sources.
The big data policy texts selected in this paper come from databases such as the website of the People's Government of China, the websites of provincial people's governments of China, the websites of the State Council and ministries of China, and the Chinese government's open information integration service platform. Searching for the keyword "big data" on these website platforms, this paper obtains the original policy texts and selects 20 documents with big data as the policy target after screening, as shown in Table 1. Specific policy targets correspond to links among government data governance, big data industry development, and scientific and technological innovation resources. Soares holds that big data governance is part of the broader information governance plan because of the adjustment of multiple functional objectives and formulation of policies for data optimization, privacy protection, and data realization related to big data [49]. Given the above definition, government data governance refers to government departments as the main body formulating data optimization, privacy protection, and data realization policies related to government big data by adjusting a variety of functional objectives [50,51]. The sharing of scientific and technological innovation data resources refers mainly to the coconstruction, opening, and sharing of scientific and technological innovation data resources through policy regulation and effective management systems and operation mechanisms to promote the effective integration of scientific and technological innovation data resources, improve comprehensive data utilization, and continuously reduce the cost of scientific and technological innovation [52].
The big data industry chain refers to "all activities related to the generation and aggregation of big data (data source), organization and management (storage), analysis and discovery (technology), transaction, application, and derivative industry" [53].
The US National Institute of Standards and Technology subdivides the industrial chain of big data according to the process of realizing value through data. Fan and Tan constructed a framework for the industrial chain of big data [20]. Based on the above studies, the present paper constructs a framework for the development of big data and uses it as the basis of classification of policy target coding. According to the literature [50], the industrial chain of big data includes six parts: IT infrastructure, data management, data analysis, data transaction, data application, and data security. This classification framework reflects the development of big data from data to data derivatives and the evolution of big data technology innovation towards big data application innovation.

Policy Instruments.
Rothwell and Zegveld's classification criteria for policy tools on the supply side, environmental side, and demand side are adopted [24]. Coding rules are formulated on the basis of Fan and Tan and Cui et al. [20,46], as shown in Table 2.

Big Data Policy Content Analysis Unit Coding.
In the government's national-level big data policy documents, content under "Important Tasks" describes the corresponding policy targets. Content under "Policy Guarantee" or "Guarantee Measures" describes the instruments to be adopted by the government.
This paper randomly extracts parts of the "Platform for Action to Promote Big Data Development" for precoding. The precoding results show that the coding rules are operable. Then, considering that statements separated by periods in the policy documents have independent meanings, this paper takes the statement as the coding unit and carries out preliminary coding [20,26]. Finally, we eliminate interference from unrelated sentences by comparing the coding content with the original policy text and reorganizing the coding content. The coding form is Policy Number-Chapter-Code Number, with a total of 60 items, as shown in Table 3.
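The coding-unit step, splitting a policy chapter into period-separated statements and labelling each one in the Policy Number-Chapter-Code Number form, can be sketched as follows. The concrete label format "P1-C2-01" and the chapter text are hypothetical illustrations, not the paper's actual coding output.

```python
def code_statements(policy_no, chapter, text):
    """Split a policy chapter into period-separated statements and label each
    coding unit in the Policy Number-Chapter-Code Number form (hypothetical
    rendering: P<policy>-C<chapter>-<running number>)."""
    statements = [s.strip() for s in text.split(".") if s.strip()]
    return [
        (f"P{policy_no}-C{chapter}-{i:02d}", s)
        for i, s in enumerate(statements, start=1)
    ]

# Hypothetical chapter text, not from an actual policy document
units = code_statements(1, 2, "Build data centres. Promote data sharing.")
# -> [("P1-C2-01", "Build data centres"), ("P1-C2-02", "Promote data sharing")]
```

In practice each labelled statement would then be compared against the original policy text, with unrelated statements removed, as described above.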

Statistics and Analysis of Policy Targets.
According to the frequency statistics of China's national big data policy targets, IT infrastructure is the main policy target of China's national big data development policy, accounting for 50% of all coded text expressions, followed by data applications, which account for 16%, and data management, which accounts for 14%. By comparison, data analysis, data transaction, data security, and other policy objectives together account for the remaining 20%, as shown in Figure 3. Thus, IT infrastructure, data application, and data management are the main policy targets of China's national big data development policy.

Statistics and Analysis of Policy Tools.
The statistics for the frequency of policy instruments show that the proportion of supply-side policy instruments is 14%, that of environment-side policy tools is 44%, and that of demand-side policy tools is 42%, as shown in Figure 4. Thus, environment-side policy tools are the main means of developing big data at the national level in China, followed by demand-side policy tools and supply-side policy tools. The ratio of demand-side to supply-side policy instruments is 3. Among secondary indicators, the top six are technology application (demand-side tools), technology support (supply-side tools), technology standards (environment-side tools), regulation (environment-side tools), public services (supply-side tools), and major projects (demand-side tools), as shown by the frequency statistics in Figure 5. The distribution of secondary indicators of supply-side policy instruments used by the government is relatively even and focuses mainly on two aspects: public service and human support. The distribution of environment-side policy instruments is quite different: target planning accounts for 55%, and there are no policy tools related to "intellectual property." Thus, although the government uses more supply-side and environment-side policy tools, according to the distribution of secondary indicators, the six types of supply-side tools are relatively evenly distributed, while environment-side policy tools focus mainly on regulation. Notably, demand-side policy instruments are concentrated in the two categories of "technology application" and "major projects," and there are few price subsidies, trade controls, and external contracts.
In summary, first, quantitative research on the objectives and tools of big data development policy shows that China's national big data policy focuses mainly on strengthening infrastructure construction and promoting data application, and the status of data application is increasingly prominent. In terms of policy tools, the Chinese government's national big data policy focuses mainly on demand-side and environment-side policy tools, especially the latter, with limited attention given to supply-side policy tools. Second, the results of the analysis of the matching degree model of policy objectives and policy tools show that the goal of China's state-level big data development policy is to promote the applied innovation of big data at the product and market levels through environment-side and demand-side policy tools. Therefore, at present, China's national-level big data policy tools and policy objectives are well matched. China should also pay attention to the use of demand-side policy tools.

Keyword Extraction from Big Data Policy Documents.
Repeated intensive reading of the 20 policy documents and the corresponding policy interpretation documents in the research sample is performed, and high-frequency keywords related to "big data" are extracted from the policy texts. First, according to the policy theme-policy target-policy instrument framework established in the second section of the theoretical framework of this paper, the sentences in the policy texts are processed, and the policies in the research sample are coded in an open and axial manner. Open coding is the process of assigning concepts and classifying the original data into categories. Then, the frequency of each keyword is determined, and the core keywords are selected from each policy text. After manual processing and judgement, keywords with a broad or unclear meaning are deleted, and the core keywords of the overall policy text are obtained, as shown in Table 4. For example, the frequency of "big data" in policy 12 is 201. In this paper, absolute word frequency notation is adopted: the value of the corresponding component in the text vector is the number of times the keyword appears in the policy text. First, the high-frequency keywords used in the coword analysis are selected, and then the low-frequency terms are removed. The threshold value for eliminating low-frequency words is determined by the high-frequency and low-frequency word boundary formula put forward by Donohue in 1973 according to Zipf's second law, as shown in the following equation [54]:

T = (−1 + √(1 + 8I₁)) / 2,  (1)

where T denotes the threshold value and I₁ denotes the number of keywords that appear only once. Table 5 lists the threshold values for excluding low-frequency words in each big data policy text.

Construction of the Big Data Policy Coword Matrix and Similarity Matrix.
Based on the high-frequency keywords in Tables 4 and 5, the coword matrix is collated and generated according to the frequency of occurrence in each policy document. According to the concept of the Ochiai coefficient, the correlation coefficient between any two keywords can be calculated to measure the similarity between them. The corresponding coefficients in the coword network are calculated according to the following equation [27]:

O(i, j) = C(i, j) / √(C(i) × C(j)),  (2)

where C(i, j) is the number of documents in which keywords i and j co-occur and C(i) and C(j) are the occurrence frequencies of keywords i and j. The similarity matrix is shown in Table 6.
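A minimal sketch of this pipeline, assuming the per-document keyword sets have already been extracted (the `docs` sample below is hypothetical toy data, not the paper's actual policy texts):

```python
import math
from itertools import combinations

def donohue_threshold(i1):
    """Donohue (1973) high/low-frequency boundary from Zipf's second law:
    T = (-1 + sqrt(1 + 8*I1)) / 2, where I1 is the number of keywords
    that appear only once."""
    return (-1 + math.sqrt(1 + 8 * i1)) / 2

def ochiai_matrix(docs):
    """Build keyword frequencies, a coword (co-occurrence) matrix, and the
    Ochiai similarity matrix from a list of keyword sets, one per policy
    document."""
    freq = {}  # number of documents in which each keyword appears
    co = {}    # number of documents in which each keyword pair co-occurs
    for kws in docs:
        for k in kws:
            freq[k] = freq.get(k, 0) + 1
        for a, b in combinations(sorted(kws), 2):
            co[(a, b)] = co.get((a, b), 0) + 1
    # Ochiai coefficient: O(i, j) = C(i, j) / sqrt(C(i) * C(j))
    sim = {pair: c / math.sqrt(freq[pair[0]] * freq[pair[1]])
           for pair, c in co.items()}
    return freq, co, sim

# Hypothetical keyword sets extracted from three policy texts.
docs = [{"big data", "security", "innovation"},
        {"big data", "industry", "innovation"},
        {"big data", "security"}]
freq, co, sim = ochiai_matrix(docs)
```

The similarity values in `sim` correspond to the entries of the similarity matrix (Table 6 in the paper), computed from the coword counts in `co`.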

Degree Centrality Analysis of Big Data Policy High-Frequency Keywords.
High-frequency keywords in policy texts reflect the value orientation of policies to a large extent [55]. The frequency of a keyword refers to the number of policy texts in which the keyword appears, which reflects the popularity of the keyword. The degree centrality of a keyword refers to the number of other nodes to which the corresponding node is connected in the keyword cooccurrence network, which reflects the degree of correlation between the keyword and other keywords. Degree centrality is used to measure the importance of nodes in the network: the higher the degree centrality of a node, the more important the node is in the network [27]. The absolute degree centrality of a node in an undirected network is the number of its connections, which is greatly affected by network size. Therefore, to eliminate the influence of network size on degree centrality, the relative degree centrality of a node can be obtained by dividing its number of links by the maximum possible number of connections (n − 1, where n is the number of nodes). The degree centrality values of keywords in the big data policy texts are shown in Table 7.
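The relative degree centrality described above can be sketched as follows (the adjacency structure below is a hypothetical toy network, not the paper's actual keyword network):

```python
def relative_degree_centrality(adj):
    """Relative degree centrality in an undirected, unweighted network:
    each node's degree divided by the maximum possible degree (n - 1),
    which removes the effect of network size."""
    n = len(adj)
    return {node: len(neighbors) / (n - 1) for node, neighbors in adj.items()}

# Hypothetical co-occurrence links among five policy keywords.
adj = {
    "security":   {"innovation", "industry", "data", "network"},
    "innovation": {"security", "industry", "data"},
    "industry":   {"security", "innovation"},
    "data":       {"security", "innovation"},
    "network":    {"security"},
}
centrality = relative_degree_centrality(adj)
# "security" is connected to all 4 other nodes, so its relative centrality is 1.0
```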
As shown in Table 7, "security," "innovation," "industry," and other keywords are considerably more important in the network than other keywords. "Strategy," "data," and other keywords have approximately the same degree of importance in the network. A visualization of the keyword network and its degree centrality is provided in Figure 6. The value orientation of China's big data policy is as follows. (1) Security is an important premise. The keyword "security" ranks first in frequency, appearing in 10 policy texts, and its degree centrality is also the highest. This conclusion is consistent with Zhang et al. [55]. On February 27, 2014, China's government established the central leading group on network security and informatization. The network security law promulgated on November 7, 2016, emphasizes that China's informatization work and big data development must be based on security guarantees. (2) Data fusion is a means of development, as reflected by keywords such as "public," "resources," "data," "Internet," and "network," whose corresponding nodes are also highly central. Data technology can realize data fusion and opening through information system integration, intensive construction, and overall management across regions and departments. (3) Industry convergence is the development path, which is reflected by keywords such as "industry" and "market." (4) As a development idea, technological innovation manifests in keywords such as "technology" and "innovation." Technological innovation helps promote the development of the big data industry and government data governance.

Clustering Analysis of Big Data Policy High-Frequency Keywords.
Clustering analysis classifies research objects into groups according to the similarity of their characteristics [56]. According to their own data state, research objects automatically aggregate and form a class without any prior artificial setting. The research objects in each cluster have similar characteristics, and there are certain differences between clusters.
Clustering analysis is conducted on the matrix of the Chinese big data policy keyword coword network. Based on the research content of this paper and referring to existing research [27], 5 categories of keywords are obtained. According to the keyword connotations, the five categories are system construction, industrial integration, innovation environment, technological development, and external environment.
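The paper applies a clustering procedure from existing research [27]; as an illustrative stand-in, a simple single-linkage agglomerative clustering over keyword similarities can be sketched as follows (keywords and similarity values are hypothetical):

```python
def single_linkage_clusters(keywords, sim, n_clusters):
    """Greedy single-linkage agglomerative clustering on a keyword
    similarity dictionary (e.g., Ochiai coefficients). Repeatedly merges
    the two clusters containing the most similar keyword pair until
    n_clusters clusters remain."""
    clusters = [{k} for k in keywords]
    # Visit keyword pairs from most to least similar.
    pairs = sorted(sim.items(), key=lambda kv: kv[1], reverse=True)
    for (a, b), _ in pairs:
        if len(clusters) <= n_clusters:
            break
        ca = next(c for c in clusters if a in c)
        cb = next(c for c in clusters if b in c)
        if ca is not cb:  # merge only if a and b are in different clusters
            clusters.remove(ca)
            clusters.remove(cb)
            clusters.append(ca | cb)
    return clusters

# Hypothetical similarities: "security"/"innovation" and "industry"/"market"
# form two tight groups.
clusters = single_linkage_clusters(
    ["security", "innovation", "industry", "market"],
    {("security", "innovation"): 0.9,
     ("industry", "market"): 0.8,
     ("innovation", "industry"): 0.3},
    n_clusters=2)
```

With these toy values, the two resulting clusters separate the security/innovation keywords from the industry/market keywords, mirroring how keyword categories emerge from the coword matrix.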

Multidimensional Scaling Analysis of Big Data Policy.
Multidimensional scaling analysis uses the idea of dimensionality reduction to classify and analyse research samples. It is a statistical method that reveals the internal relationships between samples while preserving the original relationships between them [55,57]. Big data policy keywords are presented as clusters in a two-dimensional map. According to the connotation and extension of the keywords in China's national big data policy, starting from the first quadrant, the four sets of keywords are summarized as follows: external support and overall layout, industrial development and application, infrastructure construction, and data management. Overall, the keywords of China's national big data policy are evenly distributed. Based on the above analysis, the big data policy coword network is summarized as "one centre, five themes." A core area is formed around big data with keywords such as data security, application, and innovation. The five themes are innovation environment, infrastructure construction, big data application, institutional guarantee, and industrial integration.
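One standard way to produce such a two-dimensional keyword map is classical (Torgerson) multidimensional scaling; the sketch below assumes a dissimilarity matrix such as 1 − Ochiai similarity (the `dist` values are hypothetical toy data):

```python
import numpy as np

def classical_mds(dist, dims=2):
    """Classical (Torgerson) multidimensional scaling: embed items in a
    low-dimensional space so that pairwise distances are approximately
    preserved. `dist` is a symmetric dissimilarity matrix."""
    d2 = np.asarray(dist, dtype=float) ** 2
    n = d2.shape[0]
    j = np.eye(n) - np.ones((n, n)) / n   # centering matrix
    b = -0.5 * j @ d2 @ j                 # double-centered Gram matrix
    vals, vecs = np.linalg.eigh(b)
    order = np.argsort(vals)[::-1][:dims] # keep the largest eigenvalues
    coords = vecs[:, order] * np.sqrt(np.maximum(vals[order], 0))
    return coords

# Hypothetical dissimilarities among four policy keywords:
# two tight pairs (rows 0-1 and rows 2-3) that are far from each other.
dist = np.array([[0.0, 0.1, 0.9, 0.9],
                 [0.1, 0.0, 0.9, 0.9],
                 [0.9, 0.9, 0.0, 0.1],
                 [0.9, 0.9, 0.1, 0.0]])
coords = classical_mds(dist, dims=2)
```

In the resulting two-dimensional coordinates, similar keywords land close together, which is what allows the quadrant-by-quadrant reading of the keyword map described above.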

Conclusions
At present, the Chinese government's national big data development policy aims to promote the application of big data in industrial transformation and upgrading, strengthen the integration of big data, and improve national governance by continuously opening the data resources owned by the government. In short, from the perspective of policy objectives, the government aims to promote continuous innovation in the application of big data based on the openness and management of data sources. In terms of policy tools, central policymakers rely more on environment-side and demand-side policy tools; there are significantly fewer supply-side policy tools than the other two types. Therefore, there is a basic match between policy tools and policy objectives. This paper draws three main conclusions. First, quantitative analysis of the goals and tools of big data development policy shows that China's national big data development policy adopts strengthening infrastructure construction and data application innovation as its policy objectives, with data application playing an increasingly prominent role. Second, in terms of policy tools, the Chinese government's national big data policy relies mainly on demand-side and environment-side policy tools as the main means to achieve its goals. Environment-side policy tools are particularly prominent, while supply-side policy tools are few. Third, analysis of the matching degree model of policy objectives and policy tools shows that the target-tool matching of China's state-level big data development policies is basically consistent. Therefore, at present, China's national big data policy tool selection and policy objectives are well matched, but China should also pay attention to the use of demand-side policy tools.
Based on the above conclusions, this paper proposes suggestions to improve China's big data policy. First, existing national big data policy gaps must be addressed. At present, China's national big data policy focuses mainly on infrastructure construction; although there are policies for specific industries, policy gaps remain. Therefore, the formulation of future big data policy should start by filling the gaps in issued policies. Second, the frequency of supply-side policy tools should be increased. The purpose of supply-side policy tools is to improve the supply factors of big data development, such as increasing financial support and human resource support, to promote the development of big data technology. This financial support targets basic research and generic technology research and development in scientific and technological innovation. Supply-side policy tools account for only 14% of the total, leaving considerable room for improvement. Therefore, future big data policymaking should strengthen supply-side policy.
Third, the coordinated development of big data industry development policy, scientific data management policy, and government data governance policy should be strengthened. Big data is closely related to cloud computing, the Internet of Things, mobile Internet, and artificial intelligence. The development of big data is also inseparable from data disclosure, data sharing, and data governance by the government and enterprises. Therefore, big data policies should take big data as the core and coordinate big data industrial development policy, scientific data management policy, and government data governance policy to avoid overlooking or overlapping policy elements and thus wasting government resources.

The research contributions of this paper are threefold. First, this paper focuses on China's national big data development policy and uses a quantitative research method to systematically analyse the main content and basic characteristics of the policy. Second, this paper uses the target-tool matching degree of the policy as the analysis framework to investigate the limitations of China's national big data development policies and provides suggestions and directions for further optimization, which has certain significance for guiding the development of China's national big data policy. Finally, this paper constructs the policy tool-coword network research framework, which provides new research ideas for quantitative comprehensive research on big data policy.

The limitations of this paper are as follows. First, the process of identifying policy objectives and policy tools requires a large amount of work, and coding and classification take a long time. Second, given the workload of text analysis, the process of policy formulation and implementation in China, and the early stage of the development of big data policy, this paper focuses only on national big data policy.
Therefore, as local governments gradually improve the big data policy system under the guidance of the central government, further research can develop a classification and matching method based on machine learning algorithms to automatically identify the policy objectives and policy tools of big data policies. In addition, empirical investigations of the impact of policy objectives and policy tools on the development of the big data industry can open up a new research field.

Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest
The author declares that there are no conflicts of interest regarding the publication of this paper.