Integration and Optimization of Ancient Literature Information Resources Based on Big Data Technology

Big data refers to a collection of data that cannot be captured, managed, and processed with conventional software tools within a certain time frame. It is a massive, fast-growing, and diversified information asset that requires new processing models to deliver stronger decision-making power, insight and discovery, and process optimization capabilities. This article studies the integration and optimization of ancient literature information resources based on big data technology, that is, it uses big data technology to integrate and optimize these resources so that the literature becomes more systematic and complete and readers can find and browse it more conveniently. This paper focuses on literary works and the related collation, annotation, and textual research results and divides the scope of each subtopic according to genre. The biggest difference between the information platform built in this paper and existing ancient books databases is that it offers semantic analysis, subject retrieval, data generation, and similar functions. After text learning, the computer can automatically classify related vocabulary. Based on the effective integration of big data and cultural resources, the experimental results show that, through technical optimization and resource integration, more than 12,000 copies of ancient literature have been reincorporated so far and more than 10,000 publications have been restored. Therefore, big data technology is essential for the integration and optimization of cultural resources.


Introduction
With the development of informatization, a large number of traditional paper literature resources, especially precious classical literature, have been digitized and stored in library databases. Because libraries hold such ancient literature information resources in large quantities yet cannot use them effectively, a data optimization technology is urgently needed to promote their effective use.
Big data technology was first applied in the Internet industry and gradually expanded to finance, industry, telecommunications, and other fields, creating huge industrial space and social value, before entering the field of government affairs. Integrating and optimizing ancient literature information resources can improve their effective use and promote the informatization of library resources. It is of great significance for promoting academic research in classical philology and the development of universities and cities.
Big data technology has broad application prospects in the integration and optimization of ancient literature information resources. Xiong pointed out that the rise of big data has brought opportunities and challenges for financial innovation. He reviewed the different fields of financial innovation based on big data, as well as the scientific discoveries and theoretical breakthroughs in risk analysis related to these innovations. Based on the current state of research, he raised several key issues and discussed corresponding solutions. The three issues are the pricing and risk measurement of data-driven financial innovation products or services; the changes that data-driven financial innovation will bring to the financial industry, including operations, resource allocation, and ecosystems; and systemic risk management problems and solutions based on big data analysis [1]. De Mauro analyzed a large number of industrial and academic articles related to big data to find commonalities among the topics they address. He also surveyed the existing definitions and proposed a new, more reliable definition of the term that covers most of the work in this field [2]. This paper mainly uses big data technology to integrate and optimize ancient literature information resources. According to data format, literature content can be divided into image, text, and combined image and text. According to degree of completeness, literature content can be divided into partial text and full text. Only by constantly refining the data resources of ancient books and presenting them to readers in various forms can we meet the needs of readers in the new environment, effectively resolve the contradiction between the protection and the utilization of ancient books, and let ancient documents leave the library, multiply into countless copies, and serve more readers.

Research Method of Integration and Optimization of Ancient Literature Information Resources Based on Big Data Technology
The integration process of ancient literature information resources is shown in Figure 1. To better integrate digital ancient literature information resources, it is necessary not only to build a management standard system and a technical system but also to build a regulatory system that supervises the integration, so that every task is carried out according to established rules. In short, a variety of measures are needed to implement standardized management and build a regulatory system that ensures the smooth integration of ancient literature information resources [3].

Big Data Information.
The integration of information has always played an important role in biological evolution. Indeed, humans and animals process external information received from different sense organs, and understanding the environment and reacting to it is a process that combines information from multiple sources. The problem of multisource data merging also applies to engineering and real life [4]. When planning the construction of literature resources, many colleges and universities determine the collection scope and focus according to their own professional settings and determine the number of copies according to readership and available funds. As soon as a major is updated, the collection scope and focus change. Once funding is not guaranteed, the construction of literature resources becomes a kite with a broken string, so the construction of literature resources is uncertain. The more consideration is given to local interests, the less mutual benefit there is. For many years, libraries have discussed interlibrary loan, co-construction, and sharing in theory, but in practice collections are still self-built and self-used, forming literature systems that differ from one another, that are incomplete whether small or large, and that are separated from each other. Because specialty settings in colleges and universities are nonstandard and changeable, the literature construction system lacks systematicness. In recent years, the transformation, upgrading, and merger of colleges and universities have affected the collection systems of the institutions concerned. An established collection system changes with the renewal and transformation of disciplines and majors, which leaves the vast majority of university libraries without distinctive characteristics. At the same time, the sharing service for traditional paper documents should be strengthened and deepened. Interlibrary loan and document delivery are effective means of sharing paper document information resources [5,6].

Ancient Cultural Resources.
In terms of aggregation services for ancient literature information resources, both domestic and foreign aggregation services are relatively advanced, but most of them focus on patent analysis. The patent information source in traditional search services is relatively limited. However, as user demand increases, the aggregation model in patent information services is also constantly evolving. The best known is the Derwent Innovations Index database, in which every collected document is processed by professionals and re-edited into English for indexing according to the common vocabulary and habits of technicians, thereby exposing the important technologies. The subject indexing helps searchers retrieve the required information by subject terms, thereby improving the accuracy of patent aggregation from the subject angle. The patent search system of the State Intellectual Property Office of China also provides a variety of patent information services, which interlink patents, patent citations, patent legal status, and so on and feed them back to users [7]. The types and characteristics of ancient literature information resources are as follows.
(1) Primary ancient literature information resources refer to monographs, academic papers, patent specifications, and scientific and technological reports that have been published and put into use in society on the basis of the author's own research work or research results. They contain new ideas, inventions, technologies, and achievements and are characterized by creativity. They can be used directly for reference and are the main object of retrieval and utilization.
(2) Secondary ancient literature information resources refer to the products of collating and processing primary resources. That is to say, a large number of scattered and disordered primary ancient literature information resources are collected by special institutions and personnel and sorted and processed according to certain methods, so as to systematically form various catalogues, indexes (titles), and abstracts, which are characterized by collection and retrieval. Their importance lies in providing clues to ancient literature information resources, and they are the key that opens the knowledge treasure house of ancient literature information resources.
(3) Tertiary ancient literature information resources refer to regenerated information resources produced by synthesizing, analyzing, refining, and reorganizing relevant knowledge on the basis of a large number of primary and secondary ancient literature information resources according to a certain purpose and demand. Examples are textbooks, reference books, and reviews, which are characterized by strong comprehensiveness, strong pertinence, good systematicness, and high utilization value.

Information Resource Integration.
Ancient literature information resources are multisource in nature. The integration and optimization of information from multiple sources relies on data fusion technology. Traditional estimation theories and filtering algorithms have become an indispensable theoretical basis for data fusion. Over the years, classical Bayesian estimation and Kalman filtering techniques have been widely used in multisource data fusion systems. The data sources for fusion can vary, and fusion is not a simple superposition of several kinds of data. It can often yield new data that none of the original single sources could provide. Therefore, data fusion methods have wide practical significance and offer a way to solve the problem of unstable data integration. Probability theory, evidence theory, and neural networks are the main theoretical methods commonly used in this field in recent years [8,9]. Data selection is the first step of multisource data fusion. It is necessary to ensure that the selection is correct and that appropriate data objects are chosen for fusion. If the data objects are selected incorrectly, the later fusion of the multisource data will be directly affected.
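As a minimal illustration of the Bayesian estimation idea mentioned above, the following sketch fuses several independent noisy estimates of the same scalar quantity by inverse-variance weighting. Under Gaussian noise and a flat prior this is the Bayesian posterior mean, and it coincides with a scalar Kalman measurement update. The function name and the example values are hypothetical and are not taken from the paper.

```python
from typing import Iterable, Tuple


def fuse_gaussian_estimates(
    estimates: Iterable[Tuple[float, float]],
) -> Tuple[float, float]:
    """Fuse independent (mean, variance) estimates of one quantity.

    Assumes Gaussian noise and a flat prior, so the fused precision is the
    sum of the source precisions and the fused mean is the precision-weighted
    average of the source means.
    """
    precision_sum = 0.0
    weighted_mean_sum = 0.0
    for mean, variance in estimates:
        if variance <= 0:
            raise ValueError("variance must be positive")
        precision = 1.0 / variance
        precision_sum += precision
        weighted_mean_sum += precision * mean
    if precision_sum == 0.0:
        raise ValueError("at least one estimate is required")
    return weighted_mean_sum / precision_sum, 1.0 / precision_sum


# Hypothetical example: two sources give different estimates of the same
# value, the second one being more reliable (smaller variance).
fused_mean, fused_var = fuse_gaussian_estimates([(520.0, 100.0), (540.0, 25.0)])
```

The fused variance is always smaller than the smallest input variance, which reflects the statement above that fusion can provide information that no single source provides on its own.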
When selecting data, the type of data to be selected should first be determined according to the purpose. There are two types of data to be selected: remote sensing data and non-remote-sensing data. The space remote sensing data and aerial remote sensing data obtained from domestic geological and mineral work can be used as selection objects, providing a reference for regional geological and mineral surveys. The fusion step itself is the most important part of multisource data fusion. During the operation, image processing can be used to fuse the preprocessed data, so as to enhance the clarity of satellite images and improve their utilization value. At present, there are three kinds of data fusion technology, namely, pixel-level fusion, feature-level fusion, and decision-level fusion. If multisource data fusion technology is applied in geological and mineral surveys, the actual fusion method must be selected reasonably according to the specific situation, so as to avoid reducing the fusion effect, or even causing the fusion to fail, because of a wrong selection. New data fusion methods based on cognitive learning and artificial intelligence have also emerged, and optimization technology and intelligent simulation have been combined with traditional multisource combination methods [10].
Calculate TF. TF refers to term frequency, that is, the frequency of each word in a document of the corpus after the document has been processed by word segmentation [13,14]:
$$\mathrm{TF}_{i,j} = \frac{n_{i,j}}{\sum_{k} n_{k,j}}$$
where $n_{i,j}$ is the number of occurrences of word $t_i$ in document $d_j$ and the denominator is the total number of words in $d_j$.
Calculate IDF. The inverse document frequency measures how rare a word is across the corpus. If no document contains the word, the denominator will be zero, so 1 is added to the denominator to avoid this special case [11,12]:
$$\mathrm{IDF}_{i} = \log \frac{|D|}{1 + |\{ j : t_i \in d_j \}|}$$
where $|D|$ is the number of documents in the corpus.
Calculate TF-IDF as follows:
$$\mathrm{TF\text{-}IDF}_{i,j} = \mathrm{TF}_{i,j} \times \mathrm{IDF}_{i}$$
The limitation of the Bayesian method is that it requires the relevant conditional probabilities to be known. However, conditional probabilities are often unavailable, or the probability data are incomplete. PAS can obtain the desired probability by counting related summaries. Here, the confusion degree k in PAS is defined in terms of the current knowledge base and the measurable characteristics used for measurement.
Taking a subset of the total sample without replacement reduces the computational complexity. In the data, the bootstrap method is replaced with Monte Carlo samples with a sample size of n.
In the SLIQ algorithm, the Gini index is the measure used to evaluate splits. If dataset X contains n records belonging to m classes, its Gini index is defined as follows [15]:
$$\mathrm{Gini}(X) = 1 - \sum_{i=1}^{m} p_i^{2}$$
where $p_i$ is the frequency of occurrence of records of the i-th class in X. If the dataset X is divided into two parts $X_1$ and $X_2$ containing $n_1$ and $n_2$ records, then the following is obtained:
$$\mathrm{Gini}_{\mathrm{split}}(X) = \frac{n_1}{n}\,\mathrm{Gini}(X_1) + \frac{n_2}{n}\,\mathrm{Gini}(X_2)$$
In the iterative clustering process, if the centroids no longer change, the iteration terminates and the centroids are output. The class of each sample is then calculated according to the task [16].
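The term-frequency, inverse-document-frequency (with 1 added to the denominator), and TF-IDF calculations described above, together with the Gini index of a dataset and of a two-way split, can be sketched as follows. This is a minimal sketch using the standard textbook forms of these quantities; the function names, the natural logarithm, and the toy corpus are assumptions made for illustration and are not the authors' implementation.

```python
import math
from collections import Counter
from typing import Dict, List, Sequence


def term_frequency(tokens: Sequence[str]) -> Dict[str, float]:
    """Relative frequency of each word in one segmented document."""
    counts = Counter(tokens)
    total = len(tokens)
    return {term: count / total for term, count in counts.items()}


def inverse_document_frequency(term: str, corpus: Sequence[Sequence[str]]) -> float:
    """log(N / (1 + df)); the +1 avoids a zero denominator for unseen words."""
    containing = sum(1 for doc in corpus if term in doc)
    return math.log(len(corpus) / (1 + containing))


def tf_idf(corpus: Sequence[Sequence[str]]) -> List[Dict[str, float]]:
    """TF-IDF score of every word in every document of the corpus."""
    return [
        {
            term: freq * inverse_document_frequency(term, corpus)
            for term, freq in term_frequency(doc).items()
        }
        for doc in corpus
    ]


def gini_index(labels: Sequence[str]) -> float:
    """Gini(X) = 1 - sum of squared class frequencies."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def gini_split(left: Sequence[str], right: Sequence[str]) -> float:
    """Size-weighted Gini index of a two-way split of a dataset."""
    n = len(left) + len(right)
    return len(left) / n * gini_index(left) + len(right) / n * gini_index(right)


# Hypothetical toy corpus of already segmented documents.
corpus = [["chanjuan", "yue", "feng"], ["yutu", "yue"], ["feng", "shan"]]
scores = tf_idf(corpus)
```

Note that with 1 added to the denominator, a word that occurs in every document receives a slightly negative IDF, so some implementations also add 1 to the numerator or to the final value; which variant the paper uses cannot be determined from the text.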

Experimental Method.
In the era of big data, big data analysis methods are indispensable. The most basic and essential of these methods is visualization, because using visualization while querying ancient literature displays the data more intuitively and lets users quickly obtain the data they need. Data mining uses clustering, segmentation, outlier analysis, and other algorithms, which make it possible to go deep into a large volume of ancient literature data and give it high mining value. A semantic engine can intelligently extract information from various documents. In addition, the era of big data also produces a large amount of unstructured data and other complex data types. Therefore, libraries need to use big data analysis tools to analyze, process, and mine the value of data [17,18].
(1) First, the application of visual analysis in the integration of ancient literature information resources. Visual analysis lets the analysis process and results interact with users through data cubes, trend graphs, tag clouds, and other graphical elements, which makes it convenient for users to customize processing tasks and understand mining results. For example, the number of nodes in a social graph can be limited, the high-weight nodes specified by the user can be displayed, and the visual graph can be simplified to meet the user's requirements and aid understanding [19].
(2) Second, the application of semantic analysis in the integration of ancient literature information resources. At present, retrieval of ancient literature in China mainly uses entry segmentation to retrieve keywords, metadata, or literature structure. Semantic analysis is mainly used in the retrieval of archival information resources, where it can actively obtain the corresponding information from the data, make up for the shortcomings of traditional archival information retrieval, and improve the recall and precision of literature retrieval [20,21].
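The graph simplification example in item (1), which limits the number of nodes while always showing the nodes the user asks for, can be sketched as follows. The function name, the tie-breaking rule, and the sample data are hypothetical; this is one simple way to realize the idea, not the platform's actual code.

```python
from typing import AbstractSet, Dict, List, Set, Tuple

Edge = Tuple[str, str]


def simplify_graph(
    node_weights: Dict[str, float],
    edges: List[Edge],
    max_nodes: int,
    pinned: AbstractSet[str] = frozenset(),
) -> Tuple[Set[str], List[Edge]]:
    """Keep user-pinned nodes plus the highest-weight nodes up to max_nodes.

    Pinned nodes are always kept, even if there are more of them than
    max_nodes. Only edges whose two endpoints are both kept are returned.
    """
    if max_nodes < 0:
        raise ValueError("max_nodes must be non-negative")
    pinned_present = {node for node in node_weights if node in pinned}
    # Sort the remaining nodes by descending weight, then by name for ties.
    candidates = sorted(
        (node for node in node_weights if node not in pinned),
        key=lambda node: (-node_weights[node], node),
    )
    remaining = max(0, max_nodes - len(pinned_present))
    kept = pinned_present | set(candidates[:remaining])
    kept_edges = [(a, b) for a, b in edges if a in kept and b in kept]
    return kept, kept_edges


# Hypothetical social graph of writers with illustrative weights.
weights = {"writer_a": 9.0, "writer_b": 7.0, "writer_c": 5.0,
           "writer_d": 3.0, "writer_e": 1.0}
links = [("writer_a", "writer_b"), ("writer_a", "writer_c"),
         ("writer_b", "writer_e"), ("writer_d", "writer_e")]
nodes, visible_links = simplify_graph(weights, links, max_nodes=3,
                                      pinned={"writer_e"})
```

Here the low-weight node `writer_e` stays visible because the user pinned it, and the two remaining slots go to the highest-weight nodes.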

Sample Collection.
This paper focuses on literary works and the related collation, annotation, and textual research results and divides the scope of each subtopic according to genre. The first subproject is responsible for the collation, mining, and database construction of the wisdom data. The second subproject is responsible for the collation, mining, and database construction of the wisdom data. The third subproject is responsible for the collation, mining, and database construction of wisdom data of novels. The fourth subproject is responsible for the collation, mining, and database construction. After the data source is selected, data entry is performed. The data templates must be determined before data entry [22]. The data templates to be determined in this project fall into three categories: first, text indexing data templates; second, interpretation data templates; third, templates that introduce other existing data standards. Because errors are inevitable in manual input and computer recognition, data cleaning must be carried out before the final data are put into storage. First, self-designed tool software is used to clean up objective data errors, which are then verified and revised manually. Then, the person in charge of each subproject checks and reviews the subjective data to ensure the accuracy of the data [23]. The fifth subtopic is divided into four steps. First, the intelligent interpretation database is constructed from the entered interpretation data. Second, the text database is constructed from the text data. Third, the intelligent interpretation database and the text database are linked to build the intelligent reading platform. Finally, after the platform is completed, mobile application software will be developed so that mobile users can also use the platform normally and can give timely feedback on their opinions and individual needs [24].
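The template-then-clean workflow described above can be sketched as follows: a record template is fixed before entry, objective errors such as stray whitespace are corrected automatically, and anything the tool cannot decide is reported for manual review. The field names, the allowed genre list, and the checks are hypothetical, since the paper does not specify the templates or the cleaning tool.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Hypothetical controlled vocabulary; the paper does not list its genres.
ALLOWED_GENRES = {"poetry", "fu", "prose", "novel"}


@dataclass
class TextIndexRecord:
    """Hypothetical text indexing template fixed before data entry."""
    title: str
    author: str
    genre: str
    text: str


def clean_record(record: TextIndexRecord) -> Tuple[TextIndexRecord, List[str]]:
    """Fix objective errors and list the issues that need manual review."""
    cleaned = TextIndexRecord(
        title=" ".join(record.title.split()),    # collapse repeated whitespace
        author=" ".join(record.author.split()),
        genre=record.genre.strip().lower(),      # normalize case and padding
        text=record.text.strip(),
    )
    issues: List[str] = []
    if not cleaned.title:
        issues.append("missing title")
    if not cleaned.text:
        issues.append("empty text")
    if cleaned.genre not in ALLOWED_GENRES:
        issues.append(f"unknown genre: {cleaned.genre!r}")
    return cleaned, issues


# Hypothetical entries: one with only formatting noise, one needing review.
good, good_issues = clean_record(
    TextIndexRecord("  Sample   Title ", "Author A", " FU ", " body text "))
_, bad_issues = clean_record(TextIndexRecord("", "Author B", "drama", "  "))
```

Separating automatic fixes from reported issues mirrors the two-stage process in the text, where tool-based cleaning is followed by manual verification and review.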

Statistics.
The biggest difference between the information platform built in this paper and existing ancient books databases is that it offers semantic analysis, subject retrieval, data generation, and similar functions. After text learning, the computer can automatically classify related vocabulary. For example, the moon image in classical poetry has many alternative names, such as "chanjuan," "yutu," "guipo," "yupan," "yugou," "yujing," "chanpo," "binglun," and so on. It is difficult to compare and analyze manually a large number of works involving the image of the moon. With the help of the computer, we can easily extract all the works related to the moon image in the poems and fu of the Han, Wei, and Six Dynasties and then analyze the evolution of its meaning from a diachronic perspective and its emotional connotation from a synchronic perspective [25]. The data retrieved by this platform can be automatically processed to generate new documents. Existing ancient books databases are static and support only full-text search. They cannot change the structure of the original data and cannot automatically reorganize the original data and information to generate new knowledge. The platform is dynamic and can automatically generate new knowledge modules according to the needs and instructions of users. The first function is the integration of literature research materials, which brings together massive, scattered, and fixed paper documents, such as chronologies of writers' works, notes on the chronology of works, and materials on literary criticism. The second is the integration of classics and history [26]. The platform can not only query and retrieve times, places, personal names, place names, and related subject literature but also generate data. Information can be freely associated to generate new knowledge, discover new problems, locate events in time and space, and compile statistics.
Whether it is literature research, text analysis, or data statistics, we can use the information and data of the platform efficiently and quickly [27,28].
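The moon-image example above can be sketched as an alias index that maps every alternative name to one image, so that works can be retrieved and counted by period regardless of which name they use. The alias list comes from the text; the `Work` structure, the function names, and the sample works are hypothetical, and word segmentation is assumed to have been done beforehand.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass
class Work:
    title: str
    period: str
    tokens: List[str]  # segmented text, assumed to be produced upstream


def build_alias_index(image_aliases: Dict[str, Iterable[str]]) -> Dict[str, str]:
    """Map every alias (and the image name itself) to its image."""
    index: Dict[str, str] = {}
    for image, aliases in image_aliases.items():
        index[image] = image
        for alias in aliases:
            index[alias] = image
    return index


def works_with_image(works: List[Work], index: Dict[str, str], image: str) -> List[Work]:
    """Works that mention the image under any of its names."""
    return [w for w in works if any(index.get(t) == image for t in w.tokens)]


def image_counts_by_period(works: List[Work], index: Dict[str, str], image: str) -> Counter:
    """Number of mentions of the image per period, for diachronic comparison."""
    counts: Counter = Counter()
    for work in works:
        hits = sum(1 for t in work.tokens if index.get(t) == image)
        if hits:
            counts[work.period] += hits
    return counts


ALIASES = {"moon": ["chanjuan", "yutu", "guipo", "yupan",
                    "yugou", "yujing", "chanpo", "binglun"]}
alias_index = build_alias_index(ALIASES)

# Hypothetical sample works; titles and token lists are placeholders.
sample_works = [
    Work("work_1", "Han", ["yutu", "feng"]),
    Work("work_2", "Wei", ["chanjuan", "shan", "yupan"]),
    Work("work_3", "Wei", ["shan"]),
]
moon_works = works_with_image(sample_works, alias_index, "moon")
period_counts = image_counts_by_period(sample_works, alias_index, "moon")
```

A plain keyword search would need one query per alias, whereas the index lets a single subject query cover all of them.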

Summary of Research Results.
According to users' information needs and consulting topics, the platform provides relevant bibliographies, reference books, and periodicals and offers a convenient way to consult literature and solve difficult problems. Efforts should be made to cultivate users' enthusiasm for and interest in using information resources, expand the social influence of the library's ancient documents, make ancient documents more attractive to readers and information users, and socialize the information service for ancient documents. At the same time, according to the actual situation of social information users, training courses on information retrieval are held from time to time, so as to improve users' scientific and technological literacy, cultivate their ability to query and digest information, and improve the efficiency of the information service. In the era of big data, the coexistence of dynamic and interactive data makes it difficult to control data accurately in real time. The network has built an interactive bridge between the ancient literature library and its users. The library can transmit information to users through its website, and the public can participate in the interaction, so that information flows in both directions. How to adapt to the needs of the era of big data and provide big data for academic research is a practical problem in the field of classical literature research. Classical literature has not previously been used as a source of statistical data. At present, popular digital ancient books databases, such as Zhonghua Book Company's classic ancient books database, the China basic ancient books database, and the electronic version of Sikuquanshu, support only keyword retrieval and not data generation or statistics.
This project transforms massive, scattered, and static paper literature into a living multifunctional database that can be retrieved, counted, generated, and combined from multiple perspectives. It is not only a large-scale integration of research results but also an integration and expansion of functions. This is a significant innovation in ways of thinking and in research methods. To give full play to the academic value and practical function of the research results, it opens a new direction for future literature research, in which research results are digitized synchronously.

Algorithm Analysis of Big Data Technology and Resource Optimization.
The cumulative number is shown in Table 1. In the process of integrating ancient literature information resources, big data technology can span systems, platforms, and data structures and make the various steps of information integration cooperate smoothly. At the same time, the use of big data technology greatly reduces the response time of data acquisition, processing, and analysis, which not only helps to improve work efficiency but can also significantly reduce operating expenses. In the construction of information resource integration, feedback and evaluation are essential links in the construction cycle. Through evaluation, we can learn how the literature resources are being used, and feedback based on the evaluation can be used to modify the resource system and improve its service.
In the era of big data, the traditional ancient literature management mode suffers from slow query speed and low utilization, and it cannot meet the needs of the times. It is urgent to establish a modern digital ancient literature management system. Because a digital ancient literature management system saves cost and is highly efficient, it can adapt to the needs of the times. The document information resource system is shown in Figure 2. The various types of databases will provide more functions in addition to the most basic interpretation functions. For example, the place-name database can not only give readers an intuitive impression of where a place is located but also generate a visual travel route of a writer based on the writer's deeds when readers consult documents such as the writer's biography.
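The travel-route function of the place-name database can be sketched as follows: dated events from a writer's biography are looked up in a gazetteer and ordered chronologically, producing a sequence of coordinates that a map component could draw. The data structures, names, and coordinates are hypothetical placeholders; the paper does not describe the platform's schema.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class Place:
    name: str
    latitude: float
    longitude: float


@dataclass(frozen=True)
class Event:
    year: int
    place_name: str
    description: str


def build_route(events: Iterable[Event],
                gazetteer: Dict[str, Place]) -> List[Tuple[int, Place]]:
    """Chronological list of (year, place) stops for one writer.

    Events at places missing from the gazetteer are skipped, and consecutive
    events at the same place are merged into a single stop.
    """
    route: List[Tuple[int, Place]] = []
    for event in sorted(events, key=lambda e: e.year):
        place = gazetteer.get(event.place_name)
        if place is None:
            continue  # unknown place name; could be flagged for annotation
        if route and route[-1][1] == place:
            continue  # still at the same place, no new stop
        route.append((event.year, place))
    return route


# Hypothetical gazetteer and biography events with placeholder coordinates.
gazetteer = {
    "place_a": Place("place_a", 34.0, 108.0),
    "place_b": Place("place_b", 36.0, 114.0),
}
events = [
    Event(210, "place_b", "takes up a post"),
    Event(204, "place_a", "early years"),
    Event(211, "place_b", "writes at court"),
    Event(215, "place_unknown", "journey"),
    Event(220, "place_a", "returns"),
]
route = build_route(events, gazetteer)
```

The resulting list of stops can be passed to any plotting or mapping tool, which keeps the route logic independent of the visualization layer.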
The platform is based on the principles of openness, sharing, collaboration, and service and is used by users of different levels and types. Users can upload data, add data, and give feedback. They can manually enter or paste related documents, which are then indexed automatically, and works are automatically annotated with their dates. The platform will also develop crowdsourcing functions, and knowledgeable contributors at home and abroad are welcome to participate in the construction of the data to improve the accuracy of the database. The structure is shown in Figure 3. The whole process of document information resources, from production, transmission, and distribution to development and utilization, is a very complex piece of systems engineering that involves multiple stakeholders in different regions and industries. Each stakeholder is connected with the others yet remains relatively independent and has its own responsibilities. The allocation of information resources is bound to involve the distribution of interests among the various stakeholders. Therefore, we must take the perspective of society as a whole, scientifically coordinate the relationships among stakeholders, plan and lay out the province's literature and information resources as a whole, try our best to maximize the interests of all parties, and strive to achieve a "Pareto optimal" allocation of literature and information resources.

Conclusions
In recent years, with the rapid development of information technology for literature resources and the advent of the era of big data, great changes have taken place in the utilization of literature information resources. The transformation of traditional paper document resources into data resources and the increasing number of electronic document information resources have brought new challenges to the effective use of information resources. Through big data information fusion technology, the ancient document information resources are processed and then integrated and optimized. The resulting document information resource system improves the effective utilization of document information resources, promotes the informatization of library resources, and provides a new direction for the development and application of big data technology and for the academic application of document information resources.
Data Availability
The data that support the findings of this study are available from the corresponding author upon reasonable request.