Research on the Service Mode of the University Library Based on Data Mining

In the digital information age, data mining technology is becoming more widely used in libraries because of its practical benefits. In the context of big data, how to mine data efficiently, extract features, and provide users with high-quality personalized services is one of the important problems to be solved in current university library big data applications. Brain computing is a comprehensive processing behavior of the human brain simulated by the computer; it can comprehensively analyze a variety of information and plays a good guiding role in processing library service behavior. This paper briefly introduces the related concepts and algorithms of data mining technology, studies in depth the classical association rule algorithm, namely, the Apriori algorithm, and analyzes the necessity and feasibility of applying data mining technology to university library management. The design idea and functional goal of the college book intelligent recommendation system are based on the decision tree method and the association rule analysis method. Through application research on data mining technology in the personalized service of the university library, combined with actual work, this paper proposes association rule mining in the university library system. The research further elaborates on the system architecture, data processing, mining implementation algorithms, and application of mining results. The experimental results are of value for university libraries that want to explore personalized services, provide book recommendation services, and make decisions to optimize the library's collection layout.


Introduction
The concept of data mining originated at the 11th International Conference on Artificial Intelligence held in Detroit, USA, in August 1989. At that time, the concept of knowledge discovery in databases (KDD) was proposed, which refers to the extraction or mining of hidden information from a large amount of data. Data mining technology applies statistical and artificial intelligence techniques to integrate various types of data, extract useful information from massive data, and explore the underlying rules, thereby improving the efficiency of production and service [1]. The data mining analysis methods include the following: description and visualization, that is, using visualization tools to display, analyze, and drill into data so that the analysis results are more vivid and profound; classification, that is, screening data through a preset classification model; estimation, that is, obtaining the value of a continuous variable from the collected data and then classifying according to a preset threshold (such as 0-9); prediction, that is, predicting unknown variables by means of a classification or estimation model; correlation grouping or association rules, that is, using association rules and sequence analysis to discover what is likely to happen together; clustering, that is, grouping records so that similar records fall into one cluster and each group has predictive or implied features; and mining of complex data types (text, Web, graphics, video, audio, etc.). Data mining technology requires database systems to provide efficient storage, indexing, and query processing support and relies on high-performance (parallel) computing techniques when dealing with massive datasets, such as distributed technology and crawling technology for rapid crawling of network information.
Compared with the long development of libraries, data mining technology has grown out of computer science research over little more than a decade. In the middle and late 20th century, foreign scholars began to study the application of data mining technology in libraries. Domestically, with the development of the information age and the gradual accumulation of digital resources, digital libraries came into being [2]. University libraries began to introduce automated database-based management systems, and the number of databases increased dramatically. The application of data mining has gradually broadened and has penetrated into the business fields of university library management and information services.
The specific contributions of this paper include the following: (1) This paper introduces the related concepts and algorithms of data mining technology and studies in depth the Apriori algorithm, a classical algorithm of association rules. (2) This paper analyzes the necessity and feasibility of applying data mining technology to university library management. (3) The system architecture, data processing, mining algorithm, and application of mining results are described. (4) A performance analysis of the proposed algorithm and an evaluation of the algorithm with respect to other existing algorithms are given.
The rest of this paper is organized as follows. Section 2 discusses the basic algorithm of data mining, followed by university library personalized services discussed in Section 3. The analysis of experimental results is discussed in Section 4. Section 5 concludes the paper with a summary and future research directions.

Definition of Data Mining.
At present, there are many definitions of data mining. In short, data mining is to extract or "dig" knowledge from massive data. Currently, the broad definition of data mining is as follows: data mining is the process of mining useful content from a large amount of data placed in a database, data warehouse, or other information bases. A typical data mining system generally has the following components, as shown in Figure 1.
Data mining integrates multidisciplinary technologies, including database technology, statistics, machine learning, pattern recognition, artificial neural networks, data visualization, knowledge extraction, image and signal processing, and spatial data analysis. Data mining systems can also integrate techniques from information extraction, computer graphics, economics, or psychology. Through data mining, interesting knowledge and laws implicit in massive data can be found in the database. These laws or knowledge can be applied in business areas such as guiding decision-making, process control, sales promotion, and medical diagnosis. The data mining system can also browse and store knowledge quickly and, at the same time, facilitate research and study. Therefore, data mining is considered to be one of the most important frontier disciplines and the most promising interdisciplinary subject in the information industry [3].

Data Mining Process.
Data mining can be understood as a process of human-computer interaction carried out through computer processing, manual analysis, and other methods [3]. The process is complete but iterative and mainly includes data preparation, data selection, data preprocessing, data mining, and model and pattern transformation. The five stages of data mining are shown in Figure 2.

Decision Tree Classification Algorithm.
Decision trees give simple and efficient classification. They reflect the influence of different attributes on an instance by constructing a tree-like structure whose leaf nodes represent the categories to which instances belong. Each branch from the root to a leaf node is equivalent to a conjunction rule, so the decision tree is equivalent to a collection of multiple rules.
Decision trees can be divided into two types, classification trees and regression trees, each with its own strengths. The classification tree constructs a tree structure for discrete attribute variables; its main function is to label and classify the data. The regression tree constructs a tree structure for continuous attribute variables; its main function is to predict the value of the target variable. In general, given a new data record, the decision tree predicts the category to which the record belongs through its constructed form. The advantages of the decision tree are that its structure is simple and easy to understand, its classification accuracy is high, and overfitting of the data is easy to mitigate. The disadvantage is that it handles relatively simple data well but has difficulty processing complex data [4].
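To make the idea of attribute influence concrete, the following minimal sketch scores candidate split attributes by information gain, as an ID3-style classification tree would. The paper does not state which splitting criterion is used, so the criterion, the function names, and the sample reader records are all assumptions made for illustration.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(records, attribute, target):
    """Entropy reduction obtained by splitting records on one discrete attribute."""
    base = entropy([r[target] for r in records])
    groups = {}
    for r in records:
        groups.setdefault(r[attribute], []).append(r[target])
    remainder = sum(len(g) / len(records) * entropy(g) for g in groups.values())
    return base - remainder


def best_split(records, attributes, target):
    """Attribute with the highest information gain; it would label the root node."""
    return max(attributes, key=lambda a: information_gain(records, a, target))


# Hypothetical reader records, not taken from the paper's dataset.
READERS = [
    {"major": "computer", "year": 1, "category": "web design"},
    {"major": "computer", "year": 2, "category": "web design"},
    {"major": "computer", "year": 1, "category": "web design"},
    {"major": "literature", "year": 1, "category": "fiction"},
    {"major": "literature", "year": 2, "category": "fiction"},
    {"major": "literature", "year": 2, "category": "fiction"},
]

root_attribute = best_split(READERS, ["major", "year"], "category")
```

In this toy data, the major separates the borrowed categories perfectly while the year barely helps, so the major would be chosen as the root test. Each root-to-leaf path then reads as one conjunction rule.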

Artificial Neural Network.
The artificial neural network is modeled on the characteristics of biological neural networks in animals. It is often simply called a neural network or connection model and is a parallel distributed processing model. Compared with traditional artificial intelligence and information processing technology, the mechanism of the neural network is completely different, and it has the characteristics of adaptability, controllability, and multilayer training and learning. At present, neural networks are used in a wide range of fields such as image processing, predictive classification, pattern recognition, automatic control, machine learning, and medical diagnosis. Artificial neural networks can predict the outcomes of complex relationships, but because of the complexity of their internal structure, the predictions cannot be analyzed in detail. In addition, when the input layer has too many input neuron nodes, the prediction results obtained after training may be unsatisfactory. Therefore, in practical applications, a combination of decision trees and artificial neural networks can be adopted [5].

Association Rules.
An association rule describes a regularity in the values of two or more variables. Data in a database are generally associated with one another rather than existing in isolation. Correlation analysis discovers these associations and thereby obtains the dependence between data, which is convenient for future data design and analysis. Association rule mining consists of two stages: first, analyzing the data to obtain the frequent itemsets appearing in the dataset; second, generating association rules from the frequent itemsets obtained in the previous step.
Applying association rules to the personalized library management system helps the library quickly locate related issues when a problem occurs and identify the content that current readers are interested in by analyzing their retrieval information. The purpose of mining this information is to push the corresponding information to readers more effectively.
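The two quantities that drive both stages are the support of an itemset and the confidence of a rule. The sketch below computes them using the standard definitions (support is the fraction of transactions containing the itemset; confidence of A ⇒ B is support(A ∪ B) / support(A)). The loan data and function names are hypothetical.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item of the itemset."""
    itemset = frozenset(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimated probability of the consequent given the antecedent."""
    joint = support(transactions, set(antecedent) | set(consequent))
    base = support(transactions, antecedent)
    return joint / base if base else 0.0


# Hypothetical borrowing transactions: one set of book classes per reader.
LOANS = [frozenset(t) for t in (
    {"database", "programming"},
    {"database", "programming", "networks"},
    {"database", "networks"},
    {"programming"},
    {"networks", "security"},
)]

sup = support(LOANS, {"database", "programming"})        # 2 of 5 readers
conf = confidence(LOANS, {"database"}, {"programming"})  # 2 of the 3 database readers
```

Here two of the five readers borrowed both classes, and two of the three readers who borrowed database books also borrowed programming books.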

Apriori Algorithm.
The Apriori algorithm was proposed by R. Agrawal and R. Srikant in 1994, building on the association rule mining problem introduced by Agrawal et al. in 1993. It is a classical algorithm for association rule mining, and many later algorithms are based on its idea [3]. The name of the algorithm derives from its use of prior knowledge about frequent itemsets: every nonempty subset of a frequent itemset must itself be frequent, so as soon as an itemset is found to be infrequent, none of its supersets needs to be tested [6]. The flowchart of the first stage of the Apriori algorithm is shown in Figure 3 [7].
The Apriori algorithm uses an iterative, level-wise search in which candidate sets are used to find frequent itemsets layer by layer, mainly through two steps: connection (join) and pruning. The algorithm scans the database for the first time and finds all the frequent 1-itemsets. It then composes candidate 2-itemsets by joining the frequent 1-itemsets, scans the database again, and deletes (prunes) the candidates whose support is below the minimum support, which yields the frequent 2-itemsets. Joining and pruning are then applied to the frequent 2-itemsets to find the frequent 3-itemsets, and the iteration continues until no itemset reaches the minimum support [7]. At that point, the mining of frequent itemsets ends. The procedure is described in Algorithm 1. The next step is to mine association rules from the frequent itemsets. A rule with a confidence greater than the minimum confidence is called a frequent association rule. The algorithm first generates all the association rules, which may be frequent or infrequent. Then, based on the minimum confidence, the rules whose confidence exceeds the minimum confidence are selected to obtain the required frequent association rules.
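The join, prune, and rule-generation steps described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the system's implementation: it recounts support with a full scan per level, treats the thresholds as inclusive, and uses hypothetical names (`apriori`, `rules`, `SAMPLE`).

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support} for every frequent itemset."""
    n = len(transactions)

    def frequent(candidates):
        # One scan of the database per level to count each candidate.
        counts = {c: sum(c <= t for t in transactions) for c in candidates}
        return {c: k / n for c, k in counts.items() if k / n >= min_support}

    level = frequent({frozenset([i]) for t in transactions for i in t})
    result = dict(level)
    k = 2
    while level:
        prev = set(level)
        # Join: unite frequent (k-1)-itemsets whose union has exactly k items.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune: drop candidates that have an infrequent (k-1)-subset.
        candidates = {c for c in candidates
                      if all(frozenset(s) in prev for s in combinations(c, k - 1))}
        level = frequent(candidates)
        result.update(level)
        k += 1
    return result


def rules(frequent_itemsets, min_confidence):
    """Return (antecedent, consequent, support, confidence) for each kept rule."""
    out = []
    for itemset, sup in frequent_itemsets.items():
        for r in range(1, len(itemset)):
            for antecedent in map(frozenset, combinations(itemset, r)):
                conf = sup / frequent_itemsets[antecedent]
                if conf >= min_confidence:
                    out.append((antecedent, itemset - antecedent, sup, conf))
    return out


# Hypothetical transactions over three book classes A, B, and C.
SAMPLE = [frozenset(t) for t in ("ABC", "AB", "AC", "BC", "ABC")]
FREQUENT = apriori(SAMPLE, min_support=0.5)
STRONG = rules(FREQUENT, min_confidence=0.7)
```

With these thresholds, the three single classes and the three pairs are frequent, while {A, B, C} (support 0.4) is counted once as a candidate and then discarded. Every subset of a frequent itemset is itself frequent, so the lookup `frequent_itemsets[antecedent]` in `rules` always succeeds.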

Library Personalized Service Model.
The library personalized service system based on association rule mining (as shown in Figure 4) mainly implements two functions: first, the association rule mining function, which mines association rules from readers' borrowing data to find potential regularities; second, the personalized service function, which applies the generated association rules to the library's personalized service [8]. The system runs on the Windows Server 2003 operating system and adopts the B/S mode. The foreground uses the Visual Studio 2005 integrated environment, with Visual C as the development tool; the server uses a background SQL Server 2005 database to save the user data; the data mining algorithm is the Microsoft association algorithm [9]. The library personalized service system mainly includes three functional modules: data processing, association rule mining, and personalized service (not implemented). The system first performs data processing, whose main functions include data import, data integration, data cleansing, data filtering, data conversion, and data reduction. This is a very important process that directly affects the efficiency of the subsequent association rule mining. Then, according to the two association mining tasks, two models are established: the association model between reader features and borrowed books and the association model among the books borrowed by readers. Finally, the mined association rules are applied to the reader personalized service.
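As an illustration of the data processing module, the sketch below turns raw loan rows into one transaction per reader. The field names (`reader_id`, `book_class`), the classification codes, and the specific cleansing rules are assumptions made for this example; the paper does not describe its schema at this level of detail.

```python
from collections import defaultdict


def build_transactions(loan_records):
    """Clean raw loan rows and group the borrowed book classes by reader."""
    baskets = defaultdict(set)
    for row in loan_records:
        reader = (row.get("reader_id") or "").strip()
        book_class = (row.get("book_class") or "").strip().upper()
        if not reader or not book_class:
            continue  # data cleansing: skip incomplete rows
        baskets[reader].add(book_class)  # a set absorbs repeat loans and renewals
    # Data reduction: a basket with one class cannot produce a rule.
    return {r: frozenset(b) for r, b in baskets.items() if len(b) >= 2}


# Hypothetical raw rows, including duplicates and incomplete records.
RAW_LOANS = [
    {"reader_id": "r1", "book_class": "tp311"},
    {"reader_id": "r1", "book_class": "TP312 "},
    {"reader_id": "r1", "book_class": "TP311"},
    {"reader_id": "r2", "book_class": "TP393"},
    {"reader_id": "", "book_class": "TP311"},
    {"reader_id": "r3", "book_class": None},
    {"reader_id": "r4", "book_class": "TP391"},
    {"reader_id": "r4", "book_class": "TP317"},
]

TRANSACTIONS = build_transactions(RAW_LOANS)
```

Normalizing case and whitespace before grouping keeps the same class from being counted as two different items, which would otherwise dilute its support.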

The Way the Library Is Personalized.
The library personalized service provides users with information resources and functions that meet their individual requirements according to their information usage behaviors, habits, hobbies, characteristics, and specific needs. It comprehensively considers the reader's individual features and special information needs in order to provide readers with a personalized information environment [10].
According to whether the user actively provides the demand information [11], the library personalized information service mainly takes two forms: explicit feedback and implicit feedback. The two are distinguished by whether the user needs to state the demand. Their differences and composition are shown in Figure 5.

Application of Association Rules in Book-Borrowing Data.
The library management system is an indispensable part of library management work, and its functions are very important for library administrators and users. Therefore, the library management system should be able to provide managers and readers with sufficient information quickly. It is generally divided into the following subsystems: the book management subsystem, the book circulation subsystem, the reader management subsystem, and the reader query subsystem. Each subsystem contains several relational tables. Among them, the book circulation subsystem handles one of the most important tasks of the library: it deals directly with readers and processes their borrowing, returns, and renewals. The data mining in this section works on this part of the data [12]. The task of mining the circulation data of books with association rules is mainly to find regularities in two aspects by analyzing the readers' historical data. This provides good guidance for readers' future borrowing.

Discover the Association between Different Items in the Transaction Database, Reflecting the Reader's Borrowing Mode.
For example, suppose that 60% of the readers who borrowed book A also borrowed book B. Once this borrowing relationship between book A and book B is found, book B can be recommended to readers who borrow book A. Proper shelf placement of associated books can also increase the number of loans or purchases. The KDD process is shown in Figure 6 [14]. It can be summarized into three parts: data preprocessing, data mining, and interpretation and evaluation of results.
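The recommendation step in this example can be sketched as a lookup over mined rules: a rule fires when its whole antecedent is in the reader's borrowing history, and unseen consequents are ranked by confidence. The rule list, the confidence values other than the 60% from the example, and the function name are hypothetical.

```python
def recommend(borrowed, rules, top_n=3):
    """Rank books the reader has not borrowed by the best confidence of any fired rule."""
    borrowed = set(borrowed)
    scores = {}
    for antecedent, consequent, conf in rules:
        if set(antecedent) <= borrowed:  # the rule fires for this reader
            for book in set(consequent) - borrowed:
                scores[book] = max(scores.get(book, 0.0), conf)
    # Highest confidence first; ties are broken alphabetically for stable output.
    return sorted(scores, key=lambda b: (-scores[b], b))[:top_n]


# Hypothetical rules as (antecedent, consequent, confidence).
RULES = [
    ({"book A"}, {"book B"}, 0.60),
    ({"book A", "book C"}, {"book D"}, 0.75),
    ({"book E"}, {"book B"}, 0.90),
]

for_a = recommend({"book A"}, RULES)
for_a_and_c = recommend({"book A", "book C"}, RULES)
```

Taking the maximum confidence per book, rather than summing, avoids overrating a book just because several overlapping rules point to it. That is one possible design choice among several.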

Microsoft Association Algorithm.
The Microsoft association algorithm is very sensitive to parameter settings. If the parameters are not set properly, too many or too few rules will be generated. It mainly involves the following three parameters.
Support.
Support describes the frequency of occurrence of an itemset [15], and its value affects which itemsets are generated.
Minimum probability means that the user is only interested in rules that reach the specified probability. Its value is set in the same way as the minimum support. Probability has no effect on the itemsets but does affect the formation of rules. Specifying a minimum probability value limits the number of rules generated [12].
Figure 5: Explicit and implicit feedback in the library personalized information service. Explicit feedback: (a) e-mail delivery, in which the system tracks and collects the information columns subscribed to by the user and, once a column is updated, sends the related information to the user's personal mailbox; (b) information customization, in which customized web pages are built from user-submitted needs and interest preferences by searching the resource library for information that meets them. Implicit feedback: the user does not submit personal information; the service system uses a mining algorithm to analyze the user's usage history, actively acquires the user's needs and preferences, and intelligently provides relevant information. Other labels in the figure: based on the B/S structure; running on the server side.

Importance.
It is also referred to as interest or gain in some literature. It affects the generation of both itemsets and rules, and it is defined separately for itemsets and for rules. The importance of an itemset is defined using the following formula:
$$\mathrm{Importance}(\{A, B\}) = \frac{P(A, B)}{P(A)\,P(B)} \tag{3}$$
It describes the magnitude of the influence of itemset A on itemset B. Its value range is [0, +∞). If importance = 1 [16], A and B are independent items, that is, purchasing A and purchasing B are two independent events; if importance < 1, A and B are negatively correlated, that is, a customer who purchases A is unlikely to purchase B; if importance > 1, A and B are positively correlated, that is, a customer who purchases A may also purchase B. The importance of a rule is calculated using the following formula:
$$\mathrm{Importance}(A \Rightarrow B) = \log \frac{P(B \mid A)}{P(B \mid \neg A)} \tag{4}$$
From the definition in equation (4), a value of 0 means that A and B have no relevance; a positive value means that when A is true, the probability of B increases; a negative value means that when A is true, the probability of B decreases [17].

Mining the Association between Reader Characteristics and Borrowed Books.
The above experiment mines association rules based on the relationship between readers' characteristics and the classes of books they borrow [18]. When the support is 0.1 and the confidence = 0.4, 186 rules are obtained. The experimental results are shown in Table 1. The association rule comparison diagram is shown in Figure 7.
By analyzing the above association rules, one can find the following regularities: (1) First-year (08) computer majors who borrowed web design books account for 10.3% of all computer department readers, and those who borrowed multimedia books account for 12.6%. In fact, in the students' second year, the computer department reorganized the original three computer application classes into a computer application class, a web page class, and a graphic image processing class according to the students' interests, which is consistent with the results obtained by this association mining. (2) 56.2% of male students borrowed online books, and male students account for 68.5% of the entire computer department. The proportion is quite high, so online books can be recommended to male students. (3) Female students who borrowed web design and graphic image processing books account for 14.8% and 15.6% of all computer department students, respectively, so these categories can be considered when personalizing services for them. Next, the minimum support is set to 0.05, 0.1, 0.15, and 0.2 and the minimum confidence to 0.2, 0.3, 0.4, 0.5, and 0.6, Steps 5 and 6 of the above experiment are repeated, and the relationship among support, confidence, and the number of rules is obtained. The relationship table is shown in Table 2.
The relationship between minimum support, minimum confidence, and rule number is shown in Figure 8.
Through the above experiments, it is found that choosing appropriate minimum support and minimum confidence values is the key to mining effective association rules [19]. These values affect the number of rules produced and the level of the concept layer. The volume of readers' borrowing data is large, and it is impossible to predict in advance which support value will filter out the appropriate data. Therefore, the minimum support and minimum confidence thresholds should be adjusted according to the number of rules actually generated and the predetermined target, so that neither too many nor too few rules are produced. In addition, the mining shows that the system is sensitive to support: when the support value is greater than 0.2, no rules can be mined [5].
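The kind of threshold sweep behind Table 2 can be reproduced in miniature. The sketch below counts rules by brute-force enumeration on a toy dataset for every pair of thresholds. It illustrates the procedure only: the data, the itemset size limit, and the function names are hypothetical, and the counts bear no relation to the paper's results.

```python
from itertools import combinations


def count_rules(transactions, min_support, min_confidence, max_size=3):
    """Count rules A => B whose itemset support and confidence reach the thresholds."""
    n = len(transactions)
    items = sorted({i for t in transactions for i in t})

    def sup(itemset):
        return sum(itemset <= t for t in transactions) / n

    total = 0
    for size in range(2, max_size + 1):
        for combo in combinations(items, size):
            s = sup(frozenset(combo))
            if s < min_support:
                continue
            # Every nonempty proper subset is a candidate antecedent.
            for r in range(1, size):
                for antecedent in combinations(combo, r):
                    if s / sup(frozenset(antecedent)) >= min_confidence:
                        total += 1
    return total


def sweep(transactions, supports, confidences):
    """Rule count for every (minimum support, minimum confidence) pair."""
    return {(s, c): count_rules(transactions, s, c)
            for s in supports for c in confidences}


# Hypothetical transactions over three book classes A, B, and C.
TOY = [frozenset(t) for t in ("ABC", "AB", "AC", "BC", "ABC")]
GRID = sweep(TOY, supports=[0.3, 0.5, 0.7], confidences=[0.4, 0.6])
```

Raising either threshold can only remove rules, never add them, so the counts fall monotonically across the grid and reach zero once the minimum support exceeds the support of every pair. The same effect explains why no rules remain above a certain support value.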

Mining the Association between Books.
For this experiment, one only needs to set both the input column and the predictable column to the book classification number and then adjust the algorithm parameters (support = 0.15; confidence = 0.45). Repeating the above experimental process yields 125 association rules between the books borrowed by readers. The experimental results are shown in Table 3. The rules are explained as follows. The first rule: 15.2% of readers borrowed both database theory and system books and programming language and algorithm language books, and readers who borrowed database theory and system books have a 48.7% chance of borrowing programming language and algorithm language books. The second rule: 4% of readers borrowed image processing software books and text information processing books at the same time, and readers of image processing software books have a 56% chance of borrowing text information processing books. The third rule: 15.8% of readers borrowed computer-aided design (CAD), aided graphics, and image processing books [20], and readers of the aided graphics class have a 67.2% chance of borrowing image processing books. The fourth rule: 16.2% of readers borrowed computer security books and network operating system books at the same time, and the confidence of the rule is 52.4%. The fifth rule: 18.5% of readers borrowed both software maintenance books and programming language and algorithm language books, and readers who borrowed software maintenance books have a 47.3% chance of borrowing programming language and algorithm language books.
Finally, a comparison of the rules derived from association mining with the actual work of the college library and the readers' book-borrowing survey shows that the results are relatively close, indicating that the data mining results of this system are effective. However, the number of students in the computer department is relatively small compared with the whole school, most students borrow books according to their own majors, the number of books in the library is limited, and the book renewal period is relatively long. These factors also have some influence and lead to some limitations in the mined association rules.
Algorithm 1: Mining frequent itemsets with the Apriori algorithm.
Step 1: L1 = find_frequent_1-itemsets(D)   // mine the frequent 1-itemsets by scanning the transaction database D
Step 2: for (k = 2; L(k-1) ≠ ∅; k++) {
            Ck = apriori_gen(L(k-1), min_sup)   // call apriori_gen to generate the candidate frequent k-itemsets
Step 3:     for each transaction t ∈ D {   // scan the transaction database D
                Ct = subset(Ck, t)
                for each candidate c ∈ Ct
                    c.count++   // count the occurrences of each candidate frequent k-itemset
            }
Step 4:     Lk = {c ∈ Ck | c.count ≥ min_sup}   // the k-itemsets that satisfy the minimum support are frequent
        }
        return L = ∪k Lk   // merge the frequent k-itemsets (k > 0)

Conclusion
Based on the data mining technology in the literature, this research studies how to use the data in the library management information database. Taking the book management system as an example, we introduce the system structure and business process of the university library management system and study how to build a data warehouse on this basis. We then use the Apriori algorithm and the improved Apriori algorithm to mine data such as the large number of borrowing records held in the library database. The mining shows that readers' borrowing of documents is correlated: different types of readers follow certain regularities in borrowing, and there are certain connections between different disciplines.
By analyzing the relationship between readers and books in the borrowing records, the library administrator can obtain service information to guide book purchasing. This is conducive to the rational allocation of the library's literature resources, improves the utilization of resources, and promotes a virtuous circle of book management. At the same time, this work provides some ideas for others' application research in this area. The results show the effectiveness of the proposed approach [20].

Data Availability
The datasets used and/or analyzed during the current study are available from the corresponding author upon reasonable request.

Disclosure
All authors agreed to submit this version and claimed that no part of this manuscript has been published or submitted elsewhere.

Conflicts of Interest
The authors declare that they have no conflicts of interest.