Resource Collection Algorithm for Entrepreneurship and Employment Education in Universities Based on Data Mining

Graduate unemployment is one of the serious challenges in China, affecting the graduates of a large number of public and private higher education institutions. The collection of entrepreneurial employment education resources in colleges and universities is a basic project and a key link in promoting the rapid development of education informatization. Data mining has applications in different fields such as health care, smart agriculture, smart cities, smart businesses, and education, and plays a particularly vital role in education and business. The applications of data mining provide new technical tools and development directions for the common construction, sharing, and collection of entrepreneurial employment education resources in colleges and universities. The closed nature of teaching resources within colleges and universities prevents external search engines from indexing them, which hinders search and access by teachers and students and seriously affects the smooth implementation of current innovation and entrepreneurship employment work. Aiming at the real demand for entrepreneurial employment education resource collection in colleges and universities and the characteristics of on-campus resources, this study proposes a data mining-based algorithm for entrepreneurial employment education resource collection in colleges and universities. The algorithm obtains entrepreneurial employment demands from the academic affairs system, collects on-campus online teaching resources through internal crawlers, and provides services for teachers, students, and employees through an online teaching resource collection drive subalgorithm and a quick recommendation subalgorithm. We also compared the proposed model with the CLR model. The case analysis and performance experiments show that the proposed algorithm has a good resource mining effect, high user satisfaction, and high recommendation efficiency, occupies fewer system resources, and outperforms the CLR model.


Introduction
Employment has always been a major concern in the growth and economic development of societies and countries. A job provides individuals with a sense of personal purpose and pleasure, as well as the means to sustain themselves and their families, while also contributing to the economy and productivity of the country. The significance of employment cannot be ignored in the establishment of a peaceful, productive, and healthy society. For women, a job can mean financial independence; for children, a parent's job can mean access to school and health-care services; and for jobless young people, a job can be a viable alternative to violence. Employment is essential for long-term growth. Furthermore, a civilized society is one where individuals may live healthy and productive lives, have access to the resources required for a good quality of living, and participate in the activities of society. Jobs therefore play a key role in the development of a society and a country.
Entrepreneurship is the process of starting a business or a series of businesses with the goal of making a profit [1]. Entrepreneurship education is defined in terms of a person's attitude and aptitude to search out investment opportunities in a given environment and to successfully develop and run a business [2]. Entrepreneurship is becoming a more significant factor in economic growth, job creation, social inclusion, poverty alleviation, and innovation development. As a result, politicians, academicians, and international organizations have started to take more interest in the entrepreneurial environment [3]. Universities and business colleges play a critical part in the development of enterprises. Universities have long been known for their missions of preserving and transmitting culture and disseminating knowledge; however, with the current trends and demands of the global market, they are now organizations whose mission is also to promote research findings and their translation into new goods and enterprises [4].
At present, the employment prospects of students at local colleges are not promising, and the employment rate has become the top priority for students when deciding which college to choose for higher studies. Contemporary college students are immersed in a sea of books and exam questions throughout high school in order to achieve outstanding scores. This has had a significant impact on the development of high-quality education [5][6][7][8][9]. The inability to provide high-quality education is also an important contributor to the poor employment rate experienced by graduates of colleges and universities. Among the contributing factors, the social environment and humanistic thought stand out as particularly important and are discussed in more detail in the following subsections.

Factors Relating to the Social Environment.
China's market economy is in a key period of change, but the development of labor-intensive industries, high-tech industries, and knowledge-based businesses in the country is significantly imbalanced, which is a serious issue. According to some perspectives, the flourishing expansion of the market economy does not necessarily result in more and better jobs for contemporary college students, but rather in a higher standard of comprehensive quality demanded of them [10,11]. Moreover, due to the wide availability of higher education, the number of college graduates is increasing with each passing year, resulting in increasingly harsh competition for employment among new and earlier graduates.
In such cases, the "iron rice bowl" jobs supplied by the educational institutions have become the focal point of interest for college students. Because of the higher salaries offered by state-owned enterprises and international corporations, these companies have been popular among college students in recent years. Insurance, banking, securities, and other businesses have higher mobility and a scarcity of human resources, but they require a high level of professional knowledge and work experience, which makes such positions difficult to obtain for a person who has no experience and is about to start a professional career. Because small private enterprises offer modest salaries and fewer opportunities for growth, they are not regarded highly by the colleges and are therefore not the first choice for college students who are looking for jobs. This situation of "not being able to get a high-paying job, but not joining a low-paying job" also causes the employment guidance work of contemporary college students to become clogged, which represents the current situation of difficult employment for college students in the USA [12].

Humanistic and Ideological Considerations.
Although social civilization continues to advance, the ideological paradigm of "all things are inferior, but only reading is superior," promoted by traditional exam-oriented schooling, continues to have a significant impact. Furthermore, the majority of current college students are only children who have been spoiled by their parents, leading them to believe that, even if they do not put forth any effort, they will be able to rely on their family's money and connections to secure employment. Under the market economy, the pressure and hazards that individuals bear in the process of social development are becoming increasingly intense. Consequently, the nature of the work, the workplace, the remuneration, and the opportunity for personal growth become crucial considerations for college students when deciding on a career. However, when choosing a job, many college students fail to assess their own abilities and become overly selective. This leads to unemployment, and most of them fail to attain the position they desire within a short period of time. In addition, because the number of graduates is increasing every year, this further increases the difficulty of finding employment for college students in the talent marketplace [13][14][15][16][17][18].
College students' entrepreneurship has become an integral part of national entrepreneurial activities in the context of innovation and entrepreneurship. Therefore, it is critical to conduct in-depth research on the assessment of innovation and entrepreneurship ability, in order to assist students in making reasonable assessments of their own abilities and to provide data support for innovation and entrepreneurship education in colleges and universities. The assessment of innovation and entrepreneurship ability has been studied previously by various foreign scholars, who hold that this ability encompasses the corresponding psychological ability. They proposed a personal entrepreneurial suitability model, which contains established factors and quality factors of excellent entrepreneurs, from which the elements of innovative entrepreneurial ability are screened out (including conceptual ability, opportunism, and risk-taking). The evaluation structure of innovative entrepreneurial ability is then developed using the two methods of self-efficacy and self-assessment, after which the creative entrepreneurial capacity of entrepreneurs is assessed. Domestic research on innovation and entrepreneurial ability has also made significant advances. This includes investigating the current state of college students' entrepreneurial ability, analyzing the connotation of innovation and entrepreneurial ability, screening its elements, linking college students' majors with entrepreneurial ability, and establishing a multilayer index system to measure entrepreneurial ability. The results of the assessment are then verified using a multidimensional expandable metamodel [19].
A data mining-based innovation and entrepreneurial ability assessment system for college students is proposed on the basis of the aforementioned notion. When statistical and intelligent retrieval methods are used to search for hidden information in a large amount of data, data mining plays a significant part by considering the relationships among the factors of innovation and entrepreneurship ability in a comprehensive manner, resulting in a more accurate measurement system [20,21].
Students' struggle to find work after graduation is a serious problem, which has drawn the attention of people from all backgrounds and walks of life during the past several years. Based on data from social survey research, the total number of college graduates is predicted to reach 10 million by 2025. Considering such a large number of graduates, colleges and universities need to provide employment guidance and a large amount of employment information for college students. With the official publication of a guidance document on the precise graduation requirements for college students by the Chinese Ministry of Education, the employment guidance work for college students is facing a new and unprecedented challenge. The function of data mining technology in college students' career guidance is to collect, integrate, and process all of the data, and to grasp the objective laws that can support career guidance work. Further, data mining is used to provide accurate, standardized, and personalized services by combining the basic characteristics of students at different stages, to improve the overall quality of college students' career guidance work, and to assist students in better understanding the current employment environment [22][23][24]. In this way, it can help increase the employment rate and ease the current employment problems of college students.
The main contributions of this study are as follows:
(i) An innovation and entrepreneurship ability measurement scale is developed to identify the strengths and weaknesses of each ability element, and the weights of each ability element are derived through case analysis to quantitatively evaluate the factors that influence innovation and entrepreneurship ability.
(ii) The results of the assessment are then verified using a multidimensional expandable metamodel. A data mining-based innovation and entrepreneurial ability assessment system for college students is proposed.
(iii) This study proposes a data mining-based algorithm for collecting entrepreneurial employment education resources in colleges and universities. The experimental results show that the proposed algorithm can make full use of entrepreneurial employment resources inside and outside colleges and universities, with a good collection effect and high cost performance.
(iv) The proposed model is also compared with the CLR model in terms of user satisfaction, recommendation coverage, recommendation accuracy, and system resource consumption. The proposed data mining-based algorithm performed considerably better than the CLR model on these performance measures.
The remainder of the paper is organized as follows: Section 2 presents the algorithm structure and processing flow, Section 3 describes the proposed methodology, and Section 4 presents the experimental results and their analysis. Finally, Section 5 concludes the overall research work.

Algorithm Structure and Processing Flow
A well-organized, well-optimized algorithm that fits the problem is necessary for accomplishing a particular task. In this study, a data mining-based entrepreneurship employment education resource collection algorithm is proposed. The basic processing units of the proposed resource collection algorithm are (1) the online college entrepreneurship employment education resource demand collection unit, which collects demand information for entrepreneurship employment education resources through the APIs provided by online entrepreneurship employment tools such as Zhi Lian Recruitment and BOSS Direct Recruitment; (2) the entrepreneurship employment education resource collection unit, which uses data mining technology; (3) the resource processing and discovery drive unit, which processes the mined resources for entrepreneurship employment and regularly drives the crawler program to carry out discovery work; and (4) the online entrepreneurship employment education resource recommendation unit, which publishes the available resources for entrepreneurship employment education to those who need them, such as lecturers, students, and other personnel on campus. The following steps describe the basic processing units of the proposed resource collection algorithm in more detail.
Step 1: The online college entrepreneurship and employment education resource demand collection unit uses the API functions provided by online entrepreneurship and employment tools such as Zhi Lian Recruitment and BOSS Direct Recruitment to traverse the university's and other accessible online employment websites; downloads the employment job information (ppt or pptx files), innovation and entrepreneurship outlines, and innovation and entrepreneurship implementation plans (Word files) from them; and extracts keywords from them with the POI file parsing tool. The extracted keywords are fused with the professional information of college students into the demand vector of college entrepreneurship and employment education resources, which is stored in the database for backup.
Step 2: The resource collection unit collects the available resources of each website through the web crawler program, filters the collected resources, and generates a list of commonly used resources and a list of forbidden resources.

Mobile Information Systems
The resources in the list of commonly used resources can be checked at high frequency, so that they are updated at any time and the latest resources are provided to teachers, students, and employees. The list of forbidden resources stores the resources that are not related to entrepreneurship and employment, and the collection unit does not access the resources in it.
Step 3: The resource processing and discovery drive unit, on the one hand, matches the collected resource features with the online teaching demand vector and constantly updates the recommended list of resources for teachers and students. On the other hand, it continuously processes the resources in the online university entrepreneurship and employment education resource database, combines the application feedback and the latest demands of teachers and students, discovers the resources with high or potential demand, submits them to the resource collection unit as the basis for discovery, and drives that unit to carry out the next round of resource collection.
Step 4: The online college entrepreneurship and employment education resource recommendation unit sends the recommended list of resources to teachers and students before or after class, based on the time list in the online teaching plan, through the online class API or a teaching support system (such as email, a WeChat group, or a QQ group), or sends it directly to the online teachers, who control the release time of the recommended resources.
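The list handling in Step 2 can be illustrated with a minimal sketch. It assumes that relevance is decided by counting demand keywords in the page text, which is a stand-in chosen for illustration; the paper does not specify the filter. The names `ResourceLists`, `filter_resources`, and `min_hits` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ResourceLists:
    """The two lists kept by the resource collection unit (Step 2)."""
    common: set = field(default_factory=set)     # re-checked at high frequency
    forbidden: set = field(default_factory=set)  # not accessed again


def filter_resources(crawled, keywords, lists, min_hits=1):
    """Sort crawled pages into the common or the forbidden list.

    crawled  -- dict mapping a URL to its page text (assumed input shape)
    keywords -- demand keywords, e.g. those extracted in Step 1
    Keyword counting is an assumed relevance test, not the paper's filter.
    """
    accepted = []
    lowered = [kw.lower() for kw in keywords]
    for url, text in crawled.items():
        if url in lists.forbidden:
            continue  # forbidden resources are skipped by the collector
        body = text.lower()
        hits = sum(1 for kw in lowered if kw in body)
        if hits >= min_hits:
            lists.common.add(url)
            accepted.append(url)
        else:
            lists.forbidden.add(url)
    return accepted
```

A page that fails the test once is never fetched again in this sketch; a real deployment would probably let forbidden entries expire.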

Proposed Methodology
This section presents the approach adopted in this study. As mentioned in the previous section, the two most important links in the data mining-based resource collection algorithm for entrepreneurship and employment education in colleges and universities are driving and recommendation, both of which run in the resource processing and discovery drive unit. The driving link mainly discovers the high-demand and potential-demand resources and their characteristics, so as to improve the efficiency and quality of the next resource collection. The recommendation link matches the collected resource features with the online teaching requirements, in order to generate a list of resources for teachers and students. The main data structure of the demand vector is a six-tuple (storage identifier, subject identifier, demand feature set, key feature subtable, application environment set, and demander set). In this data structure, the key feature subtable stores the key features of resources and their positions in the resource management system, which gives a more accurate portrayal of the resources. The format of the vector of university entrepreneurship and employment education resources is similar to that of the demand vector and can be expressed as a six-tuple of <storage identifier, subject identifier, resource feature set, key feature subset, contextual environment set, pre-defined demander set>. Based on this data structure, these vector sets can be regarded as an online education demand feature mapping, and the related problems can be transformed into submap processing problems [25].
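The two six-tuples described above can be written down as plain records. The sketch below is illustrative only: the field names are translations of the tuple elements, and the Jaccard overlap used to compare feature sets is an assumed similarity measure, not the matching index defined in the paper.

```python
from dataclasses import dataclass, field


@dataclass
class DemandVector:
    """Six-tuple describing one teaching resource demand."""
    storage_id: str
    subject_id: str
    demand_features: set
    key_features: dict = field(default_factory=dict)  # feature -> position
    application_environments: set = field(default_factory=set)
    demanders: set = field(default_factory=set)


@dataclass
class ResourceVector:
    """Six-tuple describing one collected education resource."""
    storage_id: str
    subject_id: str
    resource_features: set
    key_features: dict = field(default_factory=dict)  # feature -> position
    contextual_environments: set = field(default_factory=set)
    predefined_demanders: set = field(default_factory=set)


def feature_overlap(demand, resource):
    """Jaccard overlap of the two feature sets (assumed similarity measure)."""
    union = demand.demand_features | resource.resource_features
    if not union:
        return 0.0
    shared = demand.demand_features & resource.resource_features
    return len(shared) / len(union)
```

Keeping the demand and resource records in the same shape is what allows a single comparison function to serve both the driving and the recommendation links.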

Online Teaching Resource Collection Drive Subalgorithm.
The main goal of the driving subalgorithm is to discover the high-demand and potential-demand resources and their characteristics, so as to improve the efficiency and quality of resource acquisition. With reference to the fuzzy matching theory in mapping, this study designs a fast driving matching index, whose quantities are defined as follows. $\Omega$ is the $C$-dimensional teaching demand vector space composed of teaching resource demands, and $X = [x_1, x_2, \ldots, x_C]$ is a vector of mutually independent online teaching demand components in $\Omega$, which can be considered a subgraph of the demand graph. $F(X)$ is the demand heat assessment of $X$, measured through elements 5 and 6 of the six-tuple described above, and $E[F(X)]$ is the driving matching index of this online teaching resource demand. $f(X; u)$ and $g(X; v)$ are, respectively, the demand heat rate function on the online teaching demand space and the correction rate function, where $u = [u_1, u_2, \ldots, u_C]$ and $v = [v_1, v_2, \ldots, v_C]$ are the $C$-dimensional parameter vectors of the two functions and correspond to reconstructed subgraphs of the whole demand graph. When driving with $f(\cdot)$, a component $x_j$ whose parameter $u_j$ is close to 0 indicates that the demand deviates from the current demand trend and should not be used as a basis for resource mining; conversely, a component whose parameter is close to 1 is suitable as a basis for resource mining. To give managers a corrective role, administrators can set $g(\cdot)$ manually to correct the process and drive the matching subgraph. The quality of the match can be measured by two indicators: $\hat{E}[F(X)]$, the approximation of the driving matching index $E[F(X)]$, and $\mathrm{Var}(\hat{E}[F(X)])$, the variance of that approximation.
The importance of correction for the matching process has been argued in the literature: the quality and efficiency of driving matching are closely related to the effectiveness of the correction rate function. Our model therefore performs a stepwise refinement with the reconstructed subgraph $g(X; v_{opt})$, written $g_{opt}(X)$, which is determined by the optimal parameter vector $v_{opt}$. The refined function $g'(X)$ is considered optimal when it approximates $g_{opt}(X)$ (the heat of demand), that is, when the absolute value of the difference between the two is minimized. To improve the approximation speed of subgraph matching, this study designs an average shortest distance approximation method, as shown in Equation (4). In Equation (5), $v^{(k)}_j$ $(j = 1, 2, \ldots, C)$ is the $j$-th component of $v^{(k)}$ obtained at the $k$-th cycle, and $x_{ij}$ is the $j$-th component of the $i$-th six-tuple obtained from $g'(X; v^{(k)})$, based on the stepwise refinement of Equation (5). The list of subgraph matching parameters $v_{opt} = [v_{opt-1}, v_{opt-2}, \ldots, v_{opt-C}]$ yields the best $g_{opt}(X)$, which is combined with Equation (1) to obtain the set of the most pressing demands for driving matching. In practice, it was found that, although subgraph matching achieved higher matching precision, the subgraph error vector required a number of manual interventions, which is a time-consuming task.
Therefore, our model introduces a time-efficient method to obtain the most timely demand vectors from the set, using the subgraph error vectors of Equation (5), where $\hat{E}[F(X)]$ is still the approximation of the driving matching index $E[F(X)]$. If the value of $v^{(k-1)}_j$ is small, the relevant demand vector has not changed significantly in the time period; conversely, a large value indicates that the subgraph content of the demand vector has changed in the time period, and the demand state approximation is updated accordingly, so as to ensure the timeliness of the resource demand.
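The roles of $\hat{E}[F(X)]$ and $\mathrm{Var}(\hat{E}[F(X)])$ can be illustrated numerically. The sketch below assumes that the approximation is a sample mean of $F(X_i)$ weighted by the ratio $f(X_i; u)/g(X_i; v)$, in the style of a standard importance-weighted estimate; the ratio appears later in the text as $w(X_i)$, but the sample-mean form is an assumption, since the paper's displayed equations are not available here. The function name and arguments are hypothetical.

```python
import statistics


def weighted_estimate(samples, F, f, g):
    """Importance-weighted sample mean and the variance of that mean.

    samples -- demand vectors (or scalars) assumed to be drawn according to g
    F       -- demand heat assessment of one sample
    f, g    -- heat rate function and correction rate function
    Assumed form: mean of F(x) * f(x) / g(x) over the samples.
    """
    terms = [F(x) * f(x) / g(x) for x in samples]
    estimate = statistics.fmean(terms)
    if len(terms) > 1:
        # Variance of a sample mean: sample variance divided by n.
        variance = statistics.variance(terms) / len(terms)
    else:
        variance = float("nan")
    return estimate, variance
```

When `f` and `g` coincide the weights are all 1 and the estimate reduces to the plain sample mean of `F`, which is a convenient sanity check on any manually set correction function.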

Online Teaching Resources Quick Recommendation Subalgorithm.
The quick recommendation subalgorithm assumes that $K$ resources are available on campus. In its basic recommendation state, $g_j(X; v_j)$ $(j = 1, 2, \ldots, K)$ is the stepwise refinement operation for the $j$-th available resource and $\pi_j$ is the weight of $g_j(X; v_j)$. When a demand enters the space for matching, the set of $K$ resources can be expressed as $Z = [Z_1, Z_2, \ldots, Z_K]$ with matching degree weights $\pi = [\pi_1, \pi_2, \ldots, \pi_K]$, where $\pi_j = p(Z_j = 1)$ can be interpreted as the probability that resource $j$ enters the current demand recommendation list, with $\sum_{j=1}^{K} \pi_j = 1$ and $0 \le \pi_j \le 1$. The subgraph algorithm is then used to describe the weight distribution over the whole space of $Z$, and the demand $X$ seeks its highest matching position in the whole resource mapping space $Z$. The recommended resource subgraph of this demand is obtained by combining $X$ with $Z$ and substituting Equation (8) and Equation (9), which gives Equation (10). However, too many subgraphs lead to a sharp increase in the time consumed by recommendation, so the recommendations need further improvement. This is done as follows: $g_{mix}(X)$ is used as the initial recommendation list (the initial recommendation resource subgraph), and the recommendation items $v$ are first filtered based on the demand. With the recommendation metrics $\{\pi_1, \pi_2, \ldots, \pi_K\}$ corresponding to $\{v_1, \ldots, v_K\}$, and referring to Equation (7) and Equation (4), the improvement operation is obtained as the arg min in Equation (11), where $X_i$ $(i = 1, 2, \ldots, N)$ are the recommended feasible resources in $g_{mix}(X)$, and the weight associated with $v_1, v_2, \ldots, v_K$ in $g_{mix}(X)$ can be expressed as $w(X_i) = f(X_i; u) / g_{mix}(X_i)$.
Further, the filtering process of $v_j$ for the $j$-th feasible resource to be refreshed is expressed in terms of $\gamma_i(Z_i)$. Equation (12) can be substituted into Equation (13) to generate a further refinement operation, and Equation (14) can then be put into Equation (15). Based on Equation (14), the corresponding weight $\pi_j$ of the $j$-th $(j = 1, 2, \ldots, K)$ feasible resource is treated by first transforming it into a differential expression.
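The constraints on the matching degree weights ($\sum_j \pi_j = 1$, $0 \le \pi_j \le 1$) and their use for building a short recommendation list can be sketched as follows. This is a simplified illustration that assumes each resource already has a non-negative raw matching score; it does not reproduce the paper's refinement equations. The names `normalise_weights`, `recommend`, and `top_n` are hypothetical.

```python
def normalise_weights(scores):
    """Turn non-negative raw matching scores into weights that sum to 1."""
    if any(value < 0 for value in scores.values()):
        raise ValueError("matching scores must be non-negative")
    total = sum(scores.values())
    if total <= 0:
        raise ValueError("at least one score must be positive")
    return {resource: value / total for resource, value in scores.items()}


def recommend(scores, top_n):
    """Return the top_n resources by weight, highest weight first."""
    weights = normalise_weights(scores)
    ranked = sorted(weights.items(), key=lambda item: item[1], reverse=True)
    return ranked[:top_n]
```

Truncating the ranked list to `top_n` entries is one simple way to keep the number of candidate subgraphs, and hence the recommendation time, bounded.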

Experiments and Results Analysis
To determine the efficiency of the model presented in this study, a performance comparison experiment was carried out. Between February and June 2020, a university in southwest China collected the entrepreneurial employment education resources and other related data. The data interface for these resources was provided by Zhi Lian Recruitment and BOSS Direct Recruitment (including entrepreneurial employment job listings, salary levels, and related company profiles), as well as by the CLR model, which is an open-source education resource provided by the University of California at Berkeley (UCB). The experimental environment contains two groups of servers with the same configuration (two Lenovo SR501 servers with dual Xeon CPUs, 16 GB of DDR4 memory, and an 8 TB hard disk). The servers exchange data directly over optical fiber and use a Windows Server operating system, the NTFS file system, and an SQL database to provide the basic software services for the two models; the same configuration is used in the simulation environment. The design described above and the library service system have the same configuration settings, so that the experiment is applicable to a wide range of situations. To ensure that the experiments are fair, they are conducted in a parallel one-to-one fashion.
The relevant application experiment process is as follows: Step 1: The online college entrepreneurial employment education resource demand collection unit traversed the entrepreneurial employment information, collected for the first time, through the API functions provided by Zhi Lian Recruitment and BOSS Direct Recruitment. The entrepreneurial employment list, salary levels, and related company descriptions were downloaded, and the online entrepreneurial employment education resource demand vector was generated and stored in the database for backup. Since these entrepreneurship employment education resources were scanned for the first time, the relevant information was not yet complete, and the contents of the demand vector were also limited.
Step 2: The resource collection unit collects the available resources from the relevant websites through a web crawler program.
Step 3: The resource processing and discovery driver unit performed the matching operations in the resource feature space based on the online entrepreneurship and employment education resource demand vector.
Step 4: The resource processing and discovery driver unit continuously tracks the changes in the collected entrepreneurship employment education resources; discovers the potential needs of teachers and students from the updated entrepreneurship employment list, salary levels, and related company profiles; updates the resource recommendation list on this basis; and subsequently recommends employment-related skills, job instructions, and other books or related materials to the teachers and students.
The final experimental results are shown in Table 1.
The model proposed in this paper is based on demand mining of online entrepreneurship education resources, so subjective indicators such as user satisfaction are high, and on the two indicators "timely recommendation" and "timely update" it greatly exceeds the CLR model, reflecting the efficient and fast characteristics of the proposed model. Figure 1 shows a graphical comparison of the proposed model with the CLR model.
Secondly, in terms of the recommendation effectiveness of the models, 10 majors (financial mathematics, economics, insurance, accounting, computer science, applied mathematics, basic mathematics, control engineering, statistics, and international affairs) were selected for the experimental analysis. Figure 2 shows the recommendation coverage of the two models, measured as the proportion of the actually accepted recommended resources to the actually used resources. As the search scope of the experiment is on-campus, the demand and resources are more concentrated, so the coverage of both models is high (somewhat higher than in applications of off-campus resource recommendation). The experimental data for the 10 majors above show that the recommendation coverage of the proposed model is better than that of the CLR model, reflecting higher recommendation effectiveness. Figure 3 shows a comparison of the recommendation accuracies (the proportion of the actually accepted recommended resources in the overall recommended resources) of the two models.
Figure 3 shows that the recommendation accuracy of the proposed model was considerably better than that of the CLR model for all ten majors.
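The two measures used in Figures 2 and 3 follow directly from their definitions in the text: coverage is the share of actually used resources that were accepted recommendations, and accuracy is the share of all recommendations that were accepted. The sketch below computes both from sets of resource identifiers; the function and argument names are hypothetical, and the sample data are invented for illustration only.

```python
def recommendation_metrics(recommended, accepted, used):
    """Compute recommendation coverage and accuracy from sets of resource ids.

    recommended -- resources the model put on the recommendation list
    accepted    -- resources that users actually accepted
    used        -- resources that users actually used (from any source)
    """
    accepted_recommended = recommended & accepted
    coverage = len(accepted_recommended) / len(used) if used else 0.0
    accuracy = len(accepted_recommended) / len(recommended) if recommended else 0.0
    return coverage, accuracy


# Invented example: 4 recommendations, 2 accepted, 3 resources used overall.
coverage, accuracy = recommendation_metrics(
    recommended={"a", "b", "c", "d"},
    accepted={"a", "b"},
    used={"a", "b", "e"},
)
```

In this example the coverage is 2/3 and the accuracy is 1/2, which shows that the two measures can differ noticeably for the same recommendation list.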
Finally, in terms of model resource consumption, the experiments tracked the system resources (CPU computing resources and memory resources) consumed by the two models during operation through IBM's monitoring tool. Figure 4 shows a comparison of the average system resource consumption of the two models, with the test system having an Intel Core i5 (7th generation) processor and 32 GB of DDR4 memory.
In short, the recommendation accuracy and coverage of the model proposed in this paper generally exceed those of the CLR model, the recommendation index of some entrepreneurship employment education resources is comparable to it, and the system consumes fewer resources and is more cost-effective, with good deployability and practicality.

Conclusion
Higher education institutions graduate a large number of students each year. After graduation, most college and university graduates fail to get a suitable job, which is a serious issue. Considering the shortage and unsystematic organization of entrepreneurial employment education resources in colleges and universities, this study proposes a data mining-based algorithm for collecting entrepreneurial employment education resources in colleges and universities. The experimental results show that the algorithm can make full use of entrepreneurial employment resources inside and outside colleges and universities, with a good collection effect and high cost performance. The performance of the proposed model is compared with the CLR model in terms of user satisfaction, recommendation coverage, recommendation accuracy, and system resource consumption. The proposed data mining-based algorithm performed considerably better than the CLR model on these performance measures. In the future, we will conduct in-depth research on important issues related to the proposed model, such as moderate student demand sampling, interuniversity alliance application, and interdisciplinary innovation and entrepreneurship resource recommendation, to further expand the scope of application of the model and to improve the adequacy of its collected information and the richness of its recommended content.

Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest
The author declares that he has no conflict of interest.