Influence Evaluation Model of Microblog User Based on Gaussian Bayesian Derivative Classifier

College of Finance & Information, Ningbo University of Finance & Economics, Ningbo 315175, Zhejiang, China
College of Digital Technology and Engineering, Ningbo University of Finance & Economics, Ningbo 315175, Zhejiang, China
School of Information & Technology, Zhejiang Fashion Institute of Technology, Ningbo 315211, China
Academic College, Ningbo University of Finance & Economics, Ningbo 315175, Zhejiang, China
Affiliated Hospital of Medical School of Ningbo University, Ningbo 315200, Zhejiang, China


Introduction
In recent years, social networks have gradually penetrated all aspects of people's lives and have become the Internet service with the widest user coverage, the fastest spread of influence, the highest commercial value, and great development potential. Through them, more and more people build online social networks with other users to share their daily lives and comment on current hot events. These networks contain a large amount of data with complex structure and rich application scenarios, which provides a new perspective for the study of social life that is not available in traditional sociology. In order to better analyze and utilize social network data, extract effective popular events from massive data, and correctly track key popular events, data mining technologies for social networks keep emerging. Detecting and tracking popular events in social networks is a key point of network security and has become a research hotspot of social network data mining. Popular event detection and tracking technology in social networks combines traditional information retrieval technology with Internet technology. By analyzing the huge user group and the large amount of real-time data of social networks, we can detect popular events, track them, and observe their dissemination and evolution. For example, when an earthquake occurs, a large amount of disaster information is released to social networks. By analyzing earthquakes and other crisis events in social networks, government departments can better predict the impact of disasters and provide the best services to the public during natural disasters. At the same time, the discussion on microblogs evolves with the development of hot events. Tracking these evolving hot events will help government departments make correct decisions and adjust plans in time.
However, a large amount of event information is generated on social networks every day, which not only greatly facilitates and enriches people's lives but also brings great challenges to the detection and tracking of popular events. Problems such as the overload of event information on social networks and the difficulty of identifying and tracking popular events in a timely and accurate manner continue to emerge. In addition, due to the high complexity of popular events and the huge capacity of social networks, the accuracy and efficiency of event detection and tracking in social networks have not reached a satisfactory level. Therefore, how to efficiently and accurately detect and track these complex hot events has become an important research topic, and research on event detection in social networks can, to a certain extent, overcome some of the problems faced by hot event detection and tracking. The emergence of the Internet has changed people's production and lifestyle in an all-round way, and public opinion has moved from offline to online and evolved into network public opinion. As a new social network platform with both sharing and communication functions, the microblog has swept the Internet since its emergence. Microblog platforms represented by Twitter, Sina microblog, and Tencent microblog have become the main venue for the communication of network public opinion. Netizens have set up personal communities on microblogs, built online social networks, shared their personal status in real time by publishing combinations of pictures, videos, emoticons, and text, and expressed their attitudes and views on hot topics and public events.
Event detection in social networks refers to the discovery of popular events in the microblog text published or forwarded by social network users. With the introduction of topic clustering technology, event detection can represent popular events in social networks semantically, so that research on popular events in social networks moves from qualitative analysis to computable quantitative analysis. Therefore, research on event detection for social networks has become very popular in recent years, and many researchers take part in it. At the same time, social networks have an extremely wide range of user participation: users can establish their own social networks by following, forwarding, and replying, and users in a social network can receive all the information published by their associated users and continue to spread it. It is precisely because of the huge user base, rapid information flow, and huge information coverage of social networks that many enterprises have begun to use them as a platform for product promotion, hoping to spread product information quickly through the relationship network between users. However, the amount of user information in social networks is very large, and user preferences and types of information change greatly. It is difficult for enterprises to find the most suitable users in the massive data and to obtain the maximum benefit at the lowest cost. Research on event propagation in social networks provides a more effective means of solving this problem.
Event propagation in social networks refers to the process in which, after a user publishes microblogs, the social network platform automatically pushes them to the user's neighbors. These neighbors may in turn push the microblogs to their own neighbors, so that the microblogs spread to more users. Event propagation can therefore help enterprises spread product information quickly through the relationship network between users, and research on event propagation for social networks has become very popular in recent years. In addition, the popularity and rapid development of social networks have brought fundamental changes to information and communication technology and have changed people's lifestyles and modes of interpersonal communication. Since the emergence of social networks, a large amount of user-generated content has also appeared, which makes it difficult to distinguish old and new hot events efficiently and accurately and to discover how the views and interests of users in hot events evolve, among many other problems. Therefore, researchers have introduced the study of event evolution in social networks to solve this kind of problem.
Event evolution in social networks refers to the spread and diffusion process of popular events in social networks, which is usually related to the mode of transmission of popular events. At the same time, social networks are the virtual presentation of user relationship networks in the real world. Therefore, studying the evolution of popular events in social networks can help us find social phenomena or hidden social problems that are difficult to detect in the real world, which brings new opportunities for the detection and tracking of popular events in social networks. It can be seen that the detection, propagation, and evolution models of popular events based on social networks provide a new perspective and methodology for the detection and tracking of popular events. Event detection, event propagation, and event evolution are the key components of popular event detection and tracking technology in social networks, and the quality of the corresponding models is closely related to the overall performance of that technology.
As an important channel for promoting information, the microblog has a huge impact on social life [1][2][3][4][5]. Users are the foundation of microblog relationships, so the microblog user network has developed into a common social network. The greater the influence of a microblog user, the greater that user's role in information dissemination [6][7][8][9]. On the one hand, the microblog provides convenience for people's daily lives; on the other hand, it also has an adverse impact on society. For example, rumors spread through microblogs, and both their scope of influence and their spreading speed are huge.
Therefore, scholars at home and abroad have conducted a great deal of research on the influence of microblog users. Jiaxin et al. [10] proposed a method that analyzes and measures the social influence of users by predicting their ability to spread information, based on how social influence is disseminated in microblogs and in combination with the social network structure and user behavior factors, and obtained better influence estimation results. Yaqi et al. [11] proposed an evolution model of the microblog user relationship network based on the internal characteristics of that network and on mean field theory, and analyzed in depth the dynamics of rumor dissemination on the model and its topological statistical characteristics. The results show that the microblog user relationship network is scale-free and that the degree distribution exponent is related not only to the probability of reverse connection but also to the distribution of node attraction degree. In order to depict microblog user relationships effectively, Zhiming et al. [12] treated the microblog social network as a weighted undirected graph and gave a method for calculating user similarity based on various user attribute information. The experimental results show that user similarity based on social information adapts well to user relationship analysis and group mining. On the basis of the K2 algorithm, Haoran et al. [13] proposed a new Bayesian structure learning algorithm that builds the maximum spanning tree and obtains the maximum number of parent nodes by calculating mutual information, searches the maximum spanning tree with an ant colony algorithm to obtain the node order, and finally obtains the optimal Bayesian network structure with the K2 algorithm. The experimental results show that this method removes the K2 algorithm's dependence on prior knowledge and simplifies the search mechanism. Guofeng et al. [14] proposed a method for calculating the influence of microblog users that takes into account the intersectionality of microblog fields. This method identifies the field of a microblog from the similarity between the user's tweets and the field itself, calculates the influence of users in each field from the user attributes, and thus determines the influence of microblog users. Hao et al. [15] addressed the problem of quantifying user influence and proposed a method that quantifies it based on the scope of information dissemination. Experiments on real data sets show that, compared with other measurement methods, this method is suitable for environments where data sets and time periods need to be limited and has lower computational complexity. Shaowu et al. [16] established methods for measuring microblog influence, behavior influence, and activeness influence based on traditional influence measurement indexes and in combination with user activeness, microblog value, and the influence diffusion of message dissemination, and proposed an influence measurement model of microblog users based on these three new measures.
However, current research has not fully considered the intersectionality of microblog fields, and many studies are not based on the microblog user relationship network, which greatly reduces the practicality and reference value of their results. Therefore, by combining the Gaussian Bayesian Derivative Classifier [17][18][19][20], this paper proposes a model to evaluate the influence of microblog users. The paper first proposes indexes that depict the influence of microblog users, and then gives a method for solving the model with the Gaussian Bayesian Derivative Classifier based on the characteristics of the relationships between microblog users and the behavior characteristics of the users themselves. Finally, the key factors affecting the evaluation model are studied in depth through simulation experiments. The structure of this paper is as follows: Section 1 describes the research status of microblog users' influence; Section 2 presents the evaluation indexes and model of user influence; Section 3 gives the solution based on the Gaussian Bayesian Derivative Classifier; Section 4 reports the simulation experiments; Section 5 summarizes the whole paper.

Influence Evaluation Model of Microblog User
As a convenient social platform, the microblog plays an important role in information dissemination. There are many active users on microblogs, information can be released in many ways, and information spreads rapidly and widely. These features make information on microblogs hard to manage and make it extremely easy for public opinion to form on the Internet. This paper therefore proposes a system for evaluating the influence of microblog users, which calculates the influence index of users from actual data in order to facilitate the management of the microblog social platform. User influence is an index that measures the dissemination capacity of microblog users: the greater the user's influence, the greater the dissemination capacity and the impact on individuals and even on society. The factors behind user influence include the relationship between the user's followers and fans and the various behaviors of the user. These factors directly determine the user's influence, and the influence evaluation system is built by analyzing them. This paper takes the coverage rate H of the actual influence person-time of information dissemination, the user's activity degree, and the connection degree as the evaluation indexes of microblog influence. The coverage rate involves the numbers of fans and followers; the activity degree involves the number of the user's original tweets, the frequency of retweeting and commenting, and the number of private messages; and the dissemination capacity involves the frequencies with which original tweets are forwarded, read, collected, liked, and commented on.
The specific algorithm for evaluating the influence of microblog users is as follows:
(1) The data are initialized. N users with higher activity degree and greater connection degree are randomly selected from all microblog users, and their microblog data are collected. The collected data include the numbers of original tweets, fans, and followers, and the frequencies of being retweeted and retweeting, being commented on and commenting, and being collected and collecting.
(2) The user's activity degree is calculated. The activity degree affects the user's influence to some extent. There are many silent fans on microblogs who follow some users but do not help disseminate information, so they contribute nothing to the user's influence. Silent fans are therefore removed on the basis of the users' activity degrees. The activity degree is calculated as in equation (1), where T(i) is the activity degree of user i, N is the number of original microblogs of user i, F is the number of fans of user i, L is the number of followers of user i, ω1, ω2, ω3 are the weights of the corresponding influence factors, and t is the length of the time period.
Mathematical Problems in Engineering
(3) The user's connection degree is calculated. Microblog users are connected to one another, and the higher the connection degree, the greater the information dissemination capacity, which also enhances the user's influence. The connection degree between users is calculated in a similar way to the activity degree, as in equation (2), where T(i, j) is the connection degree between user i and user j, U is the number of tweets of user j forwarded by user i, V is the number of tweets of user j commented on by user i, S is the number of tweets of user j collected by user i, λ1, λ2, λ3 are the weights of the corresponding influence factors, and t is the length of the time period.
(4) The user's coverage degree is calculated. The coverage degree is the number of active microblog fans of user i. It has a great effect on the dissemination capacity of a microblog and thus indirectly improves the influence of microblog users. It is calculated as in equation (3), where R, V, and M are the groups of users covered when counting the retweeting, commenting, and following of the user, and N is the number of nodes in the network.
(5) The influence of the microblog user is calculated. The influence model proposed in this paper is composed of three indexes, namely, the user's activity, connection, and coverage degrees. It is calculated as in equation (4), where T(i) is the user's activity degree, T(i, j) is the connection degree between user i and user j, H(i) is the user's coverage degree, and α, β, and γ are the weights of the three influence factors.
(6) Steps (2) to (5) are repeated until the influence of all users has been calculated; the algorithm then proceeds to step (7).
(7) The users are sorted by influence. Sorting makes the users with high influence easy to identify.
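The steps above can be sketched in Python as follows. The exact forms of the formulas for activity, connection, coverage, and influence are not available in this text, so the weighted sums below, the default weights, the 30-day period, the summation of connection degrees over neighbours, and the names `UserStats`, `activity`, `connection`, `coverage`, `influence`, and `rank_users` are all illustrative assumptions, not the authors' definitions.

```python
from dataclasses import dataclass


@dataclass
class UserStats:
    """Counts collected for one user over a time period of length t."""
    originals: int   # N: number of original microblogs
    fans: int        # F: number of fans
    followers: int   # L: number of followers
    covered: set     # ids of users reached via retweets, comments, follows


def activity(u, w=(0.5, 0.3, 0.2), t=30.0):
    # ASSUMED form: weighted sum of N, F, L per unit of time.
    return (w[0] * u.originals + w[1] * u.fans + w[2] * u.followers) / t


def connection(forwarded, commented, collected, lam=(0.5, 0.3, 0.2), t=30.0):
    # ASSUMED form: weighted sum of U, V, S per unit of time.
    return (lam[0] * forwarded + lam[1] * commented + lam[2] * collected) / t


def coverage(u, n_nodes):
    # ASSUMED form: share of the network's nodes reached by the user.
    return len(u.covered) / n_nodes


def influence(u, links, n_nodes, alpha=0.4, beta=0.3, gamma=0.3):
    # links: list of (U, V, S) tuples, one per neighbour j of the user.
    total_connection = sum(connection(*uvs) for uvs in links)
    return (alpha * activity(u)
            + beta * total_connection
            + gamma * coverage(u, n_nodes))


def rank_users(users, links, n_nodes):
    # Steps (6) and (7): score every user, then sort by descending influence.
    scores = {name: influence(u, links.get(name, []), n_nodes)
              for name, u in users.items()}
    return sorted(scores, key=scores.get, reverse=True)
```

With this layout, changing the weights α, β, γ only requires passing different keyword arguments to `influence`, which makes it easy to test how sensitive the ranking is to each index.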

Solution Method Based on Gaussian Bayesian Derivative Classifier
In order to reduce the impact of zombie fans and spam microblogs on the evaluation results, this paper proposes a model to evaluate the influence of microblog users based on the Gaussian Bayesian Derivative Classifier. The model considers both the relationship characteristics between microblog users and the behavior characteristics of each user, and a Naive Bayesian Classifier with continuous attributes is established to identify zombie fans, which improves classification efficiency and reliability to a certain extent. The specific algorithm is as follows:
Step 1. The data are initialized. N users with higher activity degree and greater connection degree are randomly selected from all microblog users, and their microblog data are collected. The collected data include the numbers of original tweets, fans, and followers, and the frequencies of being retweeted and retweeting, being commented on and commenting, and being collected and collecting.
Step 2. Let X1, ..., Xn and C be the continuous microblog attributes (the numbers of original tweets, fans, and followers, the frequencies of being retweeted and retweeting, being commented on and commenting, being collected and collecting, etc.) and the category, respectively, and let x1, ..., xn and c be their values. D is the selected data set with N records, whose data are generated randomly from the mixed distribution P, and xij (1 ≤ i ≤ n, 1 ≤ j ≤ N) and cj are the jth recorded observations of Xi and C in data set D.
Step 3. The users are classified by the Gaussian Bayesian Derivative Classifier to remove zombie users. Assume that G1 and G2 are two k-dimensional populations, in which G1 is the population of real users and G2 is the population of zombie users, with distribution densities p1(u) and p2(u), respectively. Let the user group be u = (u1, u2, ..., um), where the probability that a user comes from G1 is q1 and the probability that it comes from G2 is 1 − q1. The k-dimensional space R^k is divided into two regions (R1, R2) satisfying condition (5). If u ∈ R1, the user is judged to come from the real user group G1; if u ∈ R2, the user is judged to come from the zombie user group G2. No classification method can reach a 100% accuracy rate. Under this classification, the probability that a user from the real user group G1 is misjudged as belonging to the zombie user group G2 is calculated as per (6), and the probability that a user from the zombie user group G2 is misjudged as belonging to the real user group G1 is calculated as per (7). Then, the average classification error f(R1, R2) of the classification results is calculated as in equation (8).
Step 4. The similarity between the number of fans Fi of the selected user and the number of real fans Fi′ is measured as in equation (9), where F̄i is the average number of fans of a user, F̄i′ is the average number of real fans of a user, and S_Fi and S_Fi′ are the average numbers of fans and real fans of a user, respectively. In this way, the list of fans can be truly restored: zombie fans are removed from the user's list of fans, and the data are recalculated after the removal.
Step 5. The influence of each user is calculated as per equation (4).
Step 6. The users are sorted by influence from the largest to the smallest.
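The classification in Step 3 can be illustrated with a small sketch. It assumes an ordinary Gaussian naive Bayes classifier (attributes treated as independent, each with a per-class normal density) rather than the Gaussian Bayesian Derivative Classifier itself, whose exact construction is not given in this text. The function names, the variance floor, and the toy attribute rows (tweets per day, fans, followers) are hypothetical and serve only to show the mechanics.

```python
import math


def fit_gaussian_nb(rows, labels):
    """Estimate the prior, attribute means, and variances for each class."""
    model = {}
    for c in set(labels):
        members = [r for r, y in zip(rows, labels) if y == c]
        n = len(members)
        means = [sum(col) / n for col in zip(*members)]
        # A small variance floor avoids division by zero (an assumption).
        variances = [max(sum((x - m) ** 2 for x in col) / n, 1e-9)
                     for col, m in zip(zip(*members), means)]
        model[c] = (n / len(rows), means, variances)
    return model


def log_posterior(model, c, row):
    """Unnormalised log posterior: log prior plus Gaussian log densities."""
    prior, means, variances = model[c]
    score = math.log(prior)
    for x, m, v in zip(row, means, variances):
        score += -0.5 * math.log(2 * math.pi * v) - (x - m) ** 2 / (2 * v)
    return score


def classify(model, row):
    """Assign the row to the class with the larger log posterior."""
    return max(model, key=lambda c: log_posterior(model, c, row))


# Toy data: [tweets per day, fans, followers]; values are made up.
rows = [[3, 500, 200], [2, 300, 150], [4, 800, 300],
        [0, 5, 1500], [0, 10, 2000], [1, 2, 1800]]
labels = ["real", "real", "real", "zombie", "zombie", "zombie"]
model = fit_gaussian_nb(rows, labels)
print(classify(model, [3, 400, 180]), classify(model, [0, 8, 1700]))
```

Working in log space keeps the product of small densities from underflowing, which matters once many continuous attributes are used.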

Simulation Experiment
This paper takes Sina microblog users as experimental data and collects basic user information, fan lists, and the tweets posted by each user within a month. The user influence rankings calculated by the model proposed in this paper and by the HRank model are compared with the existing influence ranking of Sina microblog users, and the top 10 users under each model are listed in Table 1. Table 1 shows the top 10 users in the influence ranking of each model, obtained from the newspaper microblog users in the microblog media ranking. It can be seen from Table 1 that users with a large number of fans are not necessarily influential, and that the influence ranking calculated by the model proposed in this paper is basically the same as that of Sina microblog. The proposed model is therefore feasible. Figure 1 compares the coverage rates of influence person-time of the top p% users under different algorithms in order to assess the relative merits of the algorithms. As can be seen from Figure 1, the algorithm proposed in this paper is slightly better than the HRank algorithm; its ranking is slightly lower than the real ranking on Sina microblog but closer to it, which indicates that the proposed algorithm is reasonable in the classification and screening of microblog information. Figure 2 shows the distribution of users' average daily tweets. Counting the average daily tweets makes it easier to remove inactive or zombie users. As can be seen from Figure 2, most users post 2-3 tweets per day, so a user with one tweet per day or fewer can be regarded as an inactive or zombie user and ignored in the evaluation of user influence.
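The screening rule read from Figure 2 can be sketched as a simple filter. The function name `active_users`, the 30-day window, and the strict threshold of more than one tweet per day are assumptions chosen to match the one-month collection period and the rule stated above.

```python
def active_users(monthly_tweets, days=30, min_daily=1.0):
    """Keep users whose average daily tweet count exceeds min_daily.

    monthly_tweets maps a user id to the number of tweets posted in the
    collection window; users at or below the threshold are treated as
    inactive or zombie users and dropped.
    """
    return {user: count for user, count in monthly_tweets.items()
            if count / days > min_daily}


# Hypothetical counts: only "a" posts more than one tweet per day.
print(active_users({"a": 75, "b": 30, "c": 12}))
```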
As mentioned above, the user's activity degree is related to the ranking of the user's influence. The relationship between the two is studied in this paper, and the results are shown in Figure 3. As can be seen from Figure 3, the relationship between the user's activity degree and influence ranking is almost linear within a certain range. Beyond that range, the activity degree has almost no significant impact on the influence ranking, which indicates that the ranking is also affected by other factors.
Five sets of data are classified and tested in this paper, and the errors are compared with those of the NBC algorithm. The results are shown in Figure 4. As can be seen from Figure 4, the error curve of the algorithm proposed in this paper is always above the zero line, which indicates that its classification results are better than those of the NBC algorithm. The variation of its curve is also smaller than that of the NBC algorithm, which indicates that the proposed algorithm tends to be more stable.
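A comparison of this kind needs an error measure for each classifier. The sketch below estimates the two misjudgment probabilities and the average classification error from labelled test data, assuming the usual two-population form f = q1·P(2|1) + (1 − q1)·P(1|2). That form, the function name `average_error`, and the labels "real" and "zombie" are assumptions, since the corresponding equations are not available in this text.

```python
def average_error(true_labels, predicted, real="real", zombie="zombie"):
    """Empirical average classification error for two populations.

    Requires at least one real user and one zombie user in true_labels.
    """
    n_real = sum(1 for y in true_labels if y == real)
    n_zombie = sum(1 for y in true_labels if y == zombie)
    real_as_zombie = sum(1 for y, p in zip(true_labels, predicted)
                         if y == real and p == zombie)
    zombie_as_real = sum(1 for y, p in zip(true_labels, predicted)
                         if y == zombie and p == real)
    q1 = n_real / len(true_labels)      # estimated share of real users
    p21 = real_as_zombie / n_real       # real user misjudged as zombie
    p12 = zombie_as_real / n_zombie     # zombie user misjudged as real
    return q1 * p21 + (1 - q1) * p12
```

When q1 is estimated from the same sample, this value coincides with the plain misclassification rate; the two differ only if q1 is supplied from elsewhere.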
Finally, the factors influencing the user's coverage degree are discussed. According to the calculation formula, the coverage degree may be positively correlated with the number of tweets, and the experimental environment can be changed to compare whether different activity degrees have an impact on the calculation results. As can be seen from Figure 5, the greater the activity degree, the wider the user's coverage degree. Therefore, the user's activity degree has an impact on the user's coverage degree. The number of active microblog fans of a user is affected by whether the user updates the microblog, which directly reflects the user's influence. When there are fewer tweets, the influence of the activity degree is obvious, and when there are more tweets, the activity degree also has a greater impact on the coverage degree.

[Figure axis label: Microblogs (×100)]

Table 1: Top 10 users in the user influence ranking of each model.

Ranking  Sina microblog           HRank  Model in this paper
1        People's daily           2      1
2        Global times             1      2
3        Chutian city newspaper   3      3
4        China youth news         4      4
5        Xinmin evening news      7      6
6        Beijing news             6      5
7        Yangcheng evening news   8      8
8        Morning post             5      7
9        Peninsula city news      9      9
10       Dahe daily               10     10
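The agreement between each model and the Sina microblog ranking in Table 1 can be quantified with Spearman's rank correlation. This measure is not used in the paper and is added here only as an illustration. The lists below assume that the columns of Table 1 are read as Sina rank, user, HRank rank, and rank under the proposed model.

```python
def spearman_rho(rank_a, rank_b):
    """Spearman rank correlation for two tie-free rankings of the same items."""
    n = len(rank_a)
    d2 = sum((a - b) ** 2 for a, b in zip(rank_a, rank_b))
    return 1 - 6 * d2 / (n * (n * n - 1))


sina = list(range(1, 11))
hrank = [2, 1, 3, 4, 7, 6, 8, 5, 9, 10]
proposed = [1, 2, 3, 4, 6, 5, 8, 7, 9, 10]

print(round(spearman_rho(sina, hrank), 3))     # 0.903
print(round(spearman_rho(sina, proposed), 3))  # 0.976
```

Under this reading, the proposed model's ranking is closer to the Sina microblog ranking than the HRank ranking is, which is consistent with the comparison drawn from Table 1.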

Conclusion
To address the problem of evaluating the influence of microblog users, this paper proposes an evaluation model based on the Gaussian Bayesian Derivative Classifier. It first presents a model that depicts the influence of a microblog user by combining the activity, connection, and coverage degrees. It then gives a method for solving the model with the Gaussian Bayesian Derivative Classifier, based on the characteristics of the relationships between microblog users and the behavior characteristics of the users themselves. Finally, the key factors affecting the evaluation model are studied in depth through simulation experiments that take Sina microblog users as experimental data. The results show that this algorithm has better adaptability than the NBC algorithm. In follow-up work, the dissemination characteristics of microblog information can be considered to improve the evaluation model of user influence.
Data Availability
The data supporting the findings of this study are included within the article.

Disclosure
This article was submitted to EasyChair in 2019 at the following link: https://wvvw.easychair.org/publications.

Conflicts of Interest
The authors declare that they have no conflicts of interest.