Multimodal Fake News Detection Incorporating External Knowledge and User Interaction Feature

With the development of online social media, the volume of news has exploded. While social media provides a platform for releasing and disseminating news, it also allows fake news to proliferate, which may cause social risks. Detecting fake news quickly and accurately is a difficult task. Multimodal fusion models are the current research focus and development trend in fake news detection. However, in terms of content, most existing methods do not mine the background knowledge hidden in the news content and ignore the connection between that background knowledge and the existing knowledge system. In terms of the propagation chain, existing research tends to emphasize only the single chain from the previous communication node, ignoring the intricate communication chain and the mutual influence among users. To address these problems, this paper proposes a multimodal fake news detection model, A-KWGCN, based on a knowledge graph and a weighted graph convolutional network (GCN). The model extracts features of both the news content and the interactions between users involved in news dissemination. On the one hand, the model mines relevant knowledge concepts from the news content, links them with knowledge entities in the wiki knowledge graph, and integrates the knowledge entities and entity contexts as auxiliary information. On the other hand, inspired by the “similarity effect” in social psychology, this paper constructs a user interaction network and defines a weighted GCN by calculating the feature similarity among users to analyze their mutual influence. Two public datasets, Twitter15 and Twitter16, are selected to evaluate the model, and the accuracy reaches 0.905 and 0.930, respectively. In the comparison experiments, the A-KWGCN model shows clear advantages over the other six comparison models on four evaluation metrics. In addition, ablation experiments verify that the knowledge module and the weighted GCN module play a significant role in the detection of fake news.


Introduction
News is information reported by mass media and based on objective facts. With the development of the Internet, online social media has gradually become the mainstream platform for publishing and disseminating news. The booming development of social media has led to exponential growth in the volume of news. On the one hand, this rapid growth brings convenience. On the other hand, most news is not vetted for authenticity before publication, and consequently fake news proliferates.
Any inaccurate information that deviates from objective reality can be called fake news [1]. In the Internet era, fake news spreads faster and has a worse impact. If fake news is not curbed, it will not only damage the credibility of the media but also seriously harm society. Therefore, it is important to identify the veracity of news accurately and effectively and to stop the spread of fake news.
However, fake news detection is a challenging task. The difficulties comprise four major components: (1) Fake news is confusing and misleading in expression, so it is difficult to determine the veracity of news merely by analyzing its content. (2) News usually describes events in short texts, which makes it necessary to analyze the content from various perspectives to discover information and features. (3) The dissemination process of fake news is complex, and no rule or formula describes it, which makes the dissemination information of fake news difficult to analyze. (4) In many cases, a user's decision to retweet a news item is influenced not only by the previous user but also by all the users who have already retweeted it.
Facing these challenges, scholars have carried out extensive research. To address the misleading and generalized characteristics of fake news (difficulties (1) and (2)), some scholars extract auxiliary information from the social context of news, such as dissemination paths [2,3] and information about the users who retweeted the news [4,5], to help the model detect news authenticity. Other scholars rely on machine learning to analyze the language style of news texts [6,7] and find commonalities in the expression style of fake news. To address the complexity of news dissemination (difficulty (3)), some scholars represent the dissemination paths as graphs and analyze them with graph neural networks [8]. Others analyze the time series of news dissemination and mine temporal information [9]. However, most methods extract single-modal features from the content or social context of news, or simply combine textual features with social context features to obtain multimodal features. They fail to recognize that a large amount of contextual knowledge is hidden in news and that the false knowledge conveyed by fake news contradicts the existing knowledge system, and they therefore ignore the help that an external knowledge system can provide. In addition, when some models analyze the interactions between users who retweet the same news, they simply assume that each user is connected only to the previous user in the propagation chain. In fact, when reading a news item, users see the comments not only of the previous user but of multiple users, so their retweeting behavior is influenced by multiple users in the communication chain.
To address these problems, this paper combines external knowledge with a GCN for fake news detection. The knowledge concepts in the news are extracted and linked to the wiki knowledge graph, and the corresponding knowledge entities and entity contexts in the knowledge graph are identified and introduced into the model as external knowledge. Meanwhile, inspired by the "similarity effect" in social psychology, the GCN is weighted according to the similarity of user features to simulate more realistic user interaction relationships. Finally, a multimodal fake news detection model, A-KWGCN, based on external knowledge and a weighted GCN, is constructed by fusing multimodal features through coattention.
The contributions of this paper are as follows: (1) While analyzing news texts, this paper introduces a variety of auxiliary information (such as external knowledge, user interactions, and the communication chain), which jointly help the model detect fake news from multiple perspectives. (2) This paper introduces the concept of the "similarity effect" from social psychology to weight the GCN, making it more consistent with realistic situations.
The rest of this paper is arranged as follows. Related research is presented in Section 2. The A-KWGCN model is described in Section 3. The experimental results and analysis are described in Section 4. Finally, Section 5 presents conclusions and future work.

Related Work
Early fake news detection relied mainly on manual checking: websites such as Snopes and PolitiFact [10] employed multiple experts to check the authenticity of political news. Although manual detection can accurately identify fake news, it is labor-intensive and time-consuming, and the number of news items that can be checked is very limited. With the emergence of the "Internet news era," manual detection can no longer keep up with the rapidly expanding volume of news. The advancement of artificial intelligence technology has opened up new possibilities for detecting fake news, since the computational capacity of computers makes it possible to evaluate the authenticity of news swiftly and intelligently. This kind of research has become an active topic in academia. Approaches to fake news detection based on artificial intelligence techniques are generally divided into two categories: text-based detection and social media-based detection [11].

Text-Based Fake News Detection.
Text-based methods for detecting fake news mainly rely on analyzing the content of news. Przybyla [7] created a style-based classifier for topic extraction and style analysis of news texts and compared it with a CNN-based classifier to demonstrate the former's effectiveness in capturing emotional elements in news texts. Based on the experimental results, the author also proposed that fake news tends to use dramatic, uncritical, and lurid words in its reporting. Karimi and Tang [12] proposed a model based on hierarchical discourse-level structure for fake news detection, which learns and constructs the discourse-level structure of real and fake news in an automated, data-driven way. Zhang and Li [13] built a cross-modal semantic deviation detection model based on the differences between the text and image of fake news and used semantic distance to detect fake news. Nasir et al. [14] proposed a hybrid CNN-RNN model that uses a Conv1D CNN layer to extract the local features that reside at the text level and an RNN layer to learn the long-term dependencies of those local features. Kaliyar et al. [15] proposed a BERT-based detection approach (FakeBERT) that combines different parallel blocks of single-layer CNNs with BERT, improving detection performance through BERT's ability to capture semantic and long-distance dependencies in sentences.

Social Context-Based Fake News Detection.
Social media-based fake news detection methods mainly use the social context formed by the spread of fake news on social media. Liang et al. [2] focused on the influence of user features on the fake news detection model. By extracting 5 new user features and combining them with existing user features that had been verified as effective, they constructed a new user feature set and applied it to the authenticity detection of Weibo posts. Jin et al. [3] proposed a fake news detection model integrating multimodal features, using different feature extraction methods and machine learning detection models for different types of features. Bian et al. [8] first applied GCN to the task of fake news detection. To study the propagation structure and dispersion structure of fake news, the authors used top-down and bottom-up GCNs to construct a BiGCN detection model and used DropEdge to mitigate overfitting. Chen and Freire [16] focused on the detection of fake news domains (the original websites that publish fake news), mined the sharing structure of Twitter users to find politically related websites, and constructed domain interaction maps from real-time social networks. A topic-agnostic classifier was used to score, rank, and classify the sites found, and finally an active detection model for fake news domains was constructed. Raza and Ding [17] started from users' behaviors and calculated the aggregate responses of all users to each news article to understand users' views on the article and to detect fake news by analyzing users' interactions with news. Davoudi et al. [18] proposed a new model that determines news veracity by dynamically analyzing the propagation tree and stance network features. Kaliyar et al. [19] combined news, user, and community information to form a tensor representing social context. The news content was fused with the tensor, and coupled matrix-tensor factorization was used to obtain a representation of news content and social context.
However, in terms of news text, most existing models do not sufficiently mine the information in the text. These models usually focus on mining semantic features, ignoring the background knowledge hidden in the news content and its intrinsic relationship with external knowledge. In terms of the communication chain, existing models tend to emphasize the single chain from the previous communication node, ignoring the intricate interaction and mutual influence among users in reality. Therefore, this paper introduces external knowledge features based on the wiki knowledge graph, extracts user interaction features by building a weighted graph convolutional network based on the "similarity effect," and constructs a detection model, A-KWGCN, that incorporates multimodal information to detect fake news.

A-KWGCN Model
In order to solve the above problems, this paper constructs an attention-based knowledge-aware weighted graph convolutional network (A-KWGCN) to detect fake news. The model introduces a knowledge graph as auxiliary information through knowledge mining and entity linking techniques to help analyze the textual content of news. A GCN is used to analyze the relationship network between the users in the propagation chain. This paper analyzes the mutual influence between users in the process of information dissemination on social networks. Based on the theory of the "similarity effect" in psychology, the interaction between different nodes in the communication chain is reflected by a weighting algorithm, and the Laplacian matrix in the GCN is modified accordingly, so that the model more realistically reflects the relationships and influence between users in real information dissemination. Meanwhile, BERT is used to extract text features, and a GRU is used to extract propagation features. For these multiple kinds of information, this paper uses the coattention mechanism for relevance analysis and finally concatenates the features and inputs them into the classifier for probability prediction. The framework of the model is shown in Figure 1.
The implementation of this framework mainly includes the following steps: ① data collection and preprocessing, ② knowledge mining and feature extraction of source information text, ③ text feature and propagation feature extraction based on BERT and GRU, ④ user interaction feature extraction based on weighted GCN, ⑤ correlation analysis of extracted features based on coattention, ⑥ fake news detection and classification.

Text Preprocessing and User Data Collection.
News source text often contains a lot of redundant information, such as HTML code, URLs, special symbols, and stop words. In the experiment, this redundant information and noise must be removed so that only important information is retained and useless information does not affect the classification accuracy of the model. This process is called text cleansing. For the cleaned text, the third-party library NLTK is used for word segmentation. After the above processing, the news text data are represented as $S = \{s_1, s_2, s_3, \ldots, s_n\}$, where $s_i$ ($i = 1, 2, 3, \ldots, n$) represents the i-th preprocessed news text.
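The cleansing step described above can be sketched as follows. This is a minimal illustration only: the regular expressions, the tiny stop-word set, and the whitespace tokenizer (standing in for the NLTK tokenizer so the sketch has no external dependencies) are assumptions, not the authors' actual preprocessing code.

```python
import re

# Hypothetical, deliberately tiny stop-word list; the paper does not specify one.
STOP_WORDS = {"a", "an", "the", "is", "are", "of", "to", "and", "in", "on"}

def clean_text(raw):
    """Strip HTML tags, URLs and special symbols, then drop stop words."""
    text = re.sub(r"<[^>]+>", " ", raw)          # HTML tags
    text = re.sub(r"https?://\S+", " ", text)    # URLs
    text = re.sub(r"[^A-Za-z0-9\s]", " ", text)  # special symbols
    tokens = text.lower().split()                # whitespace split (stand-in for NLTK)
    return [t for t in tokens if t not in STOP_WORDS]

tokens = clean_text("<b>BREAKING</b>: The mayor is missing! http://t.co/abc #news")
print(tokens)
```

URLs are removed before the special-symbol pass so that fragments of a link are not left behind as tokens.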
User information is an important input for the intelligent detection of fake news. For the news text collection S, the users who have forwarded these news items form the user collection $U = \{u_1, u_2, u_3, \ldots, u_m\}$, and the personal information of each user is collected to form the user feature matrix $X_u = \{X_{u_1}, X_{u_2}, X_{u_3}, \ldots, X_{u_m}\}$, where $u_j$ denotes the j-th user and $X_{u_j}$ denotes the feature vector of user $u_j$. The feature vector $X_{u_j}$ contains the following information: ① the number of words in the user's nickname, ② the number of words in the user's profile, ③ the number of the user's followers, ④ the number of accounts the user follows, ⑤ the number of tweets posted by the user, ⑥ whether the user allows geolocation sharing, ⑦ whether the user is verified by real name, ⑧ the number of years the user's account has been in use, ⑨ the time elapsed from when the news was published to when the user retweeted it, ⑩ whether the user account is protected. The final result is $X_{u_j} \in \mathbb{R}^{v}$, where v represents the number of features, and $v = 10$ in this paper.
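As an illustration of how one user's profile could be turned into a feature vector of length v = 10, consider the sketch below. The field names, the time units, and the reading of the fourth feature as the number of accounts the user follows are assumptions made for this example; the paper does not give an implementation.

```python
from dataclasses import dataclass

@dataclass
class UserProfile:
    # Field names are hypothetical; they mirror the ten listed features.
    nickname: str
    description: str
    followers: int
    following: int              # assumed meaning of the fourth feature
    tweets: int
    geo_enabled: bool
    verified: bool
    account_age_years: float
    retweet_delay_hours: float  # unit is an assumption
    protected: bool

def to_feature_vector(u):
    """Map one user profile to a feature vector of length v = 10."""
    return [
        float(len(u.nickname.split())),
        float(len(u.description.split())),
        float(u.followers),
        float(u.following),
        float(u.tweets),
        float(u.geo_enabled),
        float(u.verified),
        float(u.account_age_years),
        float(u.retweet_delay_hours),
        float(u.protected),
    ]

user = UserProfile("news fan", "I retweet local news", 120, 80, 950,
                   True, False, 4.0, 1.5, False)
x_u = to_feature_vector(user)
print(x_u)
```

Stacking one such vector per retweeting user, in retweet order, would give the user feature matrix used in the later steps.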

Knowledge Mining and Feature Extraction of Source Information Text.
Based on the preprocessed news text data obtained in Section 3.1, external knowledge is introduced as auxiliary information for fake news detection. Since fake news often spreads misleading and erroneous information, which is likely to contradict the existing human knowledge system, introducing an external knowledge system is of positive significance for detecting the truthfulness of news. Introducing external knowledge requires the following operations: textual knowledge extraction, knowledge entity linking, and knowledge feature extraction.

Textual Knowledge Extraction.
Knowledge extraction is the process of extracting the knowledge embedded in the source information through identification, analysis, and induction. The purpose of this step is to mine words from news texts that may become knowledge mentions for subsequent linkage with entities in the knowledge graph. For the preprocessed news text data in Section 3.1, the third-party library tagme is used to extract the knowledge mentions from the text. Taking a specific tweet as an example, its final extracted knowledge mentions are shown in Figure 2.

Knowledge Entity Linking.
Knowledge entity linking is the process of linking the knowledge mentions extracted from the text with the knowledge entities in the knowledge graph. Entity linking needs to overcome two difficulties: (i) One mention can correspond to different knowledge entities; for example, "apple" can link to the fruit or to the mobile phone brand. (ii) One knowledge entity can correspond to different mentions; for example, "Beijing" and "Peking" can both be linked with the knowledge entity Beijing. To address these difficulties, this paper adopts the candidate entity generation (CEG) and entity disambiguation (ED) methods described by Shen et al. [20]. The sequence of knowledge entities $E = \{e_1, e_2, e_3, \ldots, e_n\}$ for each news article is obtained, where n denotes the number of entities contained in the news item. Meanwhile, in order to obtain richer external knowledge, for each knowledge entity $e_i$, this paper selects 10 neighboring entities that are closely connected to it to form the entity context:

$$en(e_i) = \{en \mid en \sim e_i,\ en \in KG\}, \tag{1}$$

where $en \sim e_i$ denotes that there exists a mutual relationship between entity $en$ and entity $e_i$, KG denotes the knowledge graph, and $en(e_i)$ denotes the entity context of knowledge entity $e_i$. For the sequence of knowledge entities E, after obtaining the entity context of each entity, the entity context sequence $EN = \{en(e_1), en(e_2), en(e_3), \ldots, en(e_n)\}$ can be formed. Taking the news shown in Figure 2 as an example, its entity linking results are shown in Figure 3.
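The construction of an entity context can be illustrated on a toy knowledge graph. In the sketch below, the triples are invented for illustration, and taking the first k related entities is an assumption: the paper selects 10 closely connected neighbors but does not state the selection criterion.

```python
# Toy knowledge graph as (head, relation, tail) triples; all entries are illustrative.
TRIPLES = [
    ("Beijing", "capital_of", "China"),
    ("Beijing", "located_in", "Asia"),
    ("Forbidden City", "located_in", "Beijing"),
    ("Apple Inc.", "founded_by", "Steve Jobs"),
]

def entity_context(entity, triples, k=10):
    """Return up to k entities that share a relation with `entity` in either direction."""
    neighbors = []
    for head, _, tail in triples:
        if head == entity and tail not in neighbors:
            neighbors.append(tail)
        elif tail == entity and head not in neighbors:
            neighbors.append(head)
    return neighbors[:k]

context = entity_context("Beijing", TRIPLES)
print(context)
```

A relation is followed in both directions here, which matches the "mutual relationship" wording of the definition.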

Knowledge Feature Extraction.
For each knowledge entity $e_i$, the embedding of its entity context is computed as the average of the embeddings of its neighboring entities:

$$en'(e_i) = \frac{1}{num(en(e_i))} \sum_{en_j \in en(e_i)} en_j', \tag{2}$$

where $en'(e_i)$ denotes the entity context embedding vector of entity $e_i$, $num(en(e_i))$ denotes the number of neighboring entities contained in $en(e_i)$ (in this paper, $num(en(e_i)) = 10$), and $en_j'$ denotes the embedding vector corresponding to the j-th neighboring entity in $en(e_i)$. Subsequently, in order to obtain the knowledge embedding $EE'$ that contains both knowledge entities and entity contexts, it is necessary to fuse the entity embedding sequence $E'$ and the entity context embedding sequence $EN'$. For computational simplicity, this paper fuses $E'$ and $EN'$ by averaging them to obtain the knowledge embedding sequence $EE' = \{ee_1', ee_2', ee_3', \ldots, ee_n'\}$, which is calculated as follows:

$$ee_i' = \frac{e_i' + en'(e_i)}{2}, \tag{3}$$

where $e_i'$ denotes the embedding vector of entity $e_i$ in $E'$. Finally, the obtained knowledge embedding sequence $EE'$ is input into the CNN, and the knowledge feature matrix $Q'$, which contains both knowledge entity and entity context information, is output and used in the subsequent classification calculation.
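The two averaging steps described above (neighbor embeddings into a context embedding, then context embedding with the entity embedding) can be sketched in plain Python. The two-dimensional vectors are toy values chosen for readability; real entity embeddings would be much higher dimensional.

```python
def mean_vector(vectors):
    """Element-wise mean of equally sized vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def knowledge_embedding(entity_vec, neighbor_vecs):
    """Average the neighbor embeddings, then average the result with the entity embedding."""
    context_vec = mean_vector(neighbor_vecs)
    return mean_vector([entity_vec, context_vec])

e_i = [1.0, 0.0]                        # toy entity embedding
neighbors = [[0.0, 2.0], [2.0, 4.0]]    # toy neighbor embeddings
ee_i = knowledge_embedding(e_i, neighbors)
print(ee_i)
```

Because the fusion is a plain average, the entity itself and its whole context contribute equally, regardless of how many neighbors the context contains.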

Text Feature and Propagation Feature Extraction Based on BERT and GRU.
In order to obtain information about the news content for fake news detection, features of the news text need to be extracted. The preprocessed news text $s_i$ obtained in Section 3.1 is one-hot encoded to obtain a high-dimensional 0-1 matrix. Since text lengths differ, a maximum text length L is set to make the 0-1 matrix dimension uniform across news items, and texts shorter than L are padded with 0 to get $OH = \{oh_1, oh_2, \ldots, oh_L\}$, where $oh_i$ denotes the one-hot encoding of the i-th word. OH is then encoded with word2vec to obtain the word embedding $V = \{v_1, v_2, \ldots, v_L\}$:

$$V = \mathrm{word2vec}(OH). \tag{4}$$

BERT is a deep bidirectional language representation model, which has great advantages in obtaining the deep semantic information of text [15, 21-23]. As shown in equation (5), V is input into BERT to obtain the textual features of the news, $R = \{r_1, r_2, \ldots, r_L\}$, $R \in \mathbb{R}^{d \times L}$, where d denotes the dimensionality:

$$R = \mathrm{BERT}(V). \tag{5}$$
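The one-hot encoding with padding to a fixed length L can be sketched as follows. The vocabulary is hypothetical, and mapping out-of-vocabulary tokens to all-zero rows is an assumption of this sketch rather than something the paper specifies.

```python
def one_hot_matrix(tokens, vocab, max_len):
    """Return a max_len x len(vocab) 0-1 matrix; rows past the end of the text stay all-zero."""
    matrix = [[0] * len(vocab) for _ in range(max_len)]
    for row, token in zip(matrix, tokens[:max_len]):
        if token in vocab:               # unknown tokens are left as zero rows (assumption)
            row[vocab[token]] = 1
    return matrix

vocab = {"mayor": 0, "missing": 1, "news": 2}   # hypothetical vocabulary
oh = one_hot_matrix(["mayor", "missing"], vocab, max_len=4)
for row in oh:
    print(row)
```

Texts longer than `max_len` are truncated by the slice, and shorter texts keep trailing zero rows, so every news item yields a matrix of the same shape.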
For the user information $X_u$ obtained in Section 3.1, the propagation features $P = \{p_1, p_2, \ldots, p_\eta\}$ of the news are obtained with the GRU model, with $p_i = \mathrm{GRU}(X_{u_i})$. Note that the number of retweeting users must be the same for every news item; this number is set to η. The order of the $X_{u_i}$ must follow the order of the retweets.

User Interaction Feature Extraction Based on Weighted GCN.
The purpose of crawling user information in Section 3.1 is to obtain information about user interactions. In this paper, we hypothesize that the interaction information between users helps in fake news detection. This hypothesis is validated in Section 4.4. To obtain potential interaction information among all users who retweeted news $s_i$, this paper uses a GCN to extract user interaction features. Previous literature studies only the interaction between a user and the previous user in the propagation chain, but in practice a user's decision to retweet a news item may be influenced not only by the previous user but by all users who retweeted the item before. Taking the social networking platform Twitter as an example, users can view the tweets of any other users and retweet the tweets they agree with. Therefore, in the real news dissemination process, it is very common for the target user to interact with multiple users, and the positive impact of this phenomenon on fake news detection cannot be ignored. Based on the above, a weighted GCN is created in this paper. First, a user social network graph $G = (V, \mathcal{E})$ is created, where V denotes the nodes and $\mathcal{E}$ denotes the edges. In this paper, V is the set of users who have retweeted news $s_i$, and $\mathcal{E}$ is the set of interaction relationships that exist between users. Since the actual interaction between users during communication is difficult to obtain, and we still want to simulate a more realistic user interaction situation, the "similarity effect" from social psychology is introduced. The concept suggests that humans are more inclined to accept the views of those who are similar to them [24]. Therefore, the user social network graph G created in this paper is a fully connected graph, where each edge $\mathcal{E}_{ij}$ is associated with a weight $w_{ij}$. The value of the weight $w_{ij}$ is the cosine similarity between the feature vectors $X_{u_i}$ and $X_{u_j}$ (obtained in Section 3.1) of user i and user j, as shown in equation (6):

$$w_{ij} = \cos(X_{u_i}, X_{u_j}) = \frac{X_{u_i} \cdot X_{u_j}}{\|X_{u_i}\| \, \|X_{u_j}\|}. \tag{6}$$

The final weight matrix $W_{\mathcal{E}} = [w_{ij}] \in \mathbb{R}^{\eta \times \eta}$ is obtained, where η denotes the number of users.
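The edge weights of the fully connected user graph can be computed as pairwise cosine similarities. The sketch below uses toy two-dimensional feature vectors; returning 0.0 for a zero-norm vector is a convention chosen here, not something stated in the paper.

```python
import math

def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def weight_matrix(features):
    """Pairwise cosine similarities w_ij for a fully connected user graph."""
    return [[cosine(xi, xj) for xj in features] for xi in features]

X = [[1.0, 0.0], [0.0, 2.0], [3.0, 0.0]]   # toy user feature vectors
W = weight_matrix(X)
for row in W:
    print(row)
```

One practical caveat: with raw profile features, large-valued entries such as follower counts dominate the cosine, so rescaling the features beforehand may be worth considering. Whether the authors rescale is not stated.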

Advances in Multimedia
GCN is a multilayer neural network that processes non-Euclidean data and can generate embedding vectors of nodes by gathering information from their direct and indirect neighbors through stacked layers of convolution. In this paper, we use a GCN to analyze the user social network graph G. In the GCN, the propagation formula between layers is shown in the following equation:

$$H^{(l+1)} = \sigma\left(D^{-1/2} A D^{-1/2} H^{(l)} W^{(l)}\right), \tag{7}$$

where $D^{-1/2} A D^{-1/2}$ is the Laplacian matrix, A is the dependency (adjacency) matrix, D is the degree matrix of A, $H^{(l)}$ is the feature matrix of the l-th layer, $H^{(0)}$ is set to the user feature matrix $X_u$ obtained in Section 3.1, and $W^{(l)}$ is the weight matrix of each layer. σ is a nonlinear activation function, such as the ReLU function. Through the above operation, the user interaction features $G' = \{g_1, g_2, \ldots, g_\eta\}$ are obtained.
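A single propagation step of the form ReLU(D^{-1/2} A D^{-1/2} H W) can be sketched in plain Python as below. Using the similarity weight matrix directly as A is an assumption about how the weighting enters the GCN; the matrices are toy values.

```python
import math

def matmul(a, b):
    """Plain-Python matrix product of two nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def normalize(adj):
    """Compute D^{-1/2} A D^{-1/2}, where D is the diagonal degree matrix of A."""
    deg = [sum(row) for row in adj]
    inv_sqrt = [1.0 / math.sqrt(d) if d > 0 else 0.0 for d in deg]
    n = len(adj)
    return [[inv_sqrt[i] * adj[i][j] * inv_sqrt[j] for j in range(n)] for i in range(n)]

def gcn_layer(adj, h, w):
    """One propagation step: ReLU(D^{-1/2} A D^{-1/2} H W)."""
    z = matmul(matmul(normalize(adj), h), w)
    return [[max(0.0, v) for v in row] for row in z]

A = [[1.0, 1.0], [1.0, 1.0]]     # toy weighted adjacency (assumed to be the similarity weights)
H0 = [[1.0, 0.0], [0.0, 1.0]]    # toy node features
W0 = [[1.0, -1.0], [0.0, 1.0]]   # toy layer weights
H1 = gcn_layer(A, H0, W0)
print(H1)
```

Stacking two such calls, with the output of the first as the input features of the second, corresponds to a two-layer GCN. Note that cosine weights can be negative for general feature vectors, in which case the degree normalization above would need extra care.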

Correlation Analysis of Extracted Features Based on Coattention.
For the knowledge features (Section 3.2), text features (Section 3.3), user interaction features (Section 3.4), and propagation features (Section 3.3), this paper uses the method of Hu et al. [10] to construct a coattention mechanism that learns the correlation between the source tweet text information and the knowledge information, and the joint influence between the propagation features and the user interaction information, respectively. Take the analysis of source tweet text information and external knowledge information as an example. First, the proximity matrix F is calculated according to equation (8), where R is the text feature obtained in Section 3.3, $Q'$ is the external knowledge feature obtained in Section 3.2, and $W_f$ is the weight matrix. Then, the proximity matrix F is substituted into equation (9) to obtain the attention maps $H^{R}$ and $H^{Q'}$ for the source tweet text information and the external knowledge information, respectively.
In equation (9), $W_R$ and $W_{Q'}$ are the weight matrices. The obtained $H^{R}$ and $H^{Q'}$ are substituted into equation (10) to obtain the attention probabilities $a^{R}$ and $a^{Q'}$ of each word and each knowledge entity. A weighted summation then yields the attention vectors $\hat{R}$ and $\hat{Q}'$ for the text information and the external knowledge information, as shown in equation (11). In the same way, the attention vectors $\hat{P}$ and $\hat{G}'$ of the propagation information and the user interaction information can be obtained.
In equations (10) and (11), $w_{hr}$ and $w_{hQ'}$ are weight matrices, and $r_i$ and $q_j$ are the learned coattention feature vectors.
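For reference, one common form of the coattention computation that is consistent with the symbols defined above is given below. This is an assumed reconstruction following the usual parallel coattention formulation, not a verbatim copy of the paper's equations (8) to (11); in particular, the placement of the transposes, the use of tanh and softmax, and the hat notation for the attention vectors are assumptions.

```latex
\begin{aligned}
F &= \tanh\left(R^{\top} W_f Q'\right) \\
H^{R} &= \tanh\left(W_R R + (W_{Q'} Q') F^{\top}\right), \qquad
H^{Q'} = \tanh\left(W_{Q'} Q' + (W_R R) F\right) \\
a^{R} &= \operatorname{softmax}\left(w_{hr}^{\top} H^{R}\right), \qquad
a^{Q'} = \operatorname{softmax}\left(w_{hQ'}^{\top} H^{Q'}\right) \\
\hat{R} &= \sum_{i} a^{R}_{i}\, r_i, \qquad
\hat{Q}' = \sum_{j} a^{Q'}_{j}\, q_j
\end{aligned}
```

Under these assumptions the shapes are consistent: with $R \in \mathbb{R}^{d \times L}$ and $Q'$ having d rows and one column per knowledge item, F has one row per word and one column per knowledge item, so $H^{R}$ has L columns and $a^{R}$ assigns one probability to each word.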

Fake News Detection and Classification.
The attention vectors $\hat{P}$, $\hat{G}'$, $\hat{R}$, and $\hat{Q}'$ obtained in Section 3.5 are concatenated and input to a classifier consisting of a fully connected neural network to obtain the final classification probability distribution $y = [y_0, y_1]$, where $y_0$ denotes the probability that the target news is classified as true news and $y_1$ denotes the probability that it is classified as fake news. The loss function is given by the following equation:
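The concatenation and classification step can be sketched as follows. The single dense layer, its toy parameters, and in particular the cross-entropy loss are assumptions of this sketch: the loss equation is not reproduced in the text, and cross-entropy is simply the usual choice for a softmax classifier.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def classify(p, g, r, q, weights, bias):
    """Concatenate the four attention vectors and apply one dense layer plus softmax."""
    features = p + g + r + q
    logits = [sum(w * x for w, x in zip(row, features)) + b
              for row, b in zip(weights, bias)]
    return softmax(logits)

def cross_entropy(probs, label):
    """Negative log-likelihood of the true class (0 = true news, 1 = fake news)."""
    return -math.log(probs[label])

# Toy one-dimensional attention vectors and hypothetical layer parameters.
y = classify([1.0], [0.0], [1.0], [0.0],
             weights=[[1.0, 0.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0]],
             bias=[0.0, 0.0])
loss = cross_entropy(y, label=1)
print(y, loss)
```

In this toy case both logits are equal, so the classifier is maximally uncertain and the loss equals ln 2.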

Experimental Results and Analysis
The code for this experiment was written in Python 3.7 and run on the Google artificial intelligence platform.

Dataset.
In this paper, two datasets, Twitter15 and Twitter16 [25], are selected as the original datasets. The datasets contain fact-based source tweet text, labels, and user retweet sequences. Personal information is not included in the datasets because Twitter protects users' personal information, so it has to be crawled through the Twitter API according to the user IDs in the dataset. The user information crawled in this paper is described in Section 3.1. The data with "true" and "fake" labels are selected for analysis, and the details are shown in Table 1. The dataset is divided into a training set and a test set in the ratio of 7 : 3.
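A 7 : 3 split can be produced with a seeded shuffle, as in the sketch below. Whether the authors shuffle, stratify by label, or fix a seed is not stated, so the function name, the seed, and the plain random shuffle are all assumptions.

```python
import random

def train_test_split(items, train_ratio=0.7, seed=42):
    """Shuffle a copy of items with a fixed seed and cut it at train_ratio."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(len(shuffled) * train_ratio))
    return shuffled[:cut], shuffled[cut:]

train, test = train_test_split(range(10))
print(len(train), len(test))
```

Using a dedicated `random.Random` instance keeps the split reproducible without touching the global random state.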

Parameter Setting.
In this paper, we detect the authenticity of news and classify each item into one of two categories: "true" and "fake." The parameter settings of the model in this experiment are shown in Table 2. Since tweet texts differ in length and in the number of retweeting users, the maximum text length is set to 30, and the number of users per text is set to 40. For texts with more than 40 retweeting users, only the first 40 users are retained; for texts with fewer than 40 retweeting users, existing users are randomly sampled to fill the gap until the number of users reaches 40. Kipf and Welling [26] pointed out that the number of GCN layers should be kept between 1 and 3. In this paper, the number of GCN layers is set to 2. The softmax function is chosen as the activation function of the fully connected neural network classifier, and the ReLU function is used as the activation function of the GCN, which is σ in equation (7). Meanwhile, this paper introduces external knowledge based on the wiki knowledge graph. The wiki knowledge graph is free and open, and its contents are all structured information that can support the A-KWGCN model.
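The rule for fixing the number of retweeting users at 40 can be sketched as below. Sampling with replacement and appending the sampled users at the end of the sequence are assumptions; the paper only says that existing users are randomly sampled to fill the gap.

```python
import random

def fix_user_count(users, target=40, seed=0):
    """Keep the first `target` users, or pad by sampling existing users with replacement."""
    if not users:
        raise ValueError("at least one retweeting user is required")
    if len(users) >= target:
        return users[:target]
    rng = random.Random(seed)
    return users + [rng.choice(users) for _ in range(target - len(users))]

padded = fix_user_count(["u1", "u2", "u3"], target=5)
clipped = fix_user_count(list(range(50)), target=40)
print(padded, len(clipped))
```

Keeping the original users first preserves the retweet order that the GRU-based propagation features depend on.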

Comparison of Different Models.
In this paper, the experimental results of A-KWGCN are compared with some current state-of-the-art models and some representative baseline methods to demonstrate the performance of the proposed model. The comparison models are as follows:
DTC [27]: A decision tree-based detection model, which achieves automatic news trustworthiness assessment by tagging the topics of news content and extracting relevant topic-word features.
SVM-TS [28]: A detection model built on a support vector machine, which captures time series features and combines them with source tweets to jointly analyze the truthfulness of news.
mGRU [29]: A detection model built on an improved recurrent neural network, which captures the contextual information of tweets over time to learn temporal and textual features.
CSI [30]: A multimodal fake news detection model that combines and analyzes three types of information: source text information, user information, and information about the group of users involved in retweeting news.
dEFEND [31]: An advanced fake news detection model that analyzes source news and user profiles. This model is dedicated to learning the relationship between news data and user data through a coattention mechanism.
GCAN [9]: A graph-aware network-based model. It predicts the authenticity of the target news by analyzing the source text and the corresponding sequence of retweeting users. It is a relatively advanced model for fake news detection.
Rumor2vec [32]: A model that detects fake news based on the text content and the propagation structure. It introduces the concept of the union graph to incorporate the propagation structures of all tweets and leverages network embedding to learn representations of nodes in the union graph.
GAN [33]: A model that uses a graph attention neural network to learn text features and syntactic relations to detect fake news.
In this paper, F1-score, Recall, Precision, and Accuracy are chosen as evaluation metrics to compare the performance of the different models. Training and testing are repeated 10 times, and the average values are reported. The results of the comparison experiments are shown in Table 3. A-KWGCN outperforms the other models: its performance is improved by 5.3% and 9.4% on average on the Twitter15 and Twitter16 datasets, respectively. This demonstrates the effectiveness of the A-KWGCN model in detecting fake news.
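The four evaluation metrics can be computed from predicted and true labels as in the following sketch. Treating the "fake" class (label 1) as the positive class is an assumption, since the paper does not say which class the precision, recall, and F1 values refer to.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F1 for one class treated as positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": correct / len(pairs), "precision": precision,
            "recall": recall, "f1": f1}

scores = binary_metrics([1, 1, 0, 0, 1], [1, 0, 0, 1, 1])
print(scores)
```

Averaging such dictionaries over the 10 repeated runs would give figures comparable in form to those reported in the comparison table.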
The following conclusions can also be drawn from the table: (1) The models using deep learning methods generally outperform those using hand-crafted features (e.g., DTC, SVM-TS). This is because deep learning-based models are able to extract higher-level news text features and user features, which also shows that introducing deep learning methods for detecting fake news is an effective research direction. (2) CSI, dEFEND, Rumor2vec, and GAN combine information from multiple modalities (e.g., text features, user features, and retweet sequence features), and their performance is significantly better than that of models that analyze single-modal features (e.g., DTC, SVM-TS, mGRU), which shows that the selection and analysis of multimodal features is central to fake news detection. Building richer multimodal features is an effective way to improve detection performance. (3) GCAN analyzes text, user, and forwarding sequence features just like CSI. However, whereas CSI simplifies user relationships to Euclidean data, GCAN constructs a user social network and introduces graph neural networks to create a graph-aware fake news detection model, and its performance is greatly improved compared to CSI. This shows that simplifying complex data affects the detection performance of a model to some extent and that better feature extraction helps detect fake news. Among all the models, A-KWGCN has the highest classification accuracy, which shows that mining the background knowledge of news, introducing external knowledge features as auxiliary information, and improving the GCN to mine user interaction features that are closer to reality all have positive effects on the detection of fake news. This also provides new research ideas and models for the detection and classification of fake news.
With the deletion of user interaction information, the user-propagation coattention mechanism also needs to be deleted. A-KWGCN/K: The knowledge-aware module of the original model is deleted. With the deletion of external knowledge information, the text-knowledge coattention mechanism also needs to be deleted. A-KWGCN/ALL: By removing all auxiliary information including external knowledge, user interactions, and propagation information, the model analyzes only the textual content of the news.
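
The dependency described above, where removing an input branch also removes the coattention mechanism that consumes it, can be expressed as a small configuration sketch. The class `AblationConfig`, its field names, and the component labels are hypothetical and are not taken from the paper's code. Mapping the variant without user interaction information to the name A-KWGCN/UI is an assumption of this sketch, and A-KWGCN/G is omitted because it replaces a module rather than removing one.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AblationConfig:
    """Hypothetical switches for the auxiliary inputs of the model."""
    use_knowledge: bool = True
    use_user_interaction: bool = True
    use_propagation: bool = True

    @property
    def text_knowledge_coattention(self) -> bool:
        # Text-knowledge coattention needs the knowledge branch.
        return self.use_knowledge

    @property
    def user_propagation_coattention(self) -> bool:
        # User-propagation coattention needs both of its input branches.
        return self.use_user_interaction and self.use_propagation

    def active_components(self) -> list:
        """List the components that remain active; text is always kept."""
        parts = ["text"]
        if self.use_knowledge:
            parts.append("knowledge")
        if self.text_knowledge_coattention:
            parts.append("text_knowledge_coattention")
        if self.use_user_interaction:
            parts.append("user_interaction")
        if self.use_propagation:
            parts.append("propagation")
        if self.user_propagation_coattention:
            parts.append("user_propagation_coattention")
        return parts


# Variant names follow the paper; the flag assignments are this sketch's reading.
VARIANTS = {
    "A-KWGCN": AblationConfig(),
    "A-KWGCN/UI": AblationConfig(use_user_interaction=False),
    "A-KWGCN/K": AblationConfig(use_knowledge=False),
    "A-KWGCN/ALL": AblationConfig(False, False, False),
}
```

Deriving the coattention switches from the input switches, instead of setting them independently, rules out inconsistent variants such as a coattention layer with no knowledge input.
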

Ablation
In the "A-KWGCN/G" model, the filter size of the CNN is set to 3, and the output dimension is set to 32. "A-KWGCN/G," "A-KWGCN/K," "A-KWGCN/UI," and "A-KWGCN/ALL" are used to verify the role of the weighted GCN in extracting user interaction information, the role of introducing an external knowledge system, the role of analyzing user interactions, and the role of news content, respectively. The results of the ablation experiments are displayed in Figure 4. The model accuracy follows the order "A-KWGCN" > "A-KWGCN/K" > "A-KWGCN/G" > "A-KWGCN/UI" > "A-KWGCN/ALL." This illustrates that the graph-aware module, the knowledge-aware module, the user-interaction module, and the textual content of the news all contribute to the detection of fake news. Introducing external knowledge systems and extracting user interaction information that is closer to the real propagation process as auxiliary information are feasible methods. The decrease in the accuracy of the A-KWGCN/UI model also supports the hypothesis that "user interaction information is helpful for false news detection." At the same time, the detection accuracy of "A-KWGCN/ALL" is the lowest, which indicates that the classification accuracy of the model decreases significantly after all the auxiliary information is removed. The positive impact of the auxiliary information on fake news detection therefore cannot be ignored, which also provides new ideas for the construction of future fake news detection models.
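
To make concrete what the weighted GCN contributes, the following sketch shows one common way to build a similarity-weighted graph convolution in plain Python. It is an illustration under stated assumptions, not the paper's exact formulation: edge weights are taken as the cosine similarity of the two users' feature vectors, negative similarities are clipped to zero, self-loops have weight 1, and the adjacency matrix is symmetrically normalized. All function names and the toy data are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two feature vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def weighted_adjacency(features, edges):
    """Adjacency matrix whose edge weights are user feature similarities."""
    n = len(features)
    adj = [[0.0] * n for _ in range(n)]
    for i in range(n):
        adj[i][i] = 1.0  # self-loop so each user keeps its own features
    for i, j in edges:
        # Clipping negative similarity to zero is an assumption of this sketch.
        w = max(cosine(features[i], features[j]), 0.0)
        adj[i][j] = adj[j][i] = w
    return adj


def normalize(adj):
    """Symmetric normalization D^{-1/2} A D^{-1/2}."""
    inv_sqrt = [1.0 / math.sqrt(sum(row)) if sum(row) > 0 else 0.0
                for row in adj]
    return [[inv_sqrt[i] * a * inv_sqrt[j] for j, a in enumerate(row)]
            for i, row in enumerate(adj)]


def matmul(a, b):
    """Dense matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def gcn_layer(adj_norm, h, weight):
    """One propagation step: ReLU(A_norm @ H @ W)."""
    z = matmul(matmul(adj_norm, h), weight)
    return [[max(v, 0.0) for v in row] for row in z]


# Toy graph: users 0 and 1 have identical features, user 2 is orthogonal to both.
features = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
edges = [(0, 1), (1, 2)]
adj = weighted_adjacency(features, edges)
identity = [[1.0, 0.0], [0.0, 1.0]]
out = gcn_layer(normalize(adj), features, identity)
```

In the toy graph, the edge between the two similar users receives full weight, while the edge between dissimilar users receives weight zero, so user 2 exchanges no information with user 1 despite being connected. This mirrors the idea that similar users influence each other more strongly.
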

Conclusion and Future Work
While social media makes news dissemination convenient, it also leads to the rapid spread of fake news. The intelligent detection of fake news helps develop a healthy network environment and keep society stable. At present, fake news detection models have the following problems: (1) The mining and refining of fake news content are not sufficient, and the background knowledge hidden in the news content and its internal relationship with the existing knowledge system are often ignored. (2) In the analysis of user propagation relationships, existing models simply assume that each user interacts only with the preceding one, ignoring the intricate interaction relationships among users. To address these problems and achieve intelligent detection of fake news, this paper proposes a multimodal fake news detection model that mines the hidden knowledge concepts in the news content and links them to the wiki knowledge graph, introducing knowledge entities and entity contexts as auxiliary information to help the model detect fake news. Meanwhile, this paper improves the GCN based on the "similarity effect" in social psychology: a weighted GCN is constructed by calculating user feature similarity to simulate the real interaction among users. Two public datasets, Twitter15 and Twitter16, are selected for comparison experiments and discussions, and the model is compared with several models in terms of classification accuracy and efficiency. According to the experimental results, the classification accuracy of this model reaches 0.905 and 0.930 on the two datasets, respectively, and the model performs better than the other models on four evaluation metrics and obtains the highest classification accuracy. Ablation experiments and analytical discussions are also conducted, and they demonstrate the significant role of the knowledge graph module and the weighted GCN module in classification efficiency.
The research in this paper provides new research ideas and models for the detection and classification of fake information.
The ability of the model to achieve better classification accuracy is important, but we argue that it is equally important to explain the classification results, i.e., why the target news is judged as true news or fake news. However, such an explanation is missing from this work. In future work, we will work on achieving interpretable detection of fake news by analyzing the reasons why the target news is judged as fake (or true) in terms of news content, user characteristics, and external knowledge, by transforming complex calculations into information that people can understand, and thereby improving people's trust in the classification results.

Data Availability
The Twitter15 and Twitter16 data used to support the findings of this study have been deposited at https://www.dropbox.com/s/7ewzdrbelpmrnxu/rumdetect2017.zip?dl=0.

Conflicts of Interest
The authors declare that there are no conflicts of interest.