KC-GCN: A Semi-Supervised Detection Model against Various Group Shilling Attacks in Recommender Systems



Introduction
The amount of Internet data is exploding with the rapid development of information technology, making the problem of information overload increasingly prominent. By analyzing a user's historical behavior, recommender systems can extract user preferences and automatically recommend favorite items or services to users [1][2][3]. They have become an essential component of many online information services, including e-commerce [4,5], live broadcast platforms [6], personalized travel recommendation systems [7], and Internet of Vehicles wireless systems [8], among many others. However, due to their openness, fraudulent users can create and inject a large number of fake user profiles into recommender systems, which can change recommendation results and degrade the user experience. For example, The New York Times and Buzzfeed News have reported that many sellers turned to black hat tactics to drive Amazon sales of their products (https://pattern.com/news/pattern-analysis-on-amazon-star-ratingfeatured-in-new-york-times-buzzfeed/). In recent years, various types of shilling attack models have been presented, including random attacks [9], average attacks [10], and the latest adversarial attacks [11]. Group shilling attacks have also been proposed to generate a group of attack profiles on the basis of the abovementioned individual shilling attacks [12]. Research on group shilling attacks has shown that group attacks affect recommender systems far more than traditional individual attacks [13,14] because attack users in the same shilling group collude with each other to attack targets, while each attack profile looks more like a genuine profile [15]. People have become increasingly conscious of the importance of shilling attack governance in recommender systems. Many service platforms, such as Amazon, Tripadvisor, and Taobao, are constantly seeking efficient mechanisms to enhance user experience and satisfaction (https://www.bbc.com/news/business-61154748). Therefore, accurate detection of group shilling attacks has emerged as a crucial problem for recommender system security.
In recent years, various detection approaches have been put forward to defend recommender systems from group shilling attacks [16][17][18][19][20][21]. Most of these methods detect shilling groups based on frequent synchronization behaviors on more than one item or through the analysis of the differences in the rating patterns of genuine and attack users. However, these approaches do not perform well if the group attack profiles are generated and injected based on AOP [9], adversarial attacks, or mixed attacks because the attackers in the same shilling group do not necessarily attack identical target items. The injected attack profiles are also diverse and look more like genuine ones under these attacks.
To overcome the abovementioned limitations, we present herein a two-stage method, called KC-GCN. It is a semi-supervised group shilling attack detection model based on k-cliques and the graph convolutional network (GCN) [22,23]. First, a user relationship graph is generated, and influential users are extracted using the k-clique algorithm and the user nearest-neighbor similarity on the graph. We construct the user relationship graph by calculating the edge weight between any two users through the analysis of their similarity over suspicious time intervals on each item. Second, we obtain the user initial embeddings and train a GCN-based classifier.
The significant contributions of this work are as follows:
(1) We construct a weighted user relationship graph, in which the weight is calculated from the perspectives of user preference, attack intention, and time synchronization, to highlight the relationship between attack users
(2) We use the multilayer graph convolutional network to fuse the initial embedded features extracted from the user rating behavior with the structural features hidden in the user relationship graph to extract more effective detection features
(3) The experiments on the Netflix and Amazon datasets demonstrate that KC-GCN outperforms baseline methods in terms of detecting various types of group shilling attacks
The rest of this paper is structured as follows: Section 2 presents the background information and related work, Section 3 provides a detailed description of the proposed detection methodology, which is divided into two parts (i.e., extracting influential users and identifying attack users using the trained semi-supervised classifier), Section 4 provides a comparative analysis of the experimental findings, and Section 5 presents the conclusions.

Background and Related Work
2.1. Group Shilling Attacks. To escape from the existing methods of detecting individual shilling attacks (e.g., random attack, average attack, and AOP attack), Wang et al. [24] proposed two generative models of group shilling attack, called GSAGen$_s$ and GSAGen$_l$. In these attack models, fake profiles are first generated based on one type of individual shilling attack. Group shilling attack profiles are then constructed from these fake profiles and injected into a set of genuine profiles. The GSAGen$_s$ model has more stringent conditions when generating group shilling profiles; hence, the group size under GSAGen$_s$ is smaller than that under GSAGen$_l$. Considering the attack effect on the target items, we only use GSAGen$_l$ to generate the group shilling attack profiles, in which the fake profile includes the selected item set, the filler item set, the target item set, and the unrated item set. More details of the group shilling attacks used in this paper are described as follows:
(1) GSAGen$_l$ Ran: generate loose group attack profiles based on a random attack, where the selected items are null, the filler items are randomly chosen, and only one attacker from the whole group rates the items. The filler item rating is the system mean. The target item rating is set to $r_{max}$ or $r_{min}$
(2) GSAGen$_l$ Ave: generate loose group attack profiles on the basis of an average attack, where the selected items are null, the filler items are randomly chosen, and only one attacker from the whole group rates the items. The filler item rating is the item mean. The target item rating is set to $r_{max}$ or $r_{min}$
(3) GSAGen$_l$ AOP: generate loose group attack profiles based on the 50% AOP attack, where the selected items are null, the filler items are randomly chosen, and only one attacker from the whole group rates those items with top 50% popularity. The filler item rating is the item mean. The target item rating is set to $r_{max}$ or $r_{min}$
(4) GSAGen$_l$ GOAT: generate loose group attack profiles based on the adversarial attack, called GOAT [11], where each fake user's selected and filler items are randomly chosen from an item-item graph based on genuine user profiles. A generative adversarial network is used to generate the ratings of the selected and filler items based on the genuine rating distribution. The target items have ratings of $r_{max} - 1$ or $r_{min} + 1$
(5) GSAGen$_l$ Mixed: generate a mixture of multiple shilling groups generated according to the four abovementioned group shilling attacks

[...] proposed a hybrid shilling attack detection model based on the convolutional and recurrent neural networks, which first converted the rating matrix into a three-dimensional array of users, products, and days; extracted the user feature vector by using the convolutional neural network (CNN) model; and finally used the RNN model to divide users into two categories: genuine users and attackers. This model did not rely on specific types of attacks and considered the user characteristics in the time dimension. However, the experimental results on the Netflix and MovieLens datasets showed that the detection performance was extremely unstable as the filler size changed. Zhou and Duan [32] proposed a coforest algorithm-based semi-supervised recommendation attack detection method that requires setting a reasonable value for each hyperparameter. Zhang et al. [33] proposed GraphRfi, which trains the GCN to obtain the prediction error and introduces neural random forests to detect fraudulent users. Similar to the method in [32], it also requires multiple hyperparameters, and the detection result is easily affected by them.
2.2.2. Group Attack Detection Methods. Zhou et al. [16] proposed the DeR-TIA to identify group attack profiles. In the first stage, they calculated the user profile attributes using improved RDMA and DegSim. In the second stage, they filtered out attack profiles by using the target item analysis. This method works well for identifying high-correlation attack profiles but fails to detect attack groups with strong diversity. Zhou et al. [17] proposed a detection method, called SVM-TIA, based on the support vector machine and target item analysis. This method can improve the detection precision by using target items but does not have a high recall. Zhang and Wang [18] proposed the GD-BKM method to detect group shilling attacks. They generated candidate groups based on the rating tracks for each item and calculated the candidate group suspiciousness using the user activity and group item attention degree. They then spotted attack groups by using the bisecting k-means algorithm. This method exhibits an excellent detection performance, regardless of the number of target items. However, it becomes less effective when the size of the shilling group is small. Zhang et al. [19] proposed the GAGE method based on graph embedding. First, they extracted user embeddings using the Node2vec method. Next, they obtained candidate groups by using the k-means++ algorithm and calculated the group suspicious degrees. Ultimately, they identified attack groups using Ward's hierarchical-clustering algorithm. Their method uses Node2vec sampling with a certain randomness, thereby leading to deviations in the candidate group division and unstable detection results. Meanwhile, Yu et al. [20] proposed the GAD-MDST method based on maximum dense subtensor mining. This method can automatically generate multiscale user features by fusing a CNN and a feature pyramid network but is not suitable for detecting smaller-sized shilling groups. In our previous work [21], we proposed the TP-GBF method by using strongly correlated behaviors among group members and group behavior characteristics, which combined indirect behaviors with the direct collusion behaviors to highlight the collusive relationship between attackers in the same shilling group. TP-GBF performed well on the Netflix dataset but was less effective on the real dataset because it failed to detect smaller-sized attack groups.
For easy comparison of the above works, we summarize them in Table 1.

GCN-Based Group Shilling Attack Detection Model
Figure 1 depicts the two stages of the KC-GCN detection framework: influential user extraction and attack user identification. In the first stage, we build the user relationship graph by determining the user similarity based on the item suspicious time window. Next, we use the k-clique community discovery algorithm to generate suspicious candidate groups. Finally, we obtain the influential users by calculating the user nearest-neighbor similarity. In the second stage, we extract the users' initial embeddings from four dimensions. We then combine the extracted user initial embeddings with the structural features hidden in the user relationship graph to train a semi-supervised classifier based on a two-layer GCN, which only utilizes the labels of the identified influential users. The notations used in this paper are described in Table 2.

Constructing a Weighted User Relationship Graph.
Attackers in a shilling group typically cooperate to quickly promote or demote the recommendation of one or more target items. Because of this characteristic of group attacks, the rating distribution of a target item may fluctuate during the attacked time period. Therefore, we construct a weighted user relationship graph by extracting the suspicious time windows of the suspicious items and calculating the correlation between users within the suspicious time windows.

Wireless Communications and Mobile Computing
Definition 1 (item window abnormal degree, IWAD). For $\forall p \in P$ and $\forall w \in W$, the abnormal degree of item $p$ on window $w$ refers to the ratio of the number of users who rated item $p$ with high ratings to the total number of users who rated it on time window $w$, which is denoted as $\mathrm{IWAD}_{p,w}$ and calculated by

$$\mathrm{IWAD}_{p,w} = \frac{\sum_{u \in U} \Gamma(u,p,w)}{NR_{p,w}},$$

where $NR_{p,w}$ represents the number of ratings of item $p$ on the time window $w$. The time window is regarded as suspicious if $\mathrm{IWAD}_{p,w} > 0.5$. $\Gamma(u,p,w)$ is an indicator function that marks whether user $u$ gave item $p$ a high rating within time window $w$.

Definition 2 (user rating synchronization, URS). For $\forall u_i, u_j \in U$, their rating synchronization refers to how close their rating behavior is within the suspicious time window, which is denoted as $\mathrm{URS}(u_i, u_j)$ and calculated over $N(u_i, u_j)$, the set of items corated by users $u_i$ and $u_j$ whose rating times fall within the suspicious time window of item $p$, that is, $\mathrm{IWAD}_{p,w} > 0.5$.

Definition 3 (user short preference similarity, USPS). For $\forall u_i, u_j \in U$, their short preference similarity is defined as the ratio of $|N(u_i, u_j)|$ to $|N(u_i)| \cup |N(u_j)| - |N(u_i, u_j)|$, which is denoted as $\mathrm{USPS}(u_i, u_j)$ and calculated by

$$\mathrm{USPS}(u_i, u_j) = \frac{|N(u_i, u_j)|}{|N(u_i)| \cup |N(u_j)| - |N(u_i, u_j)|},$$

where $N(u_i)$ and $N(u_j)$ represent the rating item sets of users $u_i$ and $u_j$, respectively. $|N(u_i, u_j)|$ represents the number of items for which user $u_i$ and user $u_j$ have the same preference within the suspicious time window, and $|N(u_i)| \cup |N(u_j)|$ represents the total number of items rated by user $u_i$ and user $u_j$.
Definition 4 (user similarity, US). For $\forall u_i, u_j \in U$, their user similarity refers to the closeness of their rating times and the similarity of their preferences on suspicious items, which is denoted as $\mathrm{US}(u_i, u_j)$.

Based on the above definitions, a weighted user relationship graph can be constructed. The weighted user relationship graph construction algorithm is described as follows.
Algorithm 1 is divided into two parts. The first part (lines 1-6) calculates the suspicious time window for each item, with a time complexity of $O(n \cdot v)$. The second part (lines 7-21) calculates the relevance degree US of each pair of users and constructs a user relationship graph, with a time complexity of $O(m^2)$. In conclusion, Algorithm 1 has a time complexity of about $O(m^2)$.
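The first part of Algorithm 1 can be illustrated with a minimal Python sketch. It assumes fixed, non-overlapping windows of width `window_size` and treats a rating of at least `high_rating` as "high"; both parameters, the function name, and the tuple layout of the input are hypothetical choices for illustration and are not taken from the paper.

```python
from collections import defaultdict


def suspicious_windows(ratings, window_size, high_rating=4, threshold=0.5):
    """Return {(item, window_index): IWAD} for windows whose IWAD exceeds threshold.

    ratings: iterable of (user, item, rating, timestamp) tuples.
    window_size and high_rating are illustrative assumptions, not paper values.
    """
    total = defaultdict(int)  # NR_{p,w}: number of ratings of item p in window w
    high = defaultdict(int)   # number of high ratings of item p in window w
    for _user, item, rating, timestamp in ratings:
        key = (item, timestamp // window_size)
        total[key] += 1
        if rating >= high_rating:
            high[key] += 1
    # IWAD is the share of high ratings among all ratings in the window.
    iwad = {key: high[key] / total[key] for key in total}
    return {key: value for key, value in iwad.items() if value > threshold}
```

A window that holds exactly half high ratings is not flagged, because the condition in Definition 1 is a strict inequality.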
3.1.2. Extracting Influential Users. Li et al. [34] proposed a maximization problem that aims to select seed nodes from numerous nodes, thereby maximizing the influence of information transmitted over a large-scale network [35]. Inspired by the seed node idea, we only use the labels of influential nodes, which reduces the cost of labeling a large number of samples.
We present herein a two-stage method for extracting influential users. First, we generate suspicious candidate groups on the weighted user relationship graph using the k-clique algorithm [21]. Next, we extract influential users by calculating the user nearest-neighbor similarity.
The main steps of generating candidate groups based on the k-clique algorithm are as follows: (1) Traverse each node in the user relationship graph to find a complete subgraph

Definition 5 (user nearest neighbor similarity, UNNS). For $\forall u_i \in U$, the user's nearest neighbor similarity refers to the average similarity between the user and its neighbors, which is denoted as $\mathrm{UNNS}(u_i)$ and calculated by

$$\mathrm{UNNS}(u_i) = \frac{\sum_{u_j \in \mathrm{Neighbor}(u_i)} \mathrm{UR}(u_i, u_j)}{|\mathrm{Neighbor}(u_i)|},$$

where $\mathrm{UR}(u_i, u_j)$ represents the similarity of users $u_i$ and $u_j$.

Table 2 (partially recovered): the number of elements in a set; $X$, users' initial embedding matrix.
Definition 6 (influential user, IU). Influential users refer to those users whose nearest neighbor similarity is larger than that of all their first-order neighbors.
The algorithm for extracting influential users based on the k-clique algorithm and user nearest neighbor similarity is described as follows.
Algorithm 2 is divided into four parts. The first part (lines 1-12) identifies tight communities in the graph and generates a community relationship matrix with a time complexity of $O(m \cdot l) + O(l^2)$. The second part (lines 13-19) merges communities to generate a community adjacency matrix with a time complexity of $O(l^2) + O(l^2)$. The third part (lines 20-24) generates candidate suspicious groups according to the community adjacency matrix, with a time complexity of $O(1)$. The last part (lines 25-29) extracts an influential user set based on the user nearest neighbor similarity, with a time complexity of $O(l \cdot m)$. In conclusion, Algorithm 2 has a time complexity of about $O(m \cdot l)$.
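The last part of Algorithm 2 follows directly from Definitions 5 and 6 and can be sketched as below. The sketch assumes the graph (or one candidate group of it) is stored as a dictionary of neighbor-to-weight dictionaries; this representation, the function names, and the choice to give isolated users a similarity of 0 are illustrative assumptions rather than the paper's implementation.

```python
def nearest_neighbor_similarity(graph):
    """UNNS: average edge weight between each user and its first-order neighbors.

    graph: {user: {neighbor: similarity}} for an undirected weighted graph.
    Isolated users get 0.0, which is an assumption of this sketch.
    """
    return {
        user: (sum(nbrs.values()) / len(nbrs) if nbrs else 0.0)
        for user, nbrs in graph.items()
    }


def influential_users(graph):
    """Users whose UNNS is strictly larger than that of every first-order neighbor."""
    unns = nearest_neighbor_similarity(graph)
    return {
        user
        for user, nbrs in graph.items()
        if nbrs and all(unns[user] > unns[v] for v in nbrs)
    }
```

Because the comparison is strict, two adjacent users with identical similarity scores are both excluded; how ties are handled in the original algorithm is not stated in the text.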

Detecting Attack Users
3.2.1. Generating User Initial Embeddings. Some node-embedding methods (e.g., matrix factorization and autoencoders) are automatic, but their embeddings are usually generated using a randomization strategy and cannot represent the initial node embeddings well. Therefore, we generate the user embeddings herein from four perspectives.

Definition 7 (user lifetime proportion, ULP). For $\forall u_i \in U$, the user lifetime proportion refers to the ratio of the lifetime of user $u_i$ in the system to the lifetime of the entire system, which is denoted as $\mathrm{ULP}(u_i)$ and calculated by

$$\mathrm{ULP}(u_i) = \frac{\mathrm{URT}(u_i, \max) - \mathrm{URT}(u_i, \min)}{SL},$$

where $\mathrm{URT}(u_i, \max)$ and $\mathrm{URT}(u_i, \min)$ represent the latest and the earliest rating time of user $u_i$, respectively. $SL$ represents the lifetime of the entire system.
Definition 8 (user nearest neighbor rating synchronization, UNNRS). For $\forall u_i \in U$, the user's neighbor rating synchronization refers to the average rating synchronization between the user and its neighbors, which is denoted as $\mathrm{UNNRS}(u_i)$ and calculated by

$$\mathrm{UNNRS}(u_i) = \frac{\sum_{u_j \in \mathrm{Neighbor}(u_i)} \mathrm{URS}(u_i, u_j)}{|\mathrm{Neighbor}(u_i)|},$$

where $\mathrm{URS}(u_i, u_j)$ means the synchronization degree of user $u_i$ and user $u_j$ and $|\mathrm{Neighbor}(u_i)|$ represents the number of first-order neighbors of user $u_i$.

Algorithm 1: Constructing the weighted user relationship graph.
Input: the rating matrix $R$, the rating time matrix $T$, the size of sliding time window $W_s$, the time window anomaly threshold $\delta$, and the relationship strength threshold $\sigma$
Output: a weighted user relationship graph $G$
1. $E \leftarrow \emptyset$; $C \leftarrow 0_{m \times n}$
2. for each item $p \in P$ do
3.   for each time window $w \in W$ do
4.     …
5.   end for
6. end for
7. for each user $u_i \in U$ do
8.   for each user $u_j \in U$ do
9.     …
Definition 9 (user nearest neighbor preference similarity, UNNPS). For $\forall u_i \in U$, the user's nearest neighbor preference similarity refers to the average preference similarity between the user and its direct neighbors, which is denoted as $\mathrm{UNNPS}(u_i)$ and calculated by

$$\mathrm{UNNPS}(u_i) = \frac{\sum_{u_j \in \mathrm{Neighbor}(u_i)} \mathrm{UPS}(u_i, u_j)}{|\mathrm{Neighbor}(u_i)|},$$

where $\mathrm{UPS}(u_i, u_j)$ represents the preference similarity of user $u_i$ and user $u_j$. Based on the above definitions, we extract the initial embedding $X_u = (\mathrm{ULP}_u, \mathrm{UNNRS}_u, \mathrm{UNNPS}_u, \mathrm{UNNS}_u)$ of user $u$.
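As a rough illustration of how the four-dimensional initial embedding could be assembled, the sketch below computes ULP from a user's rating timestamps and the three neighbor-averaged indicators from pairwise score functions. The function names, the callable-based interface, and the fallback of 0 for users without neighbors are assumptions made for this example only.

```python
def user_lifetime_proportion(rating_times, system_lifetime):
    """ULP: span between the latest and earliest rating time over the system lifetime."""
    return (max(rating_times) - min(rating_times)) / system_lifetime


def neighbor_average(user, neighbors, pair_score):
    """Average of pair_score(user, v) over the user's first-order neighbors."""
    if not neighbors:
        return 0.0  # assumption of this sketch: isolated users score 0
    return sum(pair_score(user, v) for v in neighbors) / len(neighbors)


def initial_embedding(user, rating_times, system_lifetime, neighbors, urs, ups, us):
    """Return X_u = (ULP, UNNRS, UNNPS, UNNS) for one user.

    urs, ups, us: callables (u, v) -> float giving the pairwise rating
    synchronization, preference similarity, and user similarity.
    """
    return (
        user_lifetime_proportion(rating_times, system_lifetime),
        neighbor_average(user, neighbors, urs),
        neighbor_average(user, neighbors, ups),
        neighbor_average(user, neighbors, us),
    )
```

Stacking these tuples for all users gives the rows of the initial embedding matrix $X$ that is fed to the GCN.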

Identifying Attack Users Based on the GCN.
In previous graph embedding-based group shilling attack detection methods, researchers focused on how to obtain high-quality user node embeddings in the graph. Zhang et al. [19] obtained a low-dimensional embedding vector of nodes in the graph by adopting the Node2vec method, which focuses on obtaining the structural characteristics of the user's topological neighborhood but ignores the characteristic information of the nodes themselves. The existing group attack detection methods also use hard classification, in which all members of the same group are classified together as genuine users or attackers, resulting in the misclassification of some users [18][19][20][21]. To this end, we obtain high-quality user embedding features from both implicit and explicit perspectives by combining user initial embeddings with their higher-order topological neighborhood structures based on the GCN and employing influential node labels to identify attack users.
We first extract the high-quality embeddings of user nodes based on the user initial embedding matrix $X$ and the weighted matrix $C$. The GCN propagation process is formulated as follows:

$$H^{(l+1)} = \sigma\left(\tilde{D}^{-1/2} \tilde{C} \tilde{D}^{-1/2} H^{(l)} W^{(l)}\right),$$

where $H^{(0)} = X$ represents the user initial embedding matrix. $N$ is the total number of user nodes in the graph, and the feature vector of each user is represented as $X_u = (\mathrm{ULP}_u, \mathrm{UNNRS}_u, \mathrm{UNNPS}_u, \mathrm{UNNS}_u)$. $\tilde{C} = C + I_N$ is the adjacency matrix with added self-connections. $\tilde{D}$ is the degree matrix, and $\tilde{D}_{ii} = \sum_j \tilde{C}_{ij}$. $W^{(l)} \in \mathbb{R}^{P \times H}$ denotes the parameter matrix to be trained, $P$ represents the length of the feature matrix, and $H$ represents the number of hidden units. $\sigma$ is the corresponding activation function, such as $\mathrm{ReLU}(\cdot) = \max(0, \cdot)$.
High-quality user embeddings can be obtained after multiple convolutional layers. We utilize two convolutional layers and ReLU as the activation function.
We then calculate the cross-entropy between the real label one-hot vector $Y$ of all influential user nodes and the label vector $T$ predicted by softmax. Subsequently, we utilize the gradient descent method to train the parameter matrices $W^{(0)}$ and $W^{(1)}$. The formula for calculating the loss function is as follows:

$$\mathcal{L} = -\sum_{l \in \Upsilon_L} \sum_{f} Y_{lf} \ln T_{lf},$$

where $\Upsilon_L$ is the set of influential user nodes with labels.
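The semi-supervised character of the loss, where only labeled influential nodes contribute, can be sketched in a few lines of Python. The list-of-lists layout and the function name are assumptions for illustration; the sketch sums over labeled nodes without averaging, which is one common convention and may differ from the authors' implementation.

```python
import math


def masked_cross_entropy(probs, one_hot_labels, labeled_nodes):
    """Cross-entropy summed over the labeled (influential) nodes only.

    probs: per-node class probabilities predicted by softmax.
    one_hot_labels: per-node one-hot label vectors; rows of unlabeled nodes are ignored.
    labeled_nodes: indices of the influential users whose labels are known.
    """
    loss = 0.0
    for node in labeled_nodes:
        for p, y in zip(probs[node], one_hot_labels[node]):
            if y:  # only the true class contributes for a one-hot label
                loss -= y * math.log(p)
    return loss
```

Nodes outside `labeled_nodes` still influence training indirectly, because their features are aggregated into the predictions of labeled neighbors during graph convolution.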
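Finally, the resulting model is expressed as

$$Z = \mathrm{softmax}\left(\hat{C}\, \mathrm{ReLU}\left(\hat{C} X W^{(0)}\right) W^{(1)}\right),$$

where $Z$ represents the set of user labels after classification by the softmax function. $\hat{C} = \tilde{D}^{-1/2} \tilde{C} \tilde{D}^{-1/2}$ represents the weighted matrix $C$ after symmetric normalization.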
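The forward pass of a two-layer GCN on a weighted graph can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions: matrices are lists of lists, edge weights are non-negative, the weight matrices `w0` and `w1` are supplied by the caller, and training is omitted. All function names are hypothetical.

```python
import math


def matmul(a, b):
    """Dense matrix product of two lists of lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def normalize_adjacency(c):
    """Add self-connections, then apply symmetric degree normalization."""
    n = len(c)
    c_tilde = [[c[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    d_inv_sqrt = [1.0 / math.sqrt(sum(row)) for row in c_tilde]
    return [[d_inv_sqrt[i] * c_tilde[i][j] * d_inv_sqrt[j] for j in range(n)]
            for i in range(n)]


def relu(m):
    return [[max(0.0, v) for v in row] for row in m]


def softmax_rows(m):
    out = []
    for row in m:
        peak = max(row)  # subtract the row maximum for numerical stability
        exps = [math.exp(v - peak) for v in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out


def gcn_forward(c, x, w0, w1):
    """Two-layer GCN: softmax(C_hat * ReLU(C_hat * X * W0) * W1)."""
    c_hat = normalize_adjacency(c)
    hidden = relu(matmul(c_hat, matmul(x, w0)))
    return softmax_rows(matmul(c_hat, matmul(hidden, w1)))
```

Each row of the returned matrix is a probability distribution over the two classes (genuine user, attacker), so a user can be assigned the class with the larger probability.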
The algorithm for detecting attackers is described as follows.
Algorithm 3 is divided into two parts. The first part (lines 1-8) trains the GCN-based semi-supervised classification model to obtain the classification result $Z$ of all user nodes. The second part (lines 9-14) filters out the attack users according to the classification result $Z$.

Experimental Evaluation
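The detection performance is measured by precision, recall, and F1-measure:

$$\mathrm{Precision} = \frac{TP}{TP + FP}, \quad \mathrm{Recall} = \frac{TP}{TP + FN}, \quad \mathrm{F1\text{-}measure} = \frac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}},$$

where TP represents the number of attackers accurately recognized, FN represents the number of attackers mistaken for genuine users, and FP represents the number of genuine users mistaken for attackers.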
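For concreteness, the three metrics can be computed from the sets of true and predicted attackers as in the sketch below. The function name and the convention of returning 0 when a denominator is empty are assumptions of this example.

```python
def detection_metrics(true_attackers, predicted_attackers):
    """Return (precision, recall, f1) for sets of true and predicted attacker ids."""
    true_attackers = set(true_attackers)
    predicted_attackers = set(predicted_attackers)
    tp = len(true_attackers & predicted_attackers)  # attackers correctly recognized
    fp = len(predicted_attackers - true_attackers)  # genuine users flagged as attackers
    fn = len(true_attackers - predicted_attackers)  # attackers missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if precision + recall else 0.0
    return precision, recall, f1
```

For example, with four true attackers and three predictions of which two are correct, precision is 2/3, recall is 1/2, and the F1-measure is 4/7.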

Parameter Selection.
Figure 2 shows how the F1-measure of KC-GCN is influenced by parameters θ and k on the Netflix and Amazon datasets. In Figure 2(a), the F1-measure of KC-GCN for detecting the GSAGen$_l$ Ran attack on the Netflix dataset is the highest when θ is set to 0.01. Under a smaller θ value, the user relationship graph contains a large number of weak relationship edges, and the community structure is not obvious, leading to a decrease in the detection precision. At a larger θ, the user relationship graph shows an obvious community structure, but some user nodes are filtered from the graph, thereby degrading the detection recall. Moreover, k = 4 yields better detection performance than k = 3; therefore, for the Netflix dataset, we set k to 4 and θ to 0.01. Figure 2(b) shows that when θ = 0.052 and k = 3, the F1-measure of KC-GCN is close to 0.8776 on the sampled Amazon dataset, which is the best. Therefore, we set k to 3 and θ to 0.052 for the sampled Amazon dataset.

Experimental Results and Analysis.
To verify the effectiveness of KC-GCN, we compare the precision, recall, and F1-measure of KC-GCN with the following methods.
(1) Catch the Black Sheep (CBS) [27]: this detection method uses label propagation to iteratively calculate the malicious probability of users and items, which needs the number of spammers and a certain number of seed users in advance. In the experiments, the number of seed users on the two datasets is consistent with that of our method
(2) GAGE [19]: this is an unsupervised group shilling attack detection method based on graph embedding, which learns the low-dimensional vector representation of nodes in the user relationship graph using Node2vec and obtains attack groups through clustering. In the experiments, the walking strategy is adjusted by setting parameters p = 7 and q = 0.2 and group size (GS) = 30
(3) TP-GBF [21]: this is an unsupervised group shilling attack detection method based on the strong association between the group members and the group behavior features, which uses a topological potential-based community partition algorithm to generate tight subgraphs as candidate groups and clusters attack groups by group behavior features. In the experiments, parameter θ is set to 2, while parameter σ is set to 1 and 0.47 on the Netflix and Amazon datasets, respectively

4.4.1. Comparison of the Detection Results on the Netflix Dataset. Table 3 compares KC-GCN and three baseline methods in identifying the group shilling attacks with tightly coupled shilling groups at various attack sizes on the Netflix dataset. In Table 3, the precision and recall values of CBS remain stable for detecting three types of group shilling attacks, changing only slightly as the attack size increases from 2.5% to 10%. The CBS detection performance is much lower than that of KC-GCN when detecting various types of group shilling attacks with tightly coupled shilling groups, even though the number of attackers is assumed in advance. Meanwhile, the precision values of GAGE under three types of group attacks are the worst, indicating the misclassification of a large number of genuine users as attack ones. This happens because GAGE generates the user node feature vectors using Node2vec, whose randomness may cause some genuine and attack users to be divided into the same candidate group. The detection performance of TP-GBF is better than those of CBS and GAGE when detecting the group shilling attacks with tightly coupled shilling groups. However, its recall is not high under the GSAGen$_l$ AOP attack. Compared with CBS, GAGE, and TP-GBF, KC-GCN shows the best detection performance because it can extract more effective features for differentiating attack profiles from genuine ones. KC-GCN uses a weighted graph to aggregate the neighbor features, thereby effectively avoiding the merging of user features with different labels. It can fully integrate the user node and structural features, further increasing the difference between attackers and normal users. In conclusion, KC-GCN outperforms the baselines for detecting various types of group shilling attacks with tightly coupled shilling groups at various attack sizes on the Netflix dataset.
Table 4 compares the performances of our proposed KC-GCN and three baseline methods in terms of detecting group shilling attacks with loosely coupled shilling groups at various attack sizes on the Netflix dataset. In Table 4, the precision and recall values of CBS under the three attack models significantly decrease when the relationship between users within the attack group is weakened. This indicates that improving the detection performance is difficult when relying only on the rating bias. The GAGE performance improves as the attack size increases, but its precision fluctuates greatly because it may falsely identify some normal users as attackers. TP-GBF shows an excellent detection performance under the GSAGen$_l$ Ran and GSAGen$_l$ Ave attacks but is less effective under the GSAGen$_l$ AOP attack. Its detection performance becomes extremely unstable as the attack size changes. KC-GCN yields the best detection performance among the four methods. It shows a slight decline in detecting loosely coupled shilling groups mainly because the feature differences between the attackers and the genuine users are weakened with a looser relationship in a group. In conclusion, KC-GCN outperforms the baseline methods in detecting various types of group shilling attacks with loosely coupled shilling groups at various attack sizes on the Netflix dataset.
Figure 3 compares the detection results of the four detection methods under the GSAGen$_l$ GOAT attack on the Netflix dataset. On the Netflix dataset, the precision, recall, and F1-measure of CBS when identifying tightly and loosely coupled shilling groups are 0.6791, 0.8184, and 0.7422 and 0.6352, 0.7656, and 0.6943, respectively. The detection performance of CBS is constrained by the number and influence of seed users. These results also indicate that CBS can achieve superior detection performance when a closer relationship exists between the group members. The precision, recall, and F1-measure of GAGE for detecting the tightly and loosely coupled shilling groups are 0.4046, 0.6968, and 0.5119 and 0.3157, 0.9120, and 0.4690, respectively. These results indicate that GAGE is less effective on the Netflix dataset under the GSAGen$_l$ GOAT attack because the GOAT attack model uses the genuine user profile as a template to generate the attack profile, which is highly similar to the genuine user profile. Consequently, the user node feature vector obtained by the Node2vec method cannot effectively distinguish genuine users and attackers. For TP-GBF, the precision, recall, and F1-measure for the tightly coupled shilling groups are 0.9905, 0.7269, and 0.8385, respectively, while those for the loosely coupled shilling groups are 0.7036, 0.5000, and 0.5846, respectively. TP-GBF shows an extremely high precision in identifying the tightly coupled shilling groups; nevertheless, the recall of TP-GBF is poor when identifying the loosely coupled shilling groups because it cannot distinguish weakly related attack groups. Figure 3 shows that GAGE and TP-GBF have poor performances when detecting attack groups generated based on GOAT because the profiles generated by GOAT are very similar to the genuine profiles. The precision, recall, and F1-measure of KC-GCN for detecting the tightly coupled shilling groups are 1, 0.9857, and 0.9928, respectively, while those for the loosely coupled shilling groups are 1, 0.9282, and 0.9628, respectively. These results show that KC-GCN is effective and outperforms the three baseline methods for detecting groups under the GSAGen$_l$ GOAT attack on the Netflix dataset. In other words, the feature differences between the attackers and the genuine users can be reinforced by using the weighted GCN to aggregate the user node features.
Figure 4 compares the detection results of the four detection methods under the GSAGen$_l$ Mixed attack on the Netflix dataset. On this dataset, the precision, recall, and F1-measure of CBS for identifying the tightly and loosely coupled shilling groups are 0.8190, 0.9992, and 0.9002 and 0.8191, 0.9996, and 0.9004, respectively. CBS remains stable when detecting the tightly and loosely coupled group shilling attacks. GAGE shows precision, recall, and F1-measure of 0.9542, 0.9275, and 0.9407, respectively, for the tightly coupled shilling groups. For the loosely coupled shilling groups, the precision, recall, and F1-measure of GAGE are 0.8212, 0.9376, and 0.8759, respectively. Its detection performance significantly declines with the weakening user relationships. The main reason for this is that with the weakening user relationship in the group, its spatial structure changes, resulting in obvious changes in the initial user embedding and a significant decrease in the detection performance. The precision, recall, and F1-measure of TP-GBF for detecting tightly coupled shilling groups are 0.7209, 0.8204, and 0.7674, respectively, while those for loosely coupled shilling groups are 0.6620, 0.6141, and 0.6372, respectively. TP-GBF is less effective on the Netflix dataset under the GSAGen$_l$ Mixed attack. The precision, recall, and F1-measure of KC-GCN for identifying the tightly and loosely coupled shilling groups are 0.9978, 0.9430, and 0.9696 and 0.9583, 0.9705, and 0.9644, respectively. These findings demonstrate that KC-GCN is effective and outperforms the three baseline methods in terms of precision and F1-measure under the GSAGen$_l$ Mixed attack on the Netflix dataset.
Figure 5 shows the results of the four detection methods on the sampled Amazon dataset. The detection performance of KC-GCN is superior to that of the baseline methods on this dataset in terms of precision, recall, and F1-measure. This indicates that KC-GCN can effectively combine user node and graph structure features to construct new user features by using GCN, which can distinguish genuine and attack users on the sampled Amazon dataset. The precision, recall, and F1-measure of CBS are 0.6836, 0.8323, and 0.7507, respectively. This means that CBS can detect attack users on the Amazon dataset but that its detection performance is determined by the number of seed users. Meanwhile, GAGE exhibits precision, recall, and F1-measure of 0.8004, 0.9277, and 0.8594, respectively. The result indicates that GAGE has a certain randomness when sampling with Node2vec, which leads to a bias in the division of the candidate groups, and its precision is lower than that of KC-GCN. The precision, recall, and F1-measure of TP-GBF are 0.9283, 0.6467, and 0.7623, respectively. This precision is not much higher than that of KC-GCN, but its recall is lower than that of KC-GCN, indicating that TP-GBF may have filtered out some attack groups with a low density. In summary, KC-GCN shows a superior detection performance over GAGE, CBS, and TP-GBF on the sampled Amazon dataset.

Conclusions and Future Work
In this work, we put forward a two-stage semi-supervised model to effectively detect various types of group shilling attacks on recommender systems. First, we construct a user relationship graph and spot the influential users. In the graph, the edge weight is calculated by analyzing the user similarity over suspicious time intervals on each item. Next, we generate the initial user embeddings based on four proposed indicators that describe the behavioral differences between attack and genuine users. A GCN-based classifier is then trained, and the attack users are detected based on the labels of the influential users. The experimental results demonstrate the effectiveness and generality of KC-GCN.
In future work, we will automatically determine the labels of the most influential users by further analyzing the structural properties of the weighted user relationship graph. We will also study multi-aspect data [37] to further help identify the users involved in group shilling attacks.

(1) ... and add the users in G_i into the tightness community set TCS
(2) Convert the TCS into an overlapping community matrix O, where the diagonal elements of O represent the number of users in each community, and the off-diagonal elements represent the number of users shared by adjacent communities
(3) Merge the small communities in the overlapping community matrix O to obtain the community adjacency matrix A. In the matrix O, diagonal elements with a value less than k and off-diagonal elements with a value less than k − 1 are set to 0, while the remaining elements are set to 1
(4) Generate the suspicious candidate groups based on the community adjacency matrix A
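Steps (2) to (4) can be illustrated with a minimal Python sketch. The function names, the toy communities, and the use of NumPy are illustrative. The source does not spell out how step (4) derives groups from A; this sketch assumes that communities linked in A are merged as connected components, which may differ from the authors' procedure.

```python
import numpy as np


def overlap_matrix(communities):
    """Step (2): O[i, i] is the size of community i; O[i, j] counts shared users."""
    sets = [set(c) for c in communities]
    n = len(sets)
    O = np.zeros((n, n), dtype=int)
    for i in range(n):
        for j in range(n):
            O[i, j] = len(sets[i] & sets[j])
    return O


def community_adjacency(O, k):
    """Step (3): keep diagonal entries >= k and off-diagonal entries >= k - 1."""
    on_diagonal = np.eye(O.shape[0], dtype=bool)
    return np.where(on_diagonal, O >= k, O >= k - 1).astype(int)


def candidate_groups(communities, A):
    """Step (4), ASSUMED: merge communities that are connected in A."""
    n = len(communities)
    seen, groups = set(), []
    for start in range(n):
        if start in seen or A[start, start] == 0:
            continue  # already merged, or too small to keep
        seen.add(start)
        stack, members = [start], set()
        while stack:
            i = stack.pop()
            members |= set(communities[i])
            for j in range(n):
                if j not in seen and A[j, j] == 1 and A[i, j] == 1:
                    seen.add(j)
                    stack.append(j)
        groups.append(members)
    return groups


# Toy tightness community set: two overlapping communities and one small one.
TCS = [[1, 2, 3], [2, 3, 4], [7, 8]]
O = overlap_matrix(TCS)
A = community_adjacency(O, k=3)
groups = candidate_groups(TCS, A)
```

With k = 3 in this toy example, the two three-user communities share two users and are merged into one candidate group, while the two-user community falls below the size threshold and is discarded.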

Figure 2: The influence of parameters θ and k on the F1-measure of KC-GCN.

Figure 3: Comparison of the detection results of the four detection methods on the Netflix dataset under the GSAGen l GOAT attack.

Table 1: Comparison of different shilling attack detection methods.
Method | Advantages | Disadvantages
... | Excellent detection performance regardless of the number of target items | Less effective under smaller-sized groups
Zhang et al. [19] | Automatic feature extraction | Unstable detection results
Yu et al. [20] | Automatic feature extraction | Less effective under a smaller group size
Cai and Zhang [21] | Effective for detecting tightly coupled shilling groups | Less effective under smaller-sized shilling groups on the Amazon dataset

Table 2: Notations and their descriptions.
... from 1 to 5, where 1 and 5 indicate disliked and most liked, respectively. We randomly sample 215,884 ratings, together with their rating times, given by 2000 users to 4000 movies for use in the experimental dataset. Similar to the previous research, the 2000 extracted users are regarded as genuine users. Multiple group attack profiles are generated and injected into the dataset by using the group shilling attack model introduced in Section 2.1.
Under GSAGen l Ave, GSAGen l Ran, and GSAGen l AOP, 10 attack groups are generated each time. The filler size is set to 2%, and the attack size is set to 2.5%, 5%, 7.5%, and 10%. The target items in each attack group are randomly selected from unpopular items. Two target item strategies (denoted as ST1 and ST2) are set to examine the influence of the relationship between the attack users in the same group on the detection performance. ST1 means that all attackers in the same group rate all the target items (number of target items in the experiments: 3). ST2 means that each attacker in the same group rates any three of the five target items. This results in 4 * 2 * 3 * 2 = 48 experimental datasets.
We generate loose group attack profiles to verify the universality of the proposed method using the GSAGen l GOAT and GSAGen l Mixed attack models introduced in Section 2.1 and the two target item strategies. The dataset generated based on the GSAGen l GOAT attack model contains eight attack groups. The dataset generated based on the GSAGen l Mixed attack model contains 26 attack groups. For convenience of description, ...

Input: the weighted user relationship graph G, the influential user set IUS, the initial user embedding matrix X, and the maximum training epoch K

Table 3: Comparison between KC-GCN and other detection methods for detecting group shilling attacks with tightly coupled shilling groups at various attack sizes on the Netflix dataset.

Table 4: Comparison between KC-GCN and other detection methods for detecting group shilling attacks with loosely coupled shilling groups at various attack sizes on the Netflix dataset.