Friend Recommender System for Social Networks Based on Stacking Technique and Evolutionary Algorithm

In recent years, social networks have made significant progress, and the number of people who use them to communicate is increasing day by day. The vast amount of information available on social networks has made friend recommender systems important for discovering knowledge about future connections. Choosing the best machine learning approach for the recommender system problem is challenging, since there are several strategies with various benefits and drawbacks. In light of this, this study proposes a solution based on the stacking approach to provide a friend recommendation system in social networks. In addition, the large amount of available information and the inefficiency of some features reduce system performance. To solve this problem, the proposed method uses a particle swarm optimization (PSO) algorithm to select the most efficient features. To learn the model in the objective function of the particle swarm algorithm, a hybrid system based on stacking is proposed. In this method, two random forests and Extreme Gradient Boosting (XGBoost) are used as the base classifiers. The results obtained from these base classifiers are passed to a logistic regression algorithm, which is applied sequentially. The suggested approach addresses this issue effectively by combining the advantages of the applied strategies. The results of implementing and evaluating the proposed system show that it performs well compared with the other techniques studied.


Introduction
In recent years, with the development of social networks, individuals and organizations can easily interact with each other. People can find connections in the fields that interest them and share their knowledge as well. Most of the connections that individuals are making on social networks exist only in the virtual world and are not often accessible there. A social network is a graph in which each node stands for a person, group, or organization, and each link between nodes depicts a relationship between them. Understanding and describing the processes that create social interactions is one of the fundamental problems in social network analysis. In this regard, the link prediction problem in social networks asks whether two nodes in a network will be connected in the near future [1].
One of the most important features of social networks, which contributes to their advantage over other networks, is the friend recommender system. In recent years, many methods based on machine learning and artificial intelligence have been proposed for friend recommendation [2,3]. The performance of these methods is not, however, clearly categorized in the area of machine learning. This is mostly because there are so many methods and suggested adjustments in the literature. As a result, choosing a machine learning algorithm that fits the needs of the problem is difficult and confusing when developing a recommender system. Thus, considering that each machine learning method has its advantages and limitations, an approach based on the stacking technique is presented that can combine the advantages of machine learning methods and improve the results.
The remainder of this study first reviews earlier techniques. The different parts of the proposed system are then discussed. The outcomes of the system's implementation are then examined using a number of tests, and conclusions are provided at the end.

Related Work
The problem of link prediction, first raised by Getoor and Diehl, is the problem of predicting the presence of a link between two entities. The prediction is based on the properties of other objects and other observed links [4]. It is used in a wide range of real and important areas, especially those involving the detection of complex events from highly structured data [5]. A thorough survey of link prediction in various networks was given by Daud et al. The survey gave thorough descriptions of link prediction algorithms, state-of-the-art techniques, applications, challenges, and future research objectives. In addition, several directions for future research on link prediction in social networks were outlined [6].
In a study by Chen et al. [7], the encoder-LSTM-decoder (E-LSTM-D) system is proposed as a new deep learning model based on stacked long short-term memory (LSTM) in an encoder-decoder architecture. The experiments in that paper show that the E-LSTM-D model performs significantly better than existing dynamic link prediction methods on different datasets. Behera et al. predicted potential future links using a variety of machine learning algorithms, including K-NN, MLP, bagging, SVM, and decision trees, based on attributes retrieved from the topological structure. The performance of their system was evaluated in terms of various criteria [8].
In a study by Chonghuan [9], a new recommendation method for social networks using the matrix factorization technique was proposed to address the sparsity problem of recommender systems. In this method, users are clustered and various complex factors are considered as well. The simulation results showed that the proposed socialized recommendation method based on matrix factorization (SRM-MF) performed better than the other available methods on the tested datasets. For instance, assuming a Hamming distance of 20, the precision on the real dataset and the Book-Crossing dataset is 0.088 and 0.095, respectively, while the best values achieved by the other methods under similar conditions are 0.073 and 0.086, respectively. Pecli et al.'s experiments were performed on three datasets (Microsoft Academic Network, Amazon, and Flickr) that included more than twenty different features, including topological features and domain-specific features. The study combines three feature selection strategies, six different classification algorithms (support vector machines (SVM), k-nearest neighbors (K-NN), naive Bayes, Classification and Regression Tree (CART), random forest, and multilayer perceptron), and three evaluation criteria (precision, F-measure, and area under the curve). Their findings revealed an intriguing relationship between the majority of the chosen features and the dataset. The findings demonstrated that using feature selection techniques to condense the feature set produces better classification models than classifiers built using the whole set of features [10]. In the paper by Manshad et al., a new time series link prediction (TSLP) method based on irregular cellular learning automaton (ICLA) and evolutionary computation (EC) was proposed. ICLA-EC was used to analyze network evolution through neighborhood dynamicity. Based on experiments performed on different datasets, ICLA-EC-TSLP achieved significant results (0.7212-0.8650) on the AUC criterion compared with other methods [11].
In Cai et al.'s research, a new link prediction model based on line graph neural networks is proposed that achieves good performance on the link prediction problem. Experiments on 14 datasets showed that the approach consistently outperformed all baseline techniques in terms of area under the curve (AUC) by identifying more relevant features [12]. In the work of Parveen et al., friend recommendation was performed using different machine learning algorithms, namely Random Forest Classifier, XGBoost, LightGBM, and CatBoost. The performance of these methods was compared in terms of F1-score, accuracy, recall, and the confusion matrix. The results showed that Random Forest and LightGBM are less accurate than the XGBoost and CatBoost algorithms. The accuracy of the XGBoost and CatBoost algorithms was the same, at 95% [13].
Kumar et al. proposed a friend recommendation system that uses a random forest to recommend friends. The dataset used in this study has 94,000 nodes. The accuracy achieved by this model is 89%. It is stated that this accuracy is quite reasonable given the available hardware and data volume [14]. In the work of Murali et al., a recommendation system was presented that offers each user the best research articles in their field. This recommendation method is based on the individual's queries and on similarities with other users found from their queries. The system uses a collaborative filtering approach and helps save users' time [15].
In the research by ZhengWei et al. [16], a solution based on XGBoost is proposed for classifying and recommending journals to researchers. doc2vec is used to obtain better results. To verify the findings, the method was tested on Common SCI English publications in the computing field, where its accuracy was measured at 84.24 percent. In the work of Chang et al. [17], a novel Graph Neural Network for Reciprocal Recommendation (GraphRR) was proposed to exploit multiplex user interactions. To represent each user's preference, attraction, and likeness, three ego graphs are created for each user depending on the directions of interaction. Then, multiplexity-aware GNN modules are applied to measure participation. Extensive tests on large-scale real-world online gaming datasets from NetEase Games demonstrated the system's good performance.
As the reviewed studies make clear, recommender systems are used in different fields and are of special importance. Studies have shown that machine learning techniques are widely used in this field and can be developed further owing to the nature of artificial intelligence methods. A summary of the reviewed related works is given in Table 1.

Proposed System
This paper aims to propose an effective new approach, based on particle swarm optimization and ensemble methods, for friend recommendation in social networks. Details of the proposed system are shown in Figure 1.
The proposed system uses the particle swarm optimization algorithm to select effective features. Each particle in this method represents a set of features and iteratively progresses toward the best answer. At the end of the algorithm, the particle that chooses the optimal features for the problem is selected. The model learned inside the objective function of the particle swarm optimization algorithm is a stacked model of several machine learning algorithms.
There are several ways to learn the model; each usually has strengths in a particular area, and using any one of them requires studying how it works. No algorithm is always the best, and each approach performs differently depending on the data and the situation. This is a key concept in these algorithms and models. In these situations, combining the output of many classifiers is preferable to selecting a specific approach or technique. Since each of them may have different strengths and weaknesses, it is expected that combining them would have a good compensatory effect.
One of the best and most effective combining methods is the stacking technique, in which multiple models are combined. This method is used to increase the accuracy of models and improve results and reliability in a wide range of business and research applications. Stacking is a learning-based method that combines multiple classification or regression models. The classification process based on this model has two stages. In the stacking approach, the output of the first-level classifiers serves as the input for the classifiers at the second level. In other words, the prediction of a classifier at one level is treated as a feature for the classifier at the next level.
In this regard, in the proposed method, XGBoost and two forms of random forest are implemented at the first level. The results of the first-level algorithms are used in logistic regression, which is implemented at the second level of the proposed system. Finally, the obtained results are used as the fitness value in the objective function of the particle swarm optimization algorithm.

Data Preparation.
The link prediction collection was used. This collection can be found at Noesis, n.d. It gathers 22 networks from different sources and fields and includes a wide range of different features and information. A summary of the information in these datasets is given in Table 2. From left to right, the table lists the name of the network, number of nodes, number of edges, average degree, average clustering coefficient, average shortest path length, dimension, heterogeneity, and assortativity [21]. The BUP dataset and a few additional datasets were used to test the proposed system. The BUP dataset represents the network of political blogs. It includes 105 nodes and 441 links, with an average degree of 8.4 (Table 2).
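As a quick consistency check, and assuming the BUP network is treated as undirected, the average degree follows directly from the node and link counts quoted for it:

```latex
\langle k \rangle = \frac{2E}{N} = \frac{2 \times 441}{105} = 8.4
```

Here $N$ is the number of nodes and $E$ the number of links; each link contributes to the degree of two nodes.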

Complexity
Each instance, $x_{i,n}$, of the data is transformed into $x'_{i,n}$ as follows:
$$x'_{i,n} = \frac{x_{i,n} - \mu_i}{\sigma_i}$$
where $\mu_i$ and $\sigma_i$ denote the mean and standard deviation of the $i$th feature, respectively [22].
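The standardization step can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' code: the function name `standardize`, the use of the population standard deviation, and the handling of constant features are assumptions made for the example.

```python
import numpy as np

def standardize(X):
    """Z-score each feature column: subtract its mean, divide by its standard deviation."""
    X = np.asarray(X, dtype=float)
    mu = X.mean(axis=0)       # per-feature mean
    sigma = X.std(axis=0)     # per-feature (population) standard deviation
    sigma[sigma == 0] = 1.0   # assumption: constant features map to zero instead of dividing by zero
    return (X - mu) / sigma

# Rows are instances, columns are features (toy values)
X = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])
X_scaled = standardize(X)
```

After the transformation, every feature has zero mean and unit standard deviation, so features measured on different scales contribute comparably to the classifiers.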

Feature Engineering.
The Node2vec and NetworkX packages are used for feature engineering in this research. Node2vec is one of the solutions introduced for extracting features from a graph. It is a flexible neighborhood sampling strategy that allows us to smoothly interpolate between BFS (Breadth First Search) and DFS (Depth First Search). This is achieved by a flexible biased random walk procedure that can explore neighborhoods in both a BFS and a DFS fashion [20].
A random walk is defined by two parameters, p and q. Assume that the current position of the random walk is node v and that the position at the previous step is node t. To determine the next position, the transition probabilities $\pi_{vx}$ on the edges (v, x) leading from v must be evaluated. The unnormalized transition probability is set to $\pi_{vx} = \alpha_{pq}(t, x) \cdot w_{vx}$, where $w_{vx}$ is the weight of edge (v, x). In particular, $\alpha_{pq}$ is defined as follows:
$$\alpha_{pq}(t, x) = \begin{cases} 1/p & \text{if } d_{tx} = 0 \\ 1 & \text{if } d_{tx} = 1 \\ 1/q & \text{if } d_{tx} = 2 \end{cases}$$
where $d_{tx}$ denotes the shortest distance between node t and node x, and the value of $d_{tx}$ must be 0, 1, or 2. The p parameter controls the likelihood of revisiting a node during a random walk. When p is high, already visited nodes are rarely sampled again. This strategy promotes moderate exploration and avoids redundant two-hop sampling. Alternatively, if p is small, the walk is directed back by one step (Figure 2), keeping it "local" and near the initial node u. The q parameter allows the search to distinguish between "local" and "global" nodes. As shown in Figure 2, if q > 1, the random walk is more likely to sample nodes close to node t, approximating BFS, which samples nodes within a small locality. Conversely, if q < 1, the walk tends to move farther away from node t, which can capture more global information about the features. Therefore, the distance between the sampled node and the given source node does not increase strictly [21].
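The search bias described in this paragraph can be illustrated with a small pure-Python sketch. It is not the Node2vec package itself: the adjacency-dictionary representation, the function names `alpha_pq` and `next_node`, the unit edge weights, and the toy graph are all assumptions made for the example.

```python
import random

def alpha_pq(graph, t, x, p, q):
    """Search bias for stepping to x when the walk arrived at the current node from t.

    graph maps each node to the set of its neighbors (undirected).
    """
    if x == t:            # d_tx = 0: step back to the previous node
        return 1.0 / p
    if x in graph[t]:     # d_tx = 1: x is also a neighbor of t
        return 1.0
    return 1.0 / q        # d_tx = 2: x leads away from t

def next_node(graph, t, v, p, q, rng=random):
    """Sample the next node among the neighbors of v, assuming unit edge weights (w_vx = 1)."""
    neighbors = sorted(graph[v])
    weights = [alpha_pq(graph, t, x, p, q) for x in neighbors]
    return rng.choices(neighbors, weights=weights, k=1)[0]

# Toy graph: the walk came from t and is now at v
graph = {
    "t": {"v", "x1"},
    "v": {"t", "x1", "x2"},
    "x1": {"t", "v"},
    "x2": {"v"},
}
biases = {x: alpha_pq(graph, "t", x, p=2.0, q=0.5) for x in sorted(graph["v"])}
step = next_node(graph, "t", "v", p=2.0, q=0.5)
```

With p = 2 and q = 0.5, returning to t receives the smallest bias and the outward node x2 the largest, so this setting favors DFS-like exploration.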
Recently, huge amounts of network data in various fields such as web pages, social networks, and power grids are being generated and collected. The NetworkX package was created in April 2005 to analyze these massive and complex networks in Python [22]. This Python package is intended for building, manipulating, and studying the structure and dynamics of complex networks. Its data structures represent many types of networks or graphs. In contrast to many other tools, NetworkX is highly flexible and designed to handle data at a scale relevant to modern problems. In this package, nodes can represent any object in Python, and edges can contain arbitrary data. Figure 3 shows a graph plot of the BUP data created with NetworkX. This figure shows the nodes and how they are connected by edges.
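As an illustration of how NetworkX can produce topological descriptors for a network and for a candidate pair of nodes, consider the sketch below. The paper does not list which NetworkX features were extracted, so the toy graph and the chosen measures (average degree, average clustering, common neighbors, Jaccard coefficient) are assumptions made for the example.

```python
import networkx as nx

# Toy friendship graph; node labels are arbitrary
G = nx.Graph()
G.add_edges_from([("a", "b"), ("a", "c"), ("b", "c"), ("c", "d"), ("d", "e")])

# Network-level statistics of the kind summarized per dataset
avg_degree = 2 * G.number_of_edges() / G.number_of_nodes()
avg_clustering = nx.average_clustering(G)

# Pair-level features for a candidate link between two unconnected nodes
pair = ("a", "d")
common = len(list(nx.common_neighbors(G, *pair)))
_, _, jaccard = next(iter(nx.jaccard_coefficient(G, [pair])))
```

Pair-level scores such as these can be concatenated with node embeddings to form the feature vector of a candidate link.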

Data Splitting.
To implement and evaluate the efficiency of the proposed system, the data studied in this research are divided into two segments: training data and test data. In this split, 70% of the total data is used for system training, and the remaining 30% is used as test data to evaluate the system.
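A 70/30 split of labeled samples can be sketched as follows. The paper does not say how the split was drawn, so the random shuffle, the fixed seed, the function name `split_70_30`, and the toy node-pair samples are assumptions made for the example.

```python
import random

def split_70_30(samples, seed=0):
    """Shuffle the samples and return (train, test) with a 70/30 split."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = (7 * len(shuffled)) // 10   # integer arithmetic avoids float rounding
    return shuffled[:cut], shuffled[cut:]

# Each toy sample is ((node_u, node_v), label), where label 1 marks an existing link
samples = [((u, u + 1), u % 2) for u in range(10)]
train_set, test_set = split_70_30(samples)
```

Fixing the seed keeps the split reproducible across runs, which matters when several models are compared on the same test data.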

Classification Model.
In this method, classifiers are combined to produce better predictions than single-level models. To do this, the stacking technique is used to chain several classifiers. As described for the proposed system, XGBoost and two random forest algorithms are run at the first level. XGBoost is an implementation technique that increases the accuracy and performance of the Gradient Boosting Machine (GBM), and in particular the classification accuracy of regression trees [23]. At the second level, a logistic regression model is applied sequentially to their outputs. Logistic regression is a statistical tool for data analysis in which the relationship between a dependent variable and independent variables is defined from a series of observed values [24]. The results of the first-level classifiers are combined as the input of the second level, and the final prediction at the second level is based on the results obtained from the first level.
Figure 2: The current position in a random walk is at node v, and the previous step is at node t. In this example, x1, x2, and x3 are neighbors. The values of $\alpha_{pq}$ are calculated based on the distance between t and each candidate node x [21].

Because of its regularization and parallel processing, XGBoost outperforms GBM. This approach integrates all predictors simultaneously for enhanced training [25]. The XGBoost algorithm is a system that successively generates decision trees. It can perform calculations relatively quickly in all computing environments. XGBoost is widely used in modeling and classification for its performance.
The random forest algorithm, which is an ensemble algorithm, grows many decision trees for classification. A single decision tree can easily perform classification. The random forest algorithm uses several decision trees: a set of decision trees together forms a forest, and this forest can make better decisions than a single tree. In general, a decision tree is prone to overfitting and generalizes poorly. The instability of a decision tree's results in the presence of noise in the input data is another drawback: a slight change in the training data during the construction of a decision tree may result in significant changes to the tree's structure. Random forest, which operates by averaging the outcomes of all decision trees, is used to tackle these issues. The most important feature of random forests is their ability to measure the importance of variables, thus determining the role of each variable in predicting the response [26].
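The two-level arrangement described in this subsection can be sketched with scikit-learn's `StackingClassifier`. This is a hedged illustration rather than the authors' implementation: the data are synthetic, `GradientBoostingClassifier` stands in for XGBoost so that the example depends only on scikit-learn (`xgboost.XGBClassifier` could be substituted where that package is installed), and the 5-fold cross-validated stacking is scikit-learn's procedure, which the paper does not specify. The random forest and logistic regression settings follow the parameter values the paper reports.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    GradientBoostingClassifier,
    RandomForestClassifier,
    StackingClassifier,
)
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in for node-pair features and link / no-link labels
X, y = make_classification(n_samples=300, n_features=12, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# First level: a boosted-tree model and two random forests
base_learners = [
    ("gbm", GradientBoostingClassifier(random_state=0)),  # stand-in for XGBoost
    ("rf_1", RandomForestClassifier(n_estimators=20, criterion="gini",
                                    max_features="sqrt", random_state=0)),
    ("rf_2", RandomForestClassifier(n_estimators=30, criterion="gini",
                                    max_features="sqrt", random_state=1)),
]

# Second level: logistic regression on the first-level predictions
# (the L2 penalty is the scikit-learn default)
meta_learner = LogisticRegression(solver="lbfgs", tol=1e-4)

stack = StackingClassifier(estimators=base_learners,
                           final_estimator=meta_learner, cv=5)
stack.fit(X_train, y_train)
test_accuracy = stack.score(X_test, y_test)
```

In a feature selection loop, a score such as `test_accuracy` (or a validation score) computed on a candidate feature subset could serve as the fitness value of that subset.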

Feature Selection.
In many problems, machine learning methods have difficulty dealing with a large number of input features. Feature selection is one of the most crucial steps in data preparation for the efficient and accurate use of machine learning. This process identifies relevant features and removes irrelevant and redundant data [27]. It speeds up data mining algorithms, improves prediction accuracy, and increases comprehensibility. Irrelevant features are those that do not provide any useful information, and redundant features do not provide more information than the currently selected features.
In the proposed method, the particle swarm optimization algorithm is used to select the features. In this step, the more relevant features are selected so that the performance of the friend recommender system improves. The particle swarm optimization algorithm is a social search algorithm based on the social behavior and coordinated collective movements of birds and fish [28]. Despite the limited ability of each particle to find the best pattern, their collective behavior has a great ability to find the best path (in other words, the best answer to the optimization problem), because the position of each particle changes based on the particle's own experience in previous movements and the experience of neighboring particles. In fact, each particle is aware of whether it is better or worse than neighboring particles as well as the whole group.
Two perspectives were considered to model the order in the collective movement of these particles. One is the social interaction between group members, and the other is the individual superiority that each group member may have. From the first perspective, all members of the group always change their position by following the best member of the group. From the second, each member keeps in memory the best position it has personally experienced and tends toward it. Any of these members may become the leader of the group, in which case the other members follow it.
After generating the initial population (particles) and assigning an initial velocity to each particle, the fitness of each particle is calculated based on its position. Each particle in the search space represents one solution to the problem and changes its velocity based on the best answer obtained in the particle group (the best member of the group) and the best position it has ever visited. This velocity is added to the position of the particle to obtain its new position. In subsequent iterations, the best particle in terms of fitness guides the other particles and corrects their motion, and after successive iterations, the search converges towards the optimal answer. The position vector for the $i$th particle with dimension $d$ is $X_i = [x_{i,1}, x_{i,2}, \ldots, x_{i,d}]$. The velocity vector is defined as $V_i = [v_{i,1}, v_{i,2}, \ldots, v_{i,d}]$. The best position that each particle reaches during the execution of the algorithm is called $pbest = [p_{b,1}, p_{b,2}, \ldots, p_{b,d}]$, and the best position that all particles have reached during the execution of the algorithm is called $gbest = [g_{b,1}, g_{b,2}, \ldots, g_{b,d}]$.
The position and velocity vectors of each particle are updated as follows:
$$V_i^{t+1} = w^{t} V_i^{t} + c_1 \, rand_1 \, (pbest - X_i^{t}) + c_2 \, rand_2 \, (gbest - X_i^{t})$$
$$X_i^{t+1} = X_i^{t} + V_i^{t+1}$$
where $c_1$ is the learning coefficient related to the personal experience of each particle and $c_2$ is the learning coefficient related to the group experience. $rand_1$ and $rand_2$ are random numbers in [0, 1]. $w^{t}$ is a control parameter that controls the effect of the current particle velocity on the next velocity and creates a balance between the algorithm's ability to search locally and globally.
Figure 3: A graph plot of the BUP data created by NetworkX.
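The update rule for particle positions and velocities can be sketched as a small feature selection loop. Several parts are assumptions made for the example and are not taken from the paper: positions are kept in [0, 1] and decoded into a feature mask by thresholding at 0.5, the inertia weight is held constant, the coefficient values are arbitrary, and a toy fitness function replaces the stacked model's score.

```python
import numpy as np

def pso_feature_selection(fitness, n_features, n_particles=10, n_iter=30,
                          w=0.7, c1=1.5, c2=1.5, seed=0):
    """Maximize fitness(mask) over boolean feature masks with a basic PSO."""
    rng = np.random.default_rng(seed)
    X = rng.random((n_particles, n_features))              # positions in [0, 1]
    V = rng.uniform(-0.1, 0.1, (n_particles, n_features))  # initial velocities

    def decode(position):
        # Assumption: a feature is selected when its coordinate exceeds 0.5
        return position > 0.5

    pbest = X.copy()
    pbest_fit = np.array([fitness(decode(x)) for x in X])
    g = int(np.argmax(pbest_fit))
    gbest, gbest_fit = pbest[g].copy(), pbest_fit[g]

    for _ in range(n_iter):
        r1 = rng.random(X.shape)
        r2 = rng.random(X.shape)
        # Velocity update: inertia + personal experience + group experience
        V = w * V + c1 * r1 * (pbest - X) + c2 * r2 * (gbest - X)
        X = np.clip(X + V, 0.0, 1.0)
        fit = np.array([fitness(decode(x)) for x in X])
        improved = fit > pbest_fit
        pbest[improved] = X[improved]
        pbest_fit[improved] = fit[improved]
        g = int(np.argmax(pbest_fit))
        if pbest_fit[g] > gbest_fit:
            gbest, gbest_fit = pbest[g].copy(), pbest_fit[g]
    return decode(gbest), float(gbest_fit)

# Hypothetical fitness: reward the first three features, penalize the others
useful = np.array([True, True, True, False, False, False])

def toy_fitness(mask):
    return float(np.sum(mask & useful) - 0.5 * np.sum(mask & ~useful))

best_mask, best_fit = pso_feature_selection(toy_fitness, n_features=6)
```

In the setting of this paper, `toy_fitness` would be replaced by a function that trains the stacked classifier on the selected columns and returns its evaluation score.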

Results
In this section, the results of implementing the proposed system are reviewed. To do this, the performance of the friend recommender system was examined in 5 different modes, in which XGBoost, the first Random Forest, the second Random Forest, Logistic Regression, and the proposed system were used as learning models. First, some of the important parameters considered for the different methods are stated in Table 3.
The results of this implementation are given in Table 4, which reports the precision, recall, F-measure, and accuracy obtained by the mentioned methods for class one. According to this table, the values of these criteria for the proposed system are 0.68, 0.70, 0.69, and 0.69, respectively. These findings demonstrate that the suggested stacking strategy achieves the best results for every analyzed criterion. The proposed system attains better outcomes than the individual base models because the base learning models are stacked and their individual strengths are exploited. The confusion matrices obtained from the stacking model for the training and test data are given in Figure 4. Each column of the confusion matrix represents the samples of a predicted class, and each row represents the samples of an actual class. In a two-class problem, each sample is either positive or negative. Therefore, for each data sample, four cases may occur, which are represented by the confusion matrix. The sample is a member of the positive class and is recognized as a member of the same class (true positive). The sample is a member of the positive class and is recognized as a member of the negative class (false negative). The sample is a member of the negative class and is recognized as a member of the same class (true negative). Finally, the sample is a member of the negative class and is recognized as a member of the positive class (false positive).
The elements on the main diagonal show the correctly classified samples (true positives and true negatives); as shown in Figure 4, for the training dataset their sum is equal to the total number of samples. The off-diagonal elements show the incorrectly classified samples (false negatives and false positives), which are zero for the training dataset. Therefore, the confusion matrix for the training data shows the highest possible performance. The confusion matrix for the test data shows that there are not many false positive or false negative cases, which is acceptable. This demonstrates how well the suggested stacking approach works in both classes.
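The four evaluation criteria follow directly from the four cells of the confusion matrix. The sketch below uses the standard definitions; the function name `binary_metrics` and the counts are hypothetical and are not taken from Figure 4 or Table 4.

```python
def binary_metrics(tp, fp, fn, tn):
    """Precision, recall, F-measure, and accuracy for the positive class."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall:
        f_measure = 2 * precision * recall / (precision + recall)
    else:
        f_measure = 0.0
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return precision, recall, f_measure, accuracy

# Hypothetical counts for illustration only
precision, recall, f_measure, accuracy = binary_metrics(tp=40, fp=10, fn=10, tn=40)
```

Recall depends only on true positives and false negatives, which is why it is the criterion to watch when missed links are the more costly error.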

Results for Other Datasets.
Table 3: Important parameters considered for the different methods.
RandomForest_1: number of trees in the forest = 20; function to measure the quality of a split = "gini"; number of features to consider when looking for the best split = "sqrt(n_features)".
RandomForest_2: number of trees in the forest = 30; function to measure the quality of a split = "gini"; number of features to consider when looking for the best split = "sqrt(n_features)".
Logistic regression: solver = "lbfgs"; penalty term = "L2"; tolerance for stopping criteria = 1e-4.

In this section, the proposed method was also examined on other datasets, namely INF, CEG, and UAL. The results of applying the different methods to the INF dataset are listed in Table 5. The results of applying the different methods to the CEG dataset are listed in Table 6. The results of applying the different methods to the UAL dataset are listed in Table 7.
As can be observed, the suggested method performs rather well on the investigated datasets. The findings from the INF and UAL datasets demonstrate that the proposed system outperformed the alternative algorithms in every analyzed criterion. In these two experiments, Random Forest_2 had the best performance after the proposed system. On the CEG dataset, Random Forest_2 performed better in the precision and accuracy criteria, and the proposed system performed better in the recall and F-measure criteria. However, because overlooked cases (false negatives) are more costly than false alarms (false positives) in this study, recall is more important than the other criteria. Therefore, as with the other two datasets, the proposed system performs better on this dataset.

Research Limitations.
Because part of the methodology used in this work is based on stacking several machine learning techniques, training the system takes a considerable amount of time. The training procedure for huge graphs might take a long time if hardware resources are limited. Accordingly, future work can focus on reducing system training time. Applying intelligent sampling methods and using a subset of the data for the training process can be considered.

Conclusion
In this research, a friend recommender system based on a combination of XGBoost, random forest, and logistic regression techniques is proposed. The results of XGBoost and two types of random forests were integrated using the stacking method and the logistic regression algorithm. The particle swarm optimization algorithm in this method chooses the most efficient features to achieve the highest efficiency. For a better investigation, in addition to the proposed stacking system, systems based on XGBoost, logistic regression, and random forest were implemented. The results of this comparison showed that the proposed stacking system can achieve higher precision, recall, F-measure, and accuracy than the other implemented approaches. This system has been able to make good predictions in both existing classes and achieve good results.

Introducing the Tool
The Python programming language is used to implement the proposed solution. Python is a powerful programming language that is easy to learn. It has efficient high-level data structures and supports object-oriented programming. The Python interpreter and the extensive standard library are freely available for all major platforms from the Python website [29]. For Python, there are several libraries for machine learning and data mining. These features have led to the widespread use of this language in the field of artificial intelligence.