Hybrid Self-Adaptive Algorithm for Community Detection in Complex Networks

The study of community detection algorithms in complex networks has been very active in the past several years. In this paper, a Hybrid Self-adaptive Community Detection Algorithm (HSCDA) based on modularity is put forward first. In HSCDA, three different crossover operators and two different mutation operators for community detection are designed and then combined to form a strategy pool, from which strategies are selected probabilistically based on a statistical self-adaptive learning framework. Then, by adopting the best evolving strategy in HSCDA, a Multiobjective Community Detection Algorithm (MCDA) based on the kernel k-means (KKM) and ratio cut (RC) objective functions is proposed, which makes efficient use of the strategy recommended by the statistical self-adaptive learning framework and thus assists the process of community detection. Experimental results on artificial and real networks show that the proposed algorithms achieve better performance than similar state-of-the-art approaches.


Introduction
Since many complex systems, such as the Internet, social networks, and biological networks, can be modeled as complex networks, the study of complex networks is essential to better understand and analyze such systems. In complex networks, community structure [1] refers to groups of nodes in which connections between nodes in the same group are dense and connections between different groups are sparse. In addition to the small-world property, the scale-free property, and a high clustering coefficient, community structure is another important feature of complex networks. Community detection [2] (also known as network clustering or graph clustering) seeks a division of the nodes that reveals the community structure. Community detection helps to better understand the topology and functions of complex networks [3]. For example, mining community structure on the Internet can not only improve web search results and enhance the user experience but also support hot-topic tracking systems. Obtaining the community structure of social networks can help to find social circles with the same hobbies. Therefore, it is essential to further study community detection in complex networks.
Fortunato [2], Schaeffer [4], and Newman [5] provide good overviews of community detection approaches. These approaches include traditional methods, divisive algorithms, and modularity-based methods. As one of the most popular families, modularity-based methods have attracted many researchers' attention; their most characteristic feature is that they convert the network clustering problem into an optimization problem by maximizing the modularity $Q$ presented by Girvan and Newman [6]. Finding the communities with maximal modularity is NP-hard [7], so exact optimization becomes impractical as the network size increases.
Therefore, heuristic and intelligent optimization algorithms are often used to tackle the problem. For example, GN algorithm [6] generated candidate solutions by using heuristic operations, such as moving a node to other communities, switching nodes in different communities, and decomposing or merging communities, and then selected the best community division by calculating $Q$ values with the simulated annealing Metropolis criterion. The FN algorithm [8] is a community detection algorithm proposed by Newman in 2004; its basic idea is to use a greedy optimization algorithm to maximize the $Q$ value. The BGLL algorithm [9] utilizes the network topology and modularity to compute the division of community structures; the time complexity of the algorithm is proved to be linear.
Small-scale communities cannot be detected in some large complex networks by optimizing the modularity, which is known as the resolution limit [10]. To overcome this limit, a number of modified modularity measurements, such as modularity density [11], kernel k-means (KKM), and ratio cut (RC) [12], have been developed and introduced into the objective function, which has promoted a group of new detection methods based on multiobjective optimization. Recent multiobjective optimization algorithms for community detection include MOGA-net [13], MOCD [14], and MOEA/D-net [15]. The MOGA-net algorithm uses community fitness (CF) and community score (CS) as the two objectives to be optimized. The MOCD algorithm employs PESA-II [16] to optimize the objective functions intra and inter and uses two max-based selection methods to choose suitable solutions from the Pareto-optimal solution set. The MOEA/D-net algorithm employs MOEA/D to optimize negative ratio association (NRA) and ratio cut (RC) [15] to find nondominated solutions.
The above work shows that modularity-based intelligent optimization algorithms for community detection attract much attention from researchers. In order to further improve the performance of such algorithms, this paper proposes a new framework, based on an evolutionary algorithm, that combines hybrid evolving strategies with an adaptive learning mechanism. The work includes two parts. In the first part, the modularity $Q$ is used as the objective function because it is simple and easy to understand. A Hybrid Self-adaptive Community Detection Algorithm (HSCDA) based on modularity is put forward. In HSCDA, three different crossover operators and two different mutation operators for community detection are designed and then combined to form a strategy pool, from which strategies are selected probabilistically based on a statistical self-adaptive learning framework. Experimental results show that HSCDA achieves competitive modularity compared with the other modularity-based algorithms GN, FN, and BGLL. In the second part, a Multiobjective Community Detection Algorithm (MCDA) is proposed, in which KKM and RC are used as two optimization objectives instead of the modularity. The primary evolving strategy of MCDA is decided by the self-adaptive learning framework in HSCDA. A Pareto mechanism is used to preserve the good solutions. Experimental results show that MCDA achieves better performance than HSCDA and the other multiobjective algorithms MOGA-net, MOCD, and MOEA/D-net.
The rest of this paper is organized as follows: Section 2 gives the problem statements. In Section 3, the proposed algorithms for community detection are presented. In Section 4, the performance of the proposed algorithms is validated on both computer-generated networks and real world networks, and our algorithms are compared with other approaches. The conclusions are summarized in Section 5.

Network Community Detection Problem
Assume that a network $G$ is defined as $G = (V, E)$, where $V$ denotes the node set and $E$ denotes the edge set. The topology of the network is usually represented by the adjacency matrix $A = (a_{ij})$, whose elements are 0 or 1: $a_{ij} = 1$ indicates that nodes $i$ and $j$ are connected, whereas $a_{ij} = 0$ indicates that they are not connected.
Community structure is a universal property of many real world complex networks. A community is a subset of nodes with relatively dense connections among its own nodes and relatively sparse connections to nodes outside the subset [17]. Since this notion of density is not precisely defined, there are many ways to measure community structure.
The modularity $Q$ proposed by Girvan and Newman is the most popular measurement [6]. $Q$ is defined as follows:

$$Q = \frac{1}{2m} \sum_{i,j} \left( a_{ij} - \frac{k_i k_j}{2m} \right) \delta(i, j),$$

where $m$ is the total number of edges in the network, $a_{ij}$ is an element of the adjacency matrix of the network, and $k_i$ is the degree of node $i$; if nodes $i$ and $j$ are in the same community, $\delta(i, j) = 1$; otherwise it is 0. If the value of $Q$ is bigger than 0, community structure begins to appear in the network. If the $Q$ value is greater than 0.3, there is a clear community structure. The closer the $Q$ value is to 1, the more obvious the community structure is. In real world complex networks, the $Q$ value is usually between 0.3 and 0.7. The advantages of modularity are that it is easy to understand and cheap to compute. Community detection based on modularity is thus an optimization problem in which the modularity $Q$ is maximized.
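As an illustration of the definition, the following minimal sketch computes $Q$ for an undirected, unweighted graph given as an edge list. It uses the equivalent per-community form $Q = \sum_c [e_c/m - (d_c/2m)^2]$, where $e_c$ is the number of edges inside community $c$ and $d_c$ is the total degree of its nodes. The function name `modularity` and the input layout (node ids `0..n-1`, `labels[i]` the community of node `i`) are assumptions made for this example.

```python
from collections import defaultdict


def modularity(edges, labels):
    """Modularity Q of a partition of an undirected, unweighted graph.

    edges  : list of (u, v) pairs, each undirected edge listed once
    labels : labels[i] is the community label of node i
    """
    m = len(edges)
    intra = defaultdict(int)        # e_c: edges with both endpoints in community c
    comm_degree = defaultdict(int)  # d_c: sum of degrees of the nodes in community c
    for u, v in edges:
        comm_degree[labels[u]] += 1
        comm_degree[labels[v]] += 1
        if labels[u] == labels[v]:
            intra[labels[u]] += 1
    # Q = sum over communities of e_c/m - (d_c/(2m))^2
    return sum(intra[c] / m - (d / (2 * m)) ** 2 for c, d in comm_degree.items())
```

For two triangles joined by a single edge, placing each triangle in its own community gives a positive $Q$, while placing all nodes in one community gives $Q = 0$.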
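In order to overcome the resolution limit of modularity [10], Li et al. [11] introduced a new objective function, the modularity density $D$, which is defined as

$$D = \sum_{i=1}^{k} \frac{L(V_i, V_i) - L(V_i, \overline{V_i})}{|V_i|},$$

where $V_i$ is the node set of the $i$th of the $k$ communities, $\overline{V_i} = V \setminus V_i$, $L(V_a, V_b) = \sum_{u \in V_a, v \in V_b} a_{uv}$, and $|V_i|$ is the number of nodes in $V_i$. Two popular decompositions of $D$ are KKM [12] and RC [12], which are defined as follows:

$$\mathrm{KKM} = 2(n - k) - \sum_{i=1}^{k} \frac{L(V_i, V_i)}{|V_i|}, \qquad \mathrm{RC} = \sum_{i=1}^{k} \frac{L(V_i, \overline{V_i})}{|V_i|},$$

where $n$ is the number of nodes in the network. The smaller the KKM value is, the denser the connections inside each community are, and the smaller the RC value is, the sparser the links between each community and the rest of the network are. Therefore, the community detection problem can also be modeled as a multiobjective optimization problem that minimizes KKM and RC.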
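The two objectives can be evaluated directly from an adjacency matrix. The sketch below assumes the commonly used forms $\mathrm{KKM} = 2(n-k) - \sum_i L(V_i,V_i)/|V_i|$ and $\mathrm{RC} = \sum_i L(V_i,\overline{V_i})/|V_i|$; the function name `kkm_rc` and the list-of-lists matrix layout are assumptions for illustration. Under these forms, $\mathrm{KKM} + \mathrm{RC} = 2(n-k) - D$, which is one way to see why the pair is regarded as a decomposition of the modularity density.

```python
def kkm_rc(adj, labels):
    """Return (KKM, RC) for a partition; both are to be minimized.

    adj    : symmetric 0/1 adjacency matrix as a list of lists
    labels : labels[i] is the community label of node i
    """
    n = len(adj)
    communities = {}
    for node, c in enumerate(labels):
        communities.setdefault(c, []).append(node)
    k = len(communities)

    internal_sum = 0.0
    rc = 0.0
    for members in communities.values():
        inside = set(members)
        # L(V_i, V_i): each internal edge is counted twice, once per direction
        l_in = sum(adj[u][v] for u in members for v in members)
        # L(V_i, complement of V_i): links leaving the community
        l_out = sum(adj[u][v] for u in members for v in range(n) if v not in inside)
        internal_sum += l_in / len(members)
        rc += l_out / len(members)

    kkm = 2 * (n - k) - internal_sum
    return kkm, rc
```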

Description of Proposed Method
In this section, the detailed information of HSCDA and MCDA is depicted.

HSCDA.
In order to further improve the solution quality of modularity-based intelligent optimization algorithms for community detection, HSCDA is proposed on the basis of an evolutionary algorithm. In HSCDA, three different crossover operators and two different mutation operators for community detection are designed and then combined to form a strategy pool, from which strategies are selected probabilistically by roulette wheel selection based on a statistical self-adaptive learning framework. The flow of HSCDA is shown in Algorithm 1.

Individual Encoding. A partition Ω of the network $G$ is encoded as an integer string $x = \{x_1, x_2, \ldots, x_n\}$, where $x$ denotes an individual, $n$ is the number of nodes in the network, and $x_i$ is the community label of node $i$, $x_i \in \{1, 2, \ldots, n\}$. Nodes with the same community label are considered to be in the same community. Note that a network of $n$ nodes can be divided into at most $n$ communities; in this case, each node constitutes its own community, which can be denoted as $\{1, 2, \ldots, n\}$. Moreover, many different representations correspond to the same partition. For example, given a network of 4 nodes, {2, 1, 3, 2} and {1, 3, 2, 1} represent the same partition {{1, 4}, {2}, {3}}; that is, nodes 1 and 4 belong to the first community, node 2 belongs to the second one, and node 3 belongs to the third one. This direct encoding can be used without additional prior information, such as the number or size of the communities.
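Because several label strings encode the same partition, it can be convenient to compare individuals after relabeling communities in order of first appearance. The helper below is a small sketch of that idea; the name `canonical` is an assumption introduced here and is not part of the algorithm described in the text.

```python
def canonical(labels):
    """Relabel communities as 1, 2, 3, ... in order of first appearance."""
    mapping = {}
    # setdefault assigns the next unused label the first time a community is seen
    return [mapping.setdefault(c, len(mapping) + 1) for c in labels]
```

With this helper, the two strings from the example above, {2, 1, 3, 2} and {1, 3, 2, 1}, map to the same canonical string.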

Population Initialization Algorithm Based on Label Propagation Mechanism.
To both reduce the search space and promote diversity, the paper adopts an initialization mechanism based on label propagation [12], which makes full use of prior knowledge of the network topology to generate a population in which densely connected nodes share the same label.
Assume that the neighbor set of a node $i$ is $N(i) = (i_1, i_2, \ldots, i_k)$ and let $l(i)$ be the label of node $i$. In the label propagation mechanism, each node takes the label that occurs most often among its neighbors; it is defined as follows:

$$l(i) = \arg\max_{r} \sum_{j \in N(i)} \delta(l(j), r),$$

where $r$ ranges over the community labels of the nodes in $N(i)$. If $l(j)$ and $r$ are the same label, then $\delta(l(j), r)$ equals 1, and otherwise 0. After label propagation, densely connected nodes quickly come to share the same label. Algorithm 2 shows the flow of the initialization algorithm using label propagation.

To enhance the evolutionary capability of the algorithm and thus improve the quality of solutions, six different strategies for community detection are designed to make up the hybrid evolutionary strategy pool. Every evolutionary strategy includes a crossover operator and a mutation operator. Each individual chooses among the strategies adaptively and thereby gradually improves its solution structure.
Given two individuals $x_p$ and $x_q$, three different crossover operators and two different mutation operators are designed as follows.
Crossover 1 Is Block Crossover. Two positions $a$ and $b$ ($1 \le a \le b \le n$) are randomly selected at first. Then, the labels in positions 1 to $a$ and $(b + 1)$ to $n$ are copied from $x_p$ into the same positions of a new individual $x_c$, while the labels in the remaining positions, $(a + 1)$ to $b$, of $x_c$ are copied from $x_q$. In the same way, the labels in positions 1 to $a$ and $(b + 1)$ to $n$ are copied from $x_q$ into the same positions of a new individual $x_d$, and the labels in positions $(a + 1)$ to $b$ of $x_d$ are copied from $x_p$. This process generates two offspring individuals $x_c$ and $x_d$.

Crossover 3 Is a Two-Way Crossing Over [18]. Firstly, randomly select two nodes $v_i$ and $v_j$ and ensure that their labels $x_p^i$ and $x_p^j$ are different. Then all nodes belonging to these two communities in $x_p$ are assigned to the corresponding communities in $x_q$ to generate a new individual $x_c$. Meanwhile, the nodes $v_i$ and $v_j$ with different labels are located in $x_q$, and the corresponding nodes in $x_p$ are assigned to these two communities. Thus, the new individual $x_d$ is generated. This operator is an extended version of Crossover 2 and also generates two offspring individuals $x_c$ and $x_d$.
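The block crossover can be sketched in a few lines. The code below is an illustrative reading of Crossover 1, assuming individuals are Python lists of labels and that the two cut positions are drawn uniformly; the function name `block_crossover` and the optional `rng` parameter are introduced here for the example only.

```python
import random


def block_crossover(parent_p, parent_q, rng=random):
    """Crossover 1 sketch: swap the middle block between two parents.

    Positions are 1-based in the text (1 <= a <= b <= n); Python slices are
    0-based, so positions 1..a map to [:a], a+1..b to [a:b], and b+1..n to [b:].
    """
    n = len(parent_p)
    a, b = sorted(rng.randint(1, n) for _ in range(2))
    # child_c keeps the outer blocks of parent_p and takes the middle from parent_q
    child_c = parent_p[:a] + parent_q[a:b] + parent_p[b:]
    # child_d is the mirror image
    child_d = parent_q[:a] + parent_p[a:b] + parent_q[b:]
    return child_c, child_d
```

When $a = b$ the middle block is empty and the offspring are copies of their parents, which is consistent with the constraint $a \le b$ in the description.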
Mutation 1. Firstly, obtain the community structure from the node labels of an individual. Secondly, select a node in each community at random and change its label to the label of one of its neighbor nodes.

Mutation 2. Firstly, obtain the community structure from the node labels of an individual. Secondly, select a node in each community at random and change its label to the label that occurs most often among its neighbors. If the labels of the neighbor nodes all differ from each other, one of them is selected at random and assigned.
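Mutation 2 can be sketched as follows. This is a minimal illustration that assumes the graph is given as adjacency lists (`neighbors[i]` is the list of neighbors of node `i`) and that ties between equally frequent labels are broken at random; the function name `mutate_majority` is hypothetical.

```python
import random
from collections import Counter


def mutate_majority(labels, neighbors, rng=random):
    """Mutation 2 sketch: one random node per community adopts the most
    frequent label among its neighbors (ties broken at random)."""
    communities = {}
    for node, c in enumerate(labels):
        communities.setdefault(c, []).append(node)

    mutated = list(labels)
    for members in communities.values():
        node = rng.choice(members)
        if not neighbors[node]:
            continue  # an isolated node has no neighbor label to adopt
        counts = Counter(labels[j] for j in neighbors[node])
        top = max(counts.values())
        mutated[node] = rng.choice([c for c, k in counts.items() if k == top])
    return mutated
```

Mutation 1 differs only in the last step: it would pick the label of one neighbor uniformly at random instead of the most frequent one.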
Pairing each of the three crossover operators with each of the two mutation operators generates six evolutionary strategies (3 × 2 = 6), which form the hybrid evolutionary strategy pool.

Self-Adaptive Learning Framework. Based on the strategy pool, a statistical self-adaptive learning framework is introduced into HSCDA. Each individual adaptively chooses an appropriate strategy at different stages of the algorithm, depending on the evolution effect of the strategy. In the self-adaptive learning framework, each strategy is given a probability of being selected, and each individual selects its evolution strategy by roulette wheel selection.
In particular, each individual $i$ ($i = 1, 2, \ldots$, popsize) has a selective probability vector $p_i$ over the strategies, $p_i = [p_{i1}, p_{i2}, \ldots, p_{iK}]$, where $p_{ij}$ is the probability with which the $i$th individual chooses the $j$th strategy from all $K$ strategies in the hybrid strategy pool. $K$ is 6 in the paper.
The difference in an individual's modularity before and after evolving by a strategy is used to measure the evolution effect of that strategy. It is computed from $Q_i^{\text{new}}$, the modularity of individual $i$ in the current generation, and $Q_i^{\text{old}}$, the modularity of individual $i$ in the last generation; best denotes the best individual in the current population. The change in the selective probability is then derived from this evolution effect together with rand, a random value in (0, 1) that adds a disturbance to avoid learning too fast.

Suppose that individual $i$ selects the $j$th strategy from the strategy pool by roulette wheel selection with probability $p_{ij}(\text{old})$; then the selective probability of individual $i$ in the next generation is updated to $p_{ij}(\text{new})$, using one update rule when $Q_i^{\text{new}} - Q_i^{\text{old}} > 0$ and another when $Q_i^{\text{new}} - Q_i^{\text{old}} \le 0$. For the other strategies, the selective probabilities $p_{ik}(\text{new})$, $k \ne j$, are updated to make sure that the probabilities of individual $i$ still sum to 1. Individuals in the next generation choose their evolving strategies according to the updated selective probabilities. Therefore, HSCDA lets each individual adaptively choose appropriate strategies at different stages.
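The selection and learning loop can be illustrated with a short sketch. Roulette wheel selection is standard; the update rule, however, is only a stand-in: it assumes the selected strategy's probability is raised by some amount `delta` after an improvement and lowered otherwise, followed by renormalization, and it is not claimed to be the paper's formula. The names `roulette_select`, `update_probs`, and `delta` are hypothetical.

```python
import random


def roulette_select(probs, rng=random):
    """Pick index j with probability probs[j] (roulette wheel selection)."""
    r = rng.random()
    cumulative = 0.0
    for j, p in enumerate(probs):
        cumulative += p
        if r < cumulative:
            return j
    return len(probs) - 1  # guard against floating-point round-off


def update_probs(probs, j, improved, delta):
    """Hypothetical update: reward or penalize strategy j by delta,
    then renormalize so that the vector still sums to 1."""
    new = list(probs)
    new[j] = max(new[j] + (delta if improved else -delta), 1e-6)  # keep it selectable
    total = sum(new)
    return [p / total for p in new]
```

In use, `delta` would play the role of the change quantity described above, for example a value scaled by a random number in (0, 1); the small floor of `1e-6` is a design choice of this sketch that keeps every strategy selectable.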

Local Search.
In order to improve convergence speed and reduce the risk of being trapped in local optima, the hill-climbing method suggested in [18] is adopted here as a local search mechanism. Hill climbing is an optimization method commonly used in local search; it starts from an arbitrary solution of the current problem and tries to change one element of this solution to find a better solution. Once a change produces a better solution, the new solution replaces the current one. The process is repeated until no better solution can be produced or the stopping criteria are reached. It is worth noting that hill climbing is applied only to the individual with the best fitness value, so as to avoid an excessive amount of calculation.
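A generic hill-climbing pass over a label string might look like the sketch below. It assumes that the "element" changed is the label of a single node and that candidate labels are taken from that node's neighbors; both are assumptions of this example rather than details given in the text. `fitness` is any function to be maximized, such as the modularity $Q$, and `max_passes` is a hypothetical stopping parameter.

```python
def hill_climb(labels, neighbors, fitness, max_passes=10):
    """Greedy local search: move single nodes to a neighbor's community
    whenever this strictly increases the fitness."""
    best = list(labels)
    best_fit = fitness(best)
    for _ in range(max_passes):
        improved = False
        for node in range(len(best)):
            # candidate labels: those of the node's neighbors, except its own
            for c in {best[j] for j in neighbors[node]} - {best[node]}:
                candidate = list(best)
                candidate[node] = c
                fit = fitness(candidate)
                if fit > best_fit:
                    best, best_fit, improved = candidate, fit, True
        if not improved:
            break  # no single-node move helps: a local optimum is reached
    return best, best_fit
```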

MCDA.
Experiments (see Sections 4.2 and 4.3) show that community detection based solely on optimizing modularity does not perform well on real networks. In order to further improve the solution quality, MCDA is proposed. In MCDA, strategy 6, which has the largest proportion of selection by the best individual in HSCDA, is adopted as the evolving strategy, and KKM and RC are set as the two objective functions. A single strategy is adopted instead of the adaptive hybrid strategy pool because, in the self-adaptive learning framework, individuals have to be compared with each other to calculate the selection probability of each evolving strategy, whereas the Pareto mechanism in MCDA cannot always decide which of two individuals is better. For the same reason, the local hill-climbing search cannot be introduced into MCDA directly. The specific flow of MCDA is shown in Algorithm 3.

Normalized mutual information (NMI) [20] is commonly used to estimate the similarity between the true clustering results and the detected ones. Two vectors, $A$ and $B$, are the inputs of the comparison; the $i$th element of each vector represents the class of the $i$th node. NMI($A$, $B$) is then defined as follows:

$$\mathrm{NMI}(A, B) = \frac{-2 \sum_{i=1}^{c_A} \sum_{j=1}^{c_B} C_{ij} \log\left(\frac{C_{ij}\, n}{C_{i\cdot}\, C_{\cdot j}}\right)}{\sum_{i=1}^{c_A} C_{i\cdot} \log\left(\frac{C_{i\cdot}}{n}\right) + \sum_{j=1}^{c_B} C_{\cdot j} \log\left(\frac{C_{\cdot j}}{n}\right)},$$

where $c_A$ and $c_B$ are the numbers of classes in $A$ and $B$, $C_{ij}$ is the number of nodes shared by the $i$th class of $A$ and the $j$th class of $B$, $C_{i\cdot}$ ($C_{\cdot j}$) is the sum of row $i$ (column $j$) of $C$, and $n$ is the number of nodes.

HSCDA is applied to each of the four real networks; the average of the optimal solutions of HSCDA over 30 runs is recorded. Table 2 lists the comparison between HSCDA and the GN, FN, and BGLL algorithms in terms of NMI, where the results of GN, FN, and BGLL are taken from [25]. As seen from the table, the NMI values of HSCDA are superior to those of the other three algorithms, except that they equal those of BGLL on Football and Polbooks. Table 3 shows the comparison of the $Q$ values of HSCDA, GN, FN, and BGLL; the $Q$ values obtained by HSCDA are higher than those of the other three algorithms. This is because adopting hybrid evolution strategies based on the self-adaptive learning framework improves the solution quality of HSCDA. The community structures calculated by HSCDA on the four real networks are given in Figure 1. The results in Tables 2 and 3 and Figure 1 show that HSCDA is more accurate than GN, FN, and BGLL.
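For reference, the NMI values compared in this section can be computed with a few lines of code. The sketch below assumes the widely used confusion-matrix form of NMI, in which $C_{ij}$ counts the nodes shared by class $i$ of one labeling and class $j$ of the other; the function name `nmi` and the convention of returning 1.0 when both labelings consist of a single class are choices made for this example.

```python
import math
from collections import Counter


def nmi(a, b):
    """Normalized mutual information between two label vectors of equal length."""
    n = len(a)
    joint = Counter(zip(a, b))  # C_ij, only non-zero cells are stored
    row = Counter(a)            # C_i. : row sums
    col = Counter(b)            # C_.j : column sums

    numerator = -2.0 * sum(
        c * math.log(c * n / (row[i] * col[j])) for (i, j), c in joint.items()
    )
    denominator = sum(c * math.log(c / n) for c in row.values()) + sum(
        c * math.log(c / n) for c in col.values()
    )
    if denominator == 0:  # both labelings have a single class (convention of this sketch)
        return 1.0
    return numerator / denominator
```

Two labelings that describe the same partition under different label names give an NMI of 1, while two independent labelings give 0.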

Analysis of Evolution Effect of Strategies in Self-Adaptive Learning Framework.
To analyze the actual evolution effect of each evolving strategy in the hybrid strategy pool, the number of times each evolving strategy is selected by the optimal solutions (over 30 independent runs) is recorded and shown in Figure 2. As shown in Figure 2, strategy 6 has the highest selected proportion of all strategies, which means that the evolution effect of strategy 6 is superior to the others when dealing with the community detection problem.
The results in Tables 2 and 3 and Figure 1 show that HSCDA is superior to the other modularity-based methods. However, according to Tables 2 and 3, an improvement in $Q$ does not always come with an improvement in NMI; for Football and Polbooks, the $Q$ value of HSCDA is superior to that of BGLL while the NMI is the same as that of BGLL. The reason for this phenomenon is that $Q$ cannot fully capture the natural groups in real networks. To improve the clustering quality, we further propose MCDA. In MCDA, strategy 6 is adopted as the evolving strategy, and KKM and RC are set as the two objective functions. A single strategy is adopted instead of the adaptive hybrid strategy pool because, in the self-adaptive learning framework, individuals have to be compared with each other to calculate the selection probability of each evolving strategy, whereas the Pareto mechanism in MCDA cannot always decide which of two individuals is better. For the same reason, the local hill-climbing search cannot be introduced into MCDA. The experimental results and analysis are detailed in the next section.

Experimental Results and Analysis of MCDA.
The parameters of MCDA are set as follows: population size is 100, crossover probability is 0.9, mutation probability is 0.1, and maximum number of iterations is 100. MCDA and three multiobjective algorithms (MOGA-net [13], MOCD [14], and MOEA/D-net [15]) are compared in experiments on an artificial synthetic network and on four real world networks. The results show that MCDA has better solution accuracy and obtains the true network clusters in several real networks.

Experimental Results and Analysis on Artificial Synthetic Network.
In order to compare with other community detection algorithms based on multiobjective optimization, we run experiments on the artificial synthetic benchmark network proposed by Lancichinetti et al. [26], which is an extension of the classic GN benchmark network proposed by Newman [6]. The network contains 128 nodes, which are divided into four communities of 32 nodes each. The average degree of each node is 16. The fraction of each node's links that connect to nodes outside its own community is controlled by the mixing parameter $\mu$. The community structure becomes vaguer as $\mu$ increases, which means that it is harder to identify the true clusters.
By adjusting the value of the mixing parameter $\mu$ in the synthetic network, 11 networks, in which $\mu$ ranges from 0 to 0.5 in steps of 0.05, are generated to test the algorithms. NMI is used to measure the similarity between the true network clusters and the test results. For each network, we calculate the average, over 30 independent runs, of the largest NMI value obtained in each run. Figure 3 shows the NMI curves obtained by the four algorithms.
In Figure 3, we find that when $0.1 < \mu < 0.35$, MCDA and MOEA/D-net can find the true network clusters (NMI is 1), while the NMI values of MOGA-net and MOCD decline markedly. When $0.35 < \mu < 0.45$, all the algorithms fail to obtain the true clusters, but the NMI of MCDA is still higher than 0.8, which shows that MCDA outperforms the other three algorithms when dealing with vaguer networks. When $\mu = 0.5$, all algorithms perform poorly, which is reasonable since the community structure is fully fuzzy at that point. It can be seen from Figure 3 that MCDA performs better in most cases ($0 < \mu < 0.48$) than MOGA-net, MOCD, and MOEA/D-net, which reflects the good search ability of strategy 6 for community detection.

Simulation Results and Analysis of Real Networks.
MCDA is applied to the four real world networks mentioned above. The clustering results with maximum $Q$ and maximum NMI are shown in Figures 4 to 7: Figure 4 shows the results for Zachary's Karate Club network, Figure 5 for the Dolphin social network, Figure 6 for the American college Football network, and Figure 7 for the Books on US politics network.
From Figure 4(a), it is clear that MCDA can successfully detect the true community structures (corresponding to NMI = 1).
Some nodes in the Football network are not connected with nodes in their own community, while their connections with nodes of other communities are closer. When the network is in the real clustering, the modularity $Q$ is −0.0239, which is much less than the $Q$ value obtained by the algorithm. This shows that the true clusters do not completely comply with the network community clustering rule. Because of the complicated structure, it is difficult to detect the real clusters completely. According to the clustering results with maximum $Q$ in Figure 6(b), MCDA obtains 10 clusters. We observed that some nodes, such as 12, 25, 29, 37, 43, 51, 59, 60, 64, 70, 81, 83, 91, 98, and 111, are misplaced. Figure 6(a) shows the community structures detected by MCDA with NMI = 0.9269; because of the high NMI, they still have good reference value.
Similar to the Football network, the Books on US politics network itself shows high complexity. From the comparison of Figures 7(a) and 7(b), although some nodes are misplaced and the real clusters cannot be completely detected, MCDA still reaches an NMI of 0.6283 and a $Q$ of 0.5264, which is meaningful in terms of solution precision.

Conclusion
To further improve the solution quality of intelligent optimization algorithms for community detection, HSCDA and MCDA are proposed on the basis of an evolutionary algorithm. In HSCDA, $Q$ is set as the objective function and six different evolution strategies are designed to construct a hybrid evolution strategy pool. An evolution strategy is chosen probabilistically through roulette wheel selection based on a statistical self-adaptive learning framework. In MCDA, KKM and RC are set as the two objective functions; strategy 6, which has the largest proportion of selection by the best individual in HSCDA, is set as the main evolution strategy, and the nondominated solution set is kept with a Pareto mechanism. Experiments show that HSCDA has higher solution quality than other community detection algorithms that use $Q$ as the objective function (such as GN, FN, and BGLL). Compared with HSCDA, MCDA can obtain the true structure of some of the real world networks, and it achieves competitive results compared with other multiobjective community detection algorithms (such as MOGA-net, MOCD, and MOEA/D-net).

Figure 2: Selected proportion of strategies of the optimal solutions.

Figure 3: Max NMI values averaged over 30 runs for different algorithms.

Figure 4: Cluster results of MCDA in Karate Club network.
Figure 4(b) shows the community structure corresponding to the highest $Q$ value. It is obvious that the partition in Figure 4(b) is a subdivision of that in Figure 4(a).

Figure 5(a) shows that MCDA obtains the true community structures of the Dolphin social network (NMI = 1). Figure 5(b) shows that MCDA divides the structure on the right part of Figure 5(a) into 3 smaller communities. Thus, from the point of view of optimizing modularity $Q$, MCDA is also effective for detecting the community structures of the Dolphin network without wrong clustering.

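In the definition of the modularity density $D$, $V_i$ is the node set of the $i$th community among all $k$ communities, $L(V_a, V_b) = \sum_{u \in V_a, v \in V_b} a_{uv}$, and $|V_i|$ is the number of nodes in $V_i$. The greater the value of the modularity density $D$, the more accurate the community found.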
Hybrid Evolutionary Strategy Pool.
Crossover 2 Is a Single Point of Double Crossing Crossover. Firstly, randomly select a node $v_i$ in $x_p$ and mark its label as $x_p^i$. Then all nodes with the label $x_p^i$ in $x_p$ are set to that same label in $x_q$, thus generating a new individual $x_c$; that is, $x_c^k \leftarrow x_p^i$, $\forall k \in \{k \mid x_p^k = x_p^i\}$. Meanwhile, the node $v_i$ with label $x_q^i$ in $x_q$ is located, and all nodes belonging to this community in $x_q$ are set to the same label in $x_p$, that is, $x_d^k \leftarrow x_q^i$, $\forall k \in \{k \mid x_q^k = x_q^i\}$, thus generating a new individual $x_d$. This process generates two offspring individuals $x_c$ and $x_d$.

Algorithm 2: Population initialization based on label propagation.
Input: Population with each node assigned to a different community, that is, $l(i) = i$, $i \in \{1, 2, \ldots, n\}$
Output: Population after initialization
For $p$ = 1 : popsize // for all individuals in population
    For $t$ = 1 : 5 // the number of propagation iterations is set to 5
        For $i$ = 1 : $n$ // for all nodes in the network
            If ($|N(i)| > 1$) update $l(i)$ by the label propagation rule
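The label propagation initialization (Algorithm 2) can be sketched for a single individual as follows. This is a minimal illustration, assuming adjacency lists as input, in-place (asynchronous) updates, and random tie-breaking among equally frequent labels, none of which is specified in the text; the function name `label_propagation_init` is hypothetical.

```python
import random
from collections import Counter


def label_propagation_init(neighbors, iterations=5, rng=random):
    """Initialize one individual by a few rounds of label propagation.

    neighbors[i] is the list of neighbors of node i (nodes are 0..n-1).
    Labels start as 1..n, i.e. every node is its own community.
    """
    n = len(neighbors)
    labels = list(range(1, n + 1))
    for _ in range(iterations):
        for i in range(n):
            if not neighbors[i]:
                continue  # isolated nodes keep their own label
            counts = Counter(labels[j] for j in neighbors[i])
            top = max(counts.values())
            # adopt the most frequent neighbor label; ties broken at random
            labels[i] = rng.choice([r for r, c in counts.items() if c == top])
    return labels
```

Calling the function once per individual with different random states would yield a diverse initial population, since the random tie-breaking leads to different partitions.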

Table 1: Characteristics of four real world networks.

Zachary's Karate Club network, the Dolphin social network, the American college Football network [23], and the Books on US politics network (Polbooks) [24] are commonly used real networks for benchmarking. Characteristics of these four networks are shown in Table 1. For details, please see the related references.

In the definition of NMI, $C$ is the confusion (mixing) matrix built from vector $A$ and vector $B$, $C_{ij}$ is the number of elements shared by the $i$th class of vector $A$ and the $j$th class of vector $B$, $C_{i\cdot}$ ($C_{\cdot j}$) is the sum of the elements of $C$ in row $i$ (column $j$), and $n$ is the number of nodes of the network. The value of NMI($A$, $B$) lies in the interval [0, 1]. If NMI($A$, $B$) = 1, then $A = B$. If NMI($A$, $B$) = 0, then $A$ and $B$ are totally different.

Algorithm 3: MCDA.
Input: Adjacency matrix $A$ of network $G$
Parameters: population size (popsize), max generations (gen), crossover probability (pc), mutation probability (pm)
Output: Pareto front solutions
Step 1.
    Calculate the rank of each individual. If at least one objective value of individual $u$ is better than that of individual $v$, and no objective value of $u$ is worse than that of $v$, then $u$ dominates $v$. Each individual is assigned a level (rank): the rank of every nondominated individual is defined as 1, and the rank of any other individual is 1 plus the number of individuals that dominate it.
    (1.4) Calculate crowding distance. Calculate the distance between each individual and the other individuals of the same rank by the crowding distance calculation method in [19].
Step 2. Adopt evolutionary strategy 6 to generate offspring individuals.
Step 3. Pick out the dominant solutions of the current generation from the population. The rank of all individuals is calculated first; then the individuals whose rank is 1 are selected to construct the dominant solutions of the current generation.
Step 4. Use the pruning mechanism to update the population.
    (4.1) Combine the dominant solutions with the present population to form a new population.
    (4.2) Calculate the rank of each individual and sort the individuals from small to large rank.
    (4.3) Select popsize individuals as the next generation according to the rank.
Step 5. Stopping criteria. If (iterations < gen), iterations++ and go to Step 2; otherwise, stop the algorithm and output the dominant set of solutions.
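The rank used in Algorithm 3 follows directly from the dominance definition. The sketch below assumes that every objective is minimized (as with KKM and RC) and that each individual is represented by its tuple of objective values; the function names `dominates` and `pareto_ranks` are introduced for this example.

```python
def dominates(u, v):
    """True if u is no worse than v in every objective and strictly better
    in at least one (all objectives are minimized)."""
    return all(x <= y for x, y in zip(u, v)) and any(x < y for x, y in zip(u, v))


def pareto_ranks(objectives):
    """Rank = 1 + number of individuals that dominate this one,
    so every nondominated individual has rank 1."""
    return [
        1 + sum(dominates(other, obj) for other in objectives)
        for obj in objectives
    ]
```

The individuals with rank 1 form the dominant solution set of Step 3; sorting by rank and truncating to popsize corresponds to Step 4. The quadratic cost of this direct computation is acceptable for the population sizes used in the experiments.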

Table 2: NMI of HSCDA, GN, FN, and BGLL in four real networks.

Table 3: $Q$ value of HSCDA, GN, FN, and BGLL in four real networks.