A Unified Bayesian Model for Generalized Community Detection in Attribute Networks

Identification of community structures and the underlying semantic characteristics of communities are essential tasks in complex network analysis. However, most methods proposed so far are applicable only to assortative community structures, that is, more links within communities and fewer links between different communities, and so ignore the rich diversity of community regularities in real networks. In addition, node attributes, which provide rich semantic information about communities and networks, can facilitate in-depth community detection beyond structural information. In this paper, we propose a novel unified Bayesian generative model to detect generalized communities and provide semantic descriptions simultaneously by combining network topology and node attributes. The proposed model is composed of two parts closely correlated by a transition matrix; we first apply the concept of a mixture model to describe network regularities and then adjust the classic Latent Dirichlet Allocation (LDA) topic model to identify communities semantically. Thus, the model can detect broad types of network structure regularities, including assortative structures, disassortative structures, and mixture structures, and provide multiple semantic descriptions for the communities. To optimize the objective function of the model, we use an effective Gibbs sampling algorithm. Experiments on a number of synthetic and real networks show that our model has superior performance compared with some baselines on community detection.


Introduction
With the advent of the era of big data and the diverse channels for acquiring data, we have obtained a large amount of data from complex systems in the real world [1]. In particular, we can obtain not only diversified entities in complex systems but also a variety of related descriptions (attributes) of them. Attributed complex networks are usually used to analyze and study these data [2,3]. Taking social systems as an example, nodes denote individuals and edges represent interactions between them. At the same time, individuals have personal information about gender, age, country, job, race, and so on, which represent their unique attributes. The sufficient and effective application of structural and attribute information is of great value for complex network analysis.
At present, exploring the structural regularities and functions of the network is a significant part of complex network analysis [4]. One of the most essential tasks is community detection. It is believed that nodes within the same community typically have similar structural characteristics and properties. The detection of communities or modules in a network is conducive to understanding organization rules of complex networks, exploring latent patterns, and predicting the behavior of complex systems. A number of successful community detection approaches have been proposed, which fall into different categories, such as hierarchical clustering algorithms [5,6], modularity optimization approaches [7], statistical inference [8][9][10], spectral algorithms [11][12][13][14], generative models [15][16][17][18], and Markov dynamic algorithms [19][20][21]. For a review, readers can refer to [22]. However, most conventional community detection methods only consider the network structure and ignore the attributes of nodes. In fact, the attributes of nodes help to improve the performance of community detection, because nodes with similar attributes tend to belong to the same community [23,24]. Unlike network structures, which specify node connectivity, node attributes provide the semantics of nodes and of the underlying network [15]. Therefore, when the nodes in the network are divided into different communities, the node attributes in the same community can reveal the community semantics, which is somewhat similar to Latent Dirichlet Allocation (LDA).
Thus, the missing structural information can be supplemented and more in-depth community detection can be carried out when semantic information and structural information are used complementarily. Recently, some methods have also been proposed to combine the attributes and structural information for better community detection. They include heuristic-based methods [3,25] and probabilistic inference-based methods [26,27]. In addition to yielding better community detection results, the node attributes also provide semantic descriptions of the communities. These descriptions help to reveal why certain nodes are divided into a group and to understand the functions of communities. Therefore, detecting communities and identifying their underlying semantics together make complex network analysis more meaningful. Some methods have been developed in [15,28].
Most methods that have been proposed for community detection are appropriate only for assortative community structure; i.e., the nodes within a community are densely connected [22,29]. They usually assume that such a structural regularity exists in the target network. However, the assumption may not always correspond to the true intrinsic structure of the network, which limits the applicability of the existing methods. Beyond that, there are other important types of structural regularities in real networks, and a network may contain multiple structures simultaneously: for example, disassortative (bipartite) structure [30], in which most of the edges run between different communities, and mixture structure, which contains both assortative and disassortative structures. Due to the rich diversity of community regularities in real-world networks, there may be several unknown types of structures in a network. Therefore, methods are needed that adapt to these realistic situations and carry out generalized community detection. In this paper, following [30], we refer to these assortative and disassortative structures in complex networks as generalized communities. Some methods [4] have been proposed to detect generalized communities in complex networks.
In particular, although node attributes may carry essential semantic information of communities, there are few ways to detect generalized communities, that is, detecting broad types of network structural regularities and combining network structures and attributes. Chen et al. [31] developed a Bayesian nonparametric attribute (BNPA) model and explored various types of network structures, but the model did not provide multiple semantic descriptions of the communities.
Given the rich diversity of community regularities in real networks, node attributes can both improve the quality of generalized community detection and reveal the latent semantic characteristics of communities. Identifying generalized communities and providing their semantic descriptions at the same time is therefore worth studying in complex network analysis. None of the above methods solves this twofold problem. Instead, we propose a unified generative model to detect communities in a wide variety of network structures without any prior knowledge of the type of intrinsic regularities in the networks. We also derive the semantic descriptions of the communities by combining the network structure and attributes at the same time. Our model is composed of two parts closely related by a probability transition matrix. The first is the topology part, in which communities are described based on a mixture model, assuming that nodes in the same group have similar link patterns (no matter whether there are more links within the communities or between communities). The second is the attribute part, in which semantic information is identified by the classic topic model (LDA) [23]. We assume that each community has several topics; i.e., a distribution over topics exists in each community. A probability transition matrix is used to reveal the potential correlations between topics and communities. It can handle the problem that the topics from attributes and the communities from networks are not well matched. We finally use a Gibbs sampling algorithm to optimize the objective function. Extensive experiments on a number of synthetic and real networks have shown that our model performs better than some baselines on community detection.
In summary, the contributions of this paper are as follows:
(i) To our knowledge, this is the first work to propose the generalized community in attribute networks, in which the nodes share link patterns with others and semantic similarity in the network.
(ii) We propose a unified generative model to analyze attribute networks and detect the generalized community structure as well as its semantic description; it can describe the internal relationship between the topological structure and the node attributes of the network.
(iii) We also develop an effective Gibbs sampling algorithm, and experiments show its better performance compared with some baselines.

Related Work
To explore the network structural regularities, some methods for detecting generalized communities have been proposed. Recently, node attributes have attracted extensive attention in complex network analysis. Newman and Leicht [30] developed a mixture model to explore the network structure with only links. In this method, the nodes with the same link patterns were divided into the same groups. It modeled the relationships between communities and nodes. The probability that a node was connected to other nodes in the network was related to the community to which the node belonged. Closely connected nodes may not belong to the same community. Thus, a broad range of structural signatures could be explored without any prior assumptions about the structure of the network. Hua-Wei et al. [4] focused on identifying the intrinsic structural rules in networks. In this model, the nodes within the same groups had a similar link preference to other groups. A block matrix was defined to denote the probability that a randomly selected edge linked two distinct groups. It could detect broad types of structural regularities by modeling network structures. There were also several methods for content analysis, such as Latent Dirichlet Allocation (LDA) [23]. This method focused on node attributes and identified the sets of nodes whose attributes were similar. Several community detection approaches combining network topologies and node attributes have also been proposed. Some methods only used node attributes to improve the performance of community detection, while others provided the semantic descriptions of communities. Ruan et al. [25] proposed a method for determining the strength of the edges between nodes using content information, which is also applicable to graph clustering. Yang et al. [27] used a discriminative model that combines node attributes and network topologies to detect communities.
However, this method focused on community detection without describing the relevant attributes of each community. It did not provide a semantic description of the community. Pool et al. [28] proposed a heuristic method to detect communities by optimizing the community scores. This heuristic method reported too many relatively small communities, some of which had only two or three nodes. Chakraborty and Sycara [32] developed a model based on nonnegative matrix trifactorization to detect communities by modeling network structure and contents. However, this method mainly used the additional attribute information to identify communities and failed to infer the relationship between communities and attributes. Chen et al. [31] developed a Bayesian nonparametric attribute (BNPA) model to explore structural regularities in networks. This model combined network structures and node attributes for community detection and assumed that network structures and node attributes shared the same community memberships; i.e., attribute clusters and network communities were the same. However, attribute clusters and community structures do not always align, and the model could not give multiple semantic descriptions of communities. Wang et al. [33] proposed a model that combined network topology and node semantic information to identify communities. It integrated topology-based community memberships and node-attribute-based community attributes (or semantics) in the framework of nonnegative matrix factorization. The model was based on two important observations: if the community memberships of two nodes are similar, they will have a high probability of being connected by an edge, and if their attributes are related to the underlying community attributes, they will likely be in the same community. The use of node contents improved the result of community detection and provided a semantic description of the resultant network communities. He et al. [15] introduced a generative model consisting of two parts, one for communities and the other for semantics, exploring the network structure and interpreting the functional modules semantically. The method was applicable only to networks with assortative structures and failed to detect generalized communities.
More discussions on attribute networks can be found in related surveys by Bothorel et al. [34] and Chunaev [35].

Model Formulation
In this section, we give a formal description of the proposed model, i.e., Generalized Semantic Community (GSC) identification, with the purpose of generalized community detection and semantic identification in the networks.

Notations.
We define an attributed network G with N nodes and M attributes as an N × N adjacency matrix A and an N × M attribute matrix X. Nodes are denoted v_i (i = 1, ..., N) and attributes ω_t (t = 1, ..., M). Our model is specified by three types of quantities:
(i) Observed quantities: the number of groups K, the number of nodes N, the number of attributes M, the adjacency matrix A, and the attribute matrix X.
(ii) Latent quantities: group labels z, where z_i denotes the community membership of node v_i, and the content memberships g, where g_it denotes the topic label of node v_i's t-th attribute.
(iii) Model parameters: π = (π_r)_{1×K}, where π_r is the fraction of nodes in community r; θ = (θ_rj)_{K×N}, where θ_rj is the probability that a certain node in community r connects to node v_j; η = (η_rs)_{K×K}, where η_rs is the probability that a node is in the s-th content cluster given that its community label is r; and ϕ = (ϕ_st)_{K×M}, where ϕ_st is the probability that the s-th topic generates the t-th attribute.
Table 1 shows the notations of the parameters.

Problem Definition.
Given the rich diversity of community regularities in real networks, encoding network structure and node attributes simultaneously and providing semantic descriptions of the resultant network communities remain open problems in community detection. Most existing methods ignore some aspect of these problems. Given an attributed network, the goal is twofold:
(i) How to divide the nodes into communities and content clusters, no matter what kind of structural regularity the network has?
(ii) How to identify the correlations between communities and attribute topics to provide the best semantic descriptions of communities?
Formally, given the adjacency matrix A, the attribute matrix X, and the number of communities K, our goal is to obtain the community assignment z_i for each node v_i and the topic distribution of each community.

Model Definition.
To achieve the objective, we define a unified Bayesian probabilistic generative model to handle topologies and node attributes at the same time. Our goal is to divide the nodes in networks with extensive structural regularities into K communities and K content clusters, respectively, by using the adjacency matrix A as well as the attribute matrix X. To model network structure, we assume that the nodes in the same group have similar link patterns; i.e., the probability of a node connecting to other nodes in the network is the link tendency between the community to which the node belongs and the rest of the nodes. We also use a modified LDA model for node attributes. A transition matrix is used to jointly model network structures and node attributes; it connects network communities and attribute topics. To be specific, a community may be characterized by multiple topics, and the topic of each node attribute is drawn from the topic distribution of the community to which the node belongs. Then, by extracting the latent correlation between network communities and attribute clusters, multiple semantic interpretations can be provided for each community. Figure 1 shows a graphical representation of this model, and the generation process is as follows:
(1) Sample π ∼ Dirichlet(α)
(2) For each community r ∈ 1, 2, . . . ,

Generating Model Parameters.
We introduce a Bayesian treatment into the model generation process. After the number of communities K is given, the model parameters are treated as random variables; we generate the model parameters π = (π_r)_{1×K}, θ = (θ_rj)_{K×N}, η = (η_rs)_{K×K}, and ϕ = (ϕ_st)_{K×M}, respectively, from Dirichlet distributions. The parameters are generated based on hyperparameters, denoted α, β, c, and ξ. The generative process is as follows.
Table 1: Notations.
Latent quantities
  g_it    Topic label of node v_i's t-th attribute
Model parameters
  π_r     The fraction of nodes in community r
  θ_rj    The probability that a certain node in community r connects to node v_j
  η_rs    The probability that node v_i is in the s-th content cluster given that its community label is r
  ϕ_st    The probability that the s-th topic generates the t-th attribute of node v_i
Hyperparameters
  α, β, c, ξ    Noninformative priors of the corresponding model parameters

We use Dirichlet distributions with hyperparameters α, β, c, and ξ as the prior distributions of the model parameters π, θ, η, and ϕ, respectively, where Γ(·) represents the Gamma function. All the communities share the same β, and all the topics share the same c and ξ.
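For reference, the Dirichlet priors described above can be written out as follows. This is a sketch that assumes symmetric Dirichlet distributions with scalar hyperparameters α, β, c, and ξ, which matches the statement that all communities share the same β and all topics share the same c and ξ; the exact form used by the authors may differ.

```latex
\begin{aligned}
p(\pi \mid \alpha) &= \frac{\Gamma(K\alpha)}{\Gamma(\alpha)^{K}} \prod_{r=1}^{K} \pi_r^{\alpha-1}, \\
p(\theta \mid \beta) &= \prod_{r=1}^{K} \frac{\Gamma(N\beta)}{\Gamma(\beta)^{N}} \prod_{j=1}^{N} \theta_{rj}^{\beta-1}, \\
p(\eta \mid c) &= \prod_{r=1}^{K} \frac{\Gamma(Kc)}{\Gamma(c)^{K}} \prod_{s=1}^{K} \eta_{rs}^{c-1}, \\
p(\phi \mid \xi) &= \prod_{s=1}^{K} \frac{\Gamma(M\xi)}{\Gamma(\xi)^{M}} \prod_{t=1}^{M} \phi_{st}^{\xi-1}.
\end{aligned}
```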

Generating Observed and Latent Quantities.
First, we sample the latent community membership z_i of every node v_i independently from a multinomial distribution with parameter π, so that p(z_i = r | π) = π_r. Once the latent community membership z_i of node v_i is known, we generate each edge a_ij with probability p(a_ij = 1 | z_i = r, θ) = θ_rj, where θ_rj denotes the "preference" of any node in community r to link to node v_j, regardless of which community node v_j is in. Nodes in the same community have a common link "preference" without any assumptions about network structure regularities. Thus, generalized communities can be detected. Then, we sample the latent topic membership g_it of each attribute ω_t of node v_i independently from a multinomial distribution, p(g_it = s | z_i = r, η) = η_rs. Here η_rs denotes the probability that node v_i is in the s-th semantic topic given that it belongs to the r-th community; that is, η_rs provides the transition from communities to topics, so the topic assignment and the community membership of a node do not always match. This is why a community may have several topics.
We generate each attribute ω_t from its topic, p(ω_t | g_it = s, ϕ) = ϕ_st. The probability of the network G with N nodes and M attributes is then the product of these factors over all nodes, edges, and attributes, subject to the normalization constraints ∑_r π_r = 1, ∑_j θ_rj = 1, ∑_s η_rs = 1, and ∑_t ϕ_st = 1.
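The generative process described above can be sketched in a few lines of NumPy. This is an illustrative sketch, not the authors' implementation: the function name `generate_gsc`, the fixed numbers of out-links and attribute tokens per node (`links_per_node`, `attrs_per_node`), and the use of count matrices for A and X are all assumptions made here so that the example runs on its own.

```python
import numpy as np

def generate_gsc(N, M, K, alpha=1.0, beta=1.0, c=1.0, xi=1.0,
                 links_per_node=5, attrs_per_node=10, seed=0):
    """Sample a toy attributed network from a GSC-style generative process."""
    rng = np.random.default_rng(seed)
    # Model parameters drawn from symmetric Dirichlet priors (assumption).
    pi = rng.dirichlet(np.full(K, alpha))             # community proportions
    theta = rng.dirichlet(np.full(N, beta), size=K)   # K x N link preferences
    eta = rng.dirichlet(np.full(K, c), size=K)        # K x K community -> topic
    phi = rng.dirichlet(np.full(M, xi), size=K)       # K x M topic -> attribute

    z = rng.choice(K, size=N, p=pi)                   # community of each node
    A = np.zeros((N, N), dtype=int)                   # out-link counts
    X = np.zeros((N, M), dtype=int)                   # attribute counts
    g = np.zeros((N, attrs_per_node), dtype=int)      # topic of each token
    for i in range(N):
        # Out-links of v_i follow the link preference of its community.
        targets = rng.choice(N, size=links_per_node, p=theta[z[i]])
        np.add.at(A[i], targets, 1)
        # Each attribute token first picks a topic, then an attribute.
        g[i] = rng.choice(K, size=attrs_per_node, p=eta[z[i]])
        words = np.array([rng.choice(M, p=phi[s]) for s in g[i]])
        np.add.at(X[i], words, 1)
    return A, X, z, g
```

For example, `A, X, z, g = generate_gsc(N=20, M=15, K=3)` returns a 20 × 20 link-count matrix, a 20 × 15 attribute-count matrix, and the latent labels that produced them.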

Model Optimization
Exact inference of the latent variables z and g is intractable, so we use Gibbs sampling [36] to sample the latent variables z and g and slice sampling [37] to sample the hyperparameters (α, β, and c). The sampling equations are written in terms of the following counts: m_r^j denotes the number of outlinks whose tail nodes belong to community r and whose head node is v_j; n_r denotes the number of nodes in community r; M_s^t denotes the number of occurrences of attribute ω_t that are generated by topic s; and N_r^s denotes the total number of times topic s is generated by community r. The inference process is given in Algorithm 1.

Sampling z.
For each node v_i, given the community assignments of all other nodes, the probability that z_i takes community r is expressed in terms of the following quantities: L_i denotes the outlinks of node v_i; m_{r,i} denotes the number of outlinks from community r, excluding node v_i; m_{r,i}^j denotes the number of outlinks from community r, excluding edge a_ij; n_r denotes the number of nodes in community r; N is the total number of nodes; g_i denotes the topic labels of the attributes of v_i; m_i denotes the attributes of v_i; g_it denotes the topic of v_i's t-th attribute; M_{s,i}^t denotes the number of node attributes whose topic is s, excluding v_i's attribute ω_t; M_{s,i} denotes the number of node attributes whose topic is s, excluding the attributes of v_i; N_{r,i}^s denotes the total number of times topic s is generated by community r, excluding v_i; and N(is) denotes the attributes of v_i whose topic is s.

Sampling g i .
For node v_i in community r, given the topic assignments of all attributes except the attribute ω_t, the probability that the attribute ω_t takes topic s is expressed in terms of the following quantities: M_{s,it}^t denotes the number of occurrences of ω_t whose topic is s, excluding v_i's attribute ω_t; M_{s,it} denotes the number of all attributes whose topic is s, excluding v_i's attribute ω_t; N_{r,it}^s denotes the number of node attributes whose topic is s and whose nodes belong to community r, excluding v_i's attribute ω_t; and N_{r,it} denotes the number of node attributes that belong to community r, excluding v_i's attribute ω_t.
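To make the role of these counts concrete, the sketch below shows one way a topic could be resampled from them. It assumes the standard collapsed, LDA-style conditional, in which the probability of topic s is proportional to (N_{r,it}^s + c) / (N_{r,it} + K·c) × (M_{s,it}^t + ξ) / (M_{s,it} + M·ξ); this form is an assumption based on the count definitions, not the equation from the paper, and the function and argument names are hypothetical.

```python
import numpy as np

def sample_topic(t, r, M_ts, M_s, N_rs, N_r, c, xi, rng):
    """Resample the topic of attribute t for a node in community r.

    All counts must already exclude the attribute being resampled:
      M_ts[s, t]  occurrences of attribute t assigned to topic s (K x M)
      M_s[s]      attributes assigned to topic s                 (K,)
      N_rs[r, s]  attributes of community r assigned to topic s  (K x K)
      N_r[r]      attributes belonging to community r            (K,)
    """
    K, M = M_ts.shape
    # How well topic s explains attribute t (smoothed by xi).
    word_term = (M_ts[:, t] + xi) / (M_s + M * xi)
    # How strongly community r prefers topic s (smoothed by c).
    topic_term = (N_rs[r] + c) / (N_r[r] + K * c)
    p = word_term * topic_term
    p = p / p.sum()
    return int(rng.choice(K, p=p)), p
```

In a full sampler, the counts would be decremented for the current assignment before the call and incremented for the new topic afterwards.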

GSC Models.
Our model can also handle networks with only edges or only node attributes.

GSC-Link.
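The probability when only the links are considered can be written as p(A, z | α, β) = p(A | z, β) · p(z | α). The probability of node v_i choosing community k is obtained from the sampling equation for z by dropping the attribute terms.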

GSC-Attr.
The probability when only the attributes are considered can be written as p(X, z, g | α, c, ξ) = p(X | g, ξ) · p(g | z, c) · p(z | α). (14) The probability of node v_i choosing community k is obtained from the sampling equation for z by dropping the link terms. The probability of the attribute ω_t choosing topic s is the same as in GSC.

Experiments and Analysis
First, we experiment on three synthetic networks with different structural regularities (i.e., assortative, disassortative, and mixture structures) to evaluate the quality of community detection and analyze the advantage of modeling networks with a rich diversity of structures. Then, we assess the interpretability of communities in an online music system. Finally, we evaluate on real networks and compare with state-of-the-art methods.
Algorithm 1
Require: adjacency matrix A, attribute matrix X, iterations T, and specified group number K
Ensure: group assignment z
initialize α, β, c, ξ; set n_r, m_r, m_r^j, M_s, M_s^t, N_r, and N_r^s to 0
initialize each node's latent community label z_i
// sampling z, g_i, α, β, c, and ξ
for te = 1 to T do
  for i = 1 to N do
    // get the current community assignment of node v_i
    update n_r, m_r, m_r^j, M_s, M_s^t, N_r, and N_r^s
    for k = 1 to K do
      compute probability p(z_i = k) according to equation (8

Complexity
As the ground truth of communities in the networks is known, we use Normalized Mutual Information (NMI) [38] to compare all the methods. Here G = (G_1, G_2, . . . , G_k) is the ground truth of communities in the network, and G′ = (G′_1, G′_2, . . . , G′_k) is the set of communities identified by the method. H(G) and H(G′) are the entropies of G and G′, respectively, and MI(G, G′) denotes the mutual information between them; NMI normalizes MI(G, G′) by these entropies. The higher the NMI, the better the result.
To describe parameter estimation in GSC more adequately, we show how the likelihood function changes with the number of iterations in Figure 2(a), and each curve in Figure 2(b) shows how the log-likelihood on Cora changes with one of the four hyperparameters when the other hyperparameters are determined by slice sampling. It can be seen that the log-likelihood of GSC converges quickly, at about the 150th iteration. The log-likelihood is less sensitive to α, β, and ξ, whereas c makes a big difference.
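As a concrete illustration of the measure, the following sketch computes NMI for two flat partitions. It assumes the common normalization NMI = 2·MI(G, G′) / (H(G) + H(G′)); other normalizations exist, and the one used in [38] may differ. The function name `nmi` is chosen here for illustration.

```python
import math
from collections import Counter

def nmi(labels_true, labels_pred):
    """Normalized mutual information between two label sequences."""
    n = len(labels_true)
    count_true = Counter(labels_true)
    count_pred = Counter(labels_pred)
    count_joint = Counter(zip(labels_true, labels_pred))

    # Entropies H(G) and H(G') of the two partitions.
    h_true = -sum((c / n) * math.log(c / n) for c in count_true.values())
    h_pred = -sum((c / n) * math.log(c / n) for c in count_pred.values())
    # Mutual information MI(G, G') from the joint label counts.
    mi = sum((c / n) * math.log(c * n / (count_true[a] * count_pred[b]))
             for (a, b), c in count_joint.items())

    denom = h_true + h_pred
    # Both partitions trivial (a single group each): treat as identical.
    return 1.0 if denom == 0 else 2.0 * mi / denom
```

Because NMI depends only on how nodes are grouped, relabeling the communities leaves the score unchanged, while a partition that is independent of the ground truth scores 0.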

Experiment on Synthetic Networks with Different Structure Regularities.
First, we conduct experiments on synthetic networks to evaluate the quality of community detection. Then, we evaluate on real networks and compare with state-of-the-art methods. The first synthetic network is a random network generated with Newman's method [15]. The network consists of 128 nodes divided into 4 disjoint communities with z_in + z_out = 16, where z_in is the number of edges linking a node to nodes within its community and z_out is the number of edges linking it to nodes in other communities. As the within-community link probability (z_in/32) is greater than the between-community link probability (z_out/96), the network has an assortative structure. For every node v_i, we generate a 4h-dimensional binary attribute vector x_i that divides the nodes into 4 content clusters with h_in + h_out = 16. In this paper, h_in denotes the number of attributes of node v_i with x_it = 1 that are associated with its own community, and h_out (noisy attributes) denotes the number of attributes of node v_i with x_it = 1 that correspond to the other communities. In particular, we generate the ((s − 1) × h + 1)-th to (s × h)-th attributes for each node in the s-th cluster by a binomial distribution with mean ρ_in = h_in/h and generate the remaining attributes by a binomial distribution with mean ρ_out = h_out/(3h).
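The attribute construction above can be sketched as follows. The function name `make_attributes` is hypothetical, clusters are indexed from 0 rather than 1, and each entry is drawn as an independent Bernoulli variable, which is how the binomial description is interpreted here.

```python
import numpy as np

def make_attributes(labels, h, h_in, h_out, seed=0):
    """Binary attribute matrix with one block of h attributes per cluster."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    k = int(labels.max()) + 1              # number of content clusters
    rho_in = h_in / h                      # mean of own-cluster attributes
    rho_out = h_out / ((k - 1) * h)        # mean of noisy attributes
    # Start with the noise probability everywhere, then overwrite each
    # node's own block of h attributes with the in-cluster probability.
    P = np.full((len(labels), k * h), rho_out)
    for i, s in enumerate(labels):
        P[i, s * h:(s + 1) * h] = rho_in
    return (rng.random(P.shape) < P).astype(int)
```

With 4 clusters, `rho_out` reduces to h_out/(3h) as in the text. Note that when h_in/h equals h_out/(3h), for example h_in = 4 and h_out = 12, the two probabilities coincide and the attributes carry no cluster information.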
We set h = 50 and assume that the topologies and contents share the same membership. The node attribute matrix and the community attribute matrix are shown in Figure 3. We first set z_out = 8 and change h_out from 0 to 12 with an increment of 1. We adopt GSC-link, which uses network topology alone, as the baseline method. The other comparison methods are NEMBP [15] and SCI [33], which use both network topologies and attributes. As shown in Figure 4(a), our method can use the complementary structural information in node attributes to improve the quality of community detection when h_out < 12. Even when h_out = 12, where the cluster structure of the node attributes disappears, our model GSC gets better results than the baseline method GSC-link. Then we set h_out = 8 and change z_out from 0 to 9 with an increment of 1. As shown in Figure 4(b), our method also performs better than GSC-attr. In general, the proposed method obtains better community detection results by using both topology and content information. The second synthetic network is Newman's model [30] with 108 nodes. It contains 8 keystone nodes without community labels, and the other nodes link to them according to their community membership. The remaining 100 nodes are equally divided into 4 groups, and the edges between these nodes are placed at random, with the mean degree of every node being 10. In particular, each community has a unique signature set of keystones, and only the link pattern to keystones can identify the community; thus the structure of this network is neither assortative nor disassortative.
First, we study the influence of noisy attributes on community detection. ρa represents the proportion of noisy attributes of each node. We change ρa from 0 to 1 with an increment of 0.1. The node attribute matrix is shown in Figure 5. As ρa becomes larger, the attributes associated with each community become blurred and provide less discriminative information about the network communities. As shown in Figure 6(b), the nodes are divided into almost 3 communities when only the network structure is considered. The result gets better when node attributes are used, as shown in Figure 6(c). As shown in Figure 7(a), our method outperforms GSC-link (even when ρa reaches 0.7) and significantly outperforms SCI and NEMBP. This shows that the quality of the identified communities improves when node attributes and network structures are combined. Our model GSC is able to fully use network structure information even if the node attribute information is erroneous. As ρa increases beyond 0.7, GSC performs worse, which also reveals that node attributes of very poor quality can degrade community detection. Figure 7(a) also shows that NEMBP performs worse than GSC-link once ρa reaches 0.2 and that SCI always performs much worse. Figures 6(b) and 6(c) represent the results of GSC and NEMBP, respectively, when ρa is 0.5. It can be concluded from the above analysis that GSC is more capable of identifying networks with mixed structural regularities than SCI and NEMBP.
In this network, the propensity to link to the unique set of keystone nodes determines the group membership. We change the network structure by varying the keystone links of each group from 100 to 10 with a decrement of 10. We set the probability of noisy attributes ρa = 0.5. We adopt our model with only attributes as the baseline method and NEMBP for comparison. As can be seen in Figure 7(b), our method performs well even when there are only 30 keystone links.
The new model is strongly robust to changes in network structure. However, the erratic result of NEMBP indicates that it does not work very well for this type of network. The third network [31] has both a community and a bipartite structure, with 100 nodes and 402 edges, as shown in Figure 6(e). The 100 nodes are equally divided into 5 groups, three of which form an assortative structure, whereas the remaining two form a bipartite structure. For each node v_i, we generate 5 × 50-dimensional binary attributes; each of the communities and nodes has 50-dimensional relevant attributes. We change the probability of noisy attributes ρa from 0 to 1 with an increment of 0.1. As shown in Figure 7(c), our model always gets better results than NEMBP and SCI. Even when ρa = 0.6, the quality of the identified communities is improved compared with GSC-link, and the NMI is almost 1. Figure 6(f) shows the result of NEMBP when ρa = 0.6. Its performance is much worse than that of GSC.

Evaluating Efficiency.
In this part, we evaluate the efficiency of community detection methods by measuring each method's running time on synthetic networks as we increase the network size. The comparison methods are NEMBP and SCI. The synthetic networks include assortative and disassortative structures. The edges are placed uniformly at random within and between communities in fixed numbers. The number of edges within each community is set to 1,200 and the number of edges between a community and the others is set to 600. They form a community structure. The rest of the communities are divided into pairs; the number of edges between the two communities in each pair is set to 2,400, and the number of edges between communities in different pairs is set to 1,200. Each pair of groups forms a bipartite structure. The maximum number of nodes in our synthetic networks is 7,000, with 126,000 edges and 700 attributes. We change the scale of the network (Syn-100, Syn-500, Syn-1000, Syn-2000, Syn-3000, Syn-5000, and Syn-7000). The synthetic network of 100 nodes is the third network that we used above. For each synthetic network, we generate 10 K-dimensional binary attributes. We set the ratio of noisy attributes to 0.5. Figure 8 shows the running time of the methods versus the network size. Our method is the fastest of the three. When run to convergence, our method takes about 5 minutes on Syn-7000. For NEMBP, we set the number of iterations to 10; its running time reaches 11 hours even on Syn-2000. The running time of SCI is more than 19 hours.

A Case Study.
In this paper, we use η = (η_rs)_{K×K} to correlate the communities and attribute topics, and here we evaluate whether it contributes to the descriptions of the communities. We analyze the underlying semantics of communities and provide specific descriptions for some of the communities detected by GSC. For this purpose, we use the LASTFM dataset, a social network from the online music system Last.fm. It includes 1,892 users and 11,946 attributes describing users' favorite music singers and tag assignments. In this network, the ground truth of the community partition is unknown, so we detect 38 communities as in [15]. We find that the communities may have one main topic or multiple topics; a detailed analysis of three detected communities with different topics is shown in Figure 9. The first example, in Figure 9(a), is a community with one main topic. It appears to consist of fans of popular female singers like "Rihanna" and "Britney Spears." Their music is "pop," "rock," and "dance." They are both "female vocalists" and "sexy." The community in Figure 9(b) is a group of fans of "hardcore punk" music. Hardcore punk is also labeled as hard rock. Glam-sleaze music is a derivative of hard rock and alternative rock coming from a post-punk band. Grunge music is a genre of indie rock which evolved from hardcore punk. Emotionally-Driven Hardcore Punk (EMO) is an indie rock style, and Screamo originated from EMO. The last community has two major topics, shown in Figures 9(c) and 9(d), and consists of fans of electronic music. One topic is mainly about Electronic Body Music (EBM), which combines elements of industrial music and electronic punk music. The other topic is about IDM. This kind of music was created in the late 80s, accompanied by hard-edge dance and slow music.
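Reading the main topics of each community off the transition matrix η can be sketched as below. The function name `main_topics` and the cutoff `threshold` are hypothetical; the text does not state how main topics were selected, so the fixed cutoff is only one plausible rule.

```python
import numpy as np

def main_topics(eta, threshold=0.3):
    """Map each community r to the topics s with eta[r, s] >= threshold."""
    eta = np.asarray(eta, dtype=float)
    result = {}
    for r, row in enumerate(eta):
        # Visit topics from most to least probable for this community.
        order = np.argsort(-row, kind="stable")
        result[r] = [int(s) for s in order if row[s] >= threshold]
    return result
```

A community whose row of η is concentrated on one entry gets a single main topic, while a row with two comparable entries yields two, as in the electronic music community described above.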

Experiment on Real Networks.
Cora, Citeseer, Terrorist, and Biology are four real networks with both links and contents that we use in this paper. Cora is a subset of the Cora citation network, including 2,708 published articles and 5,429 edges. Each publication is represented by a 1,433-dimensional binary word vector whose entries indicate the absence or presence of the corresponding words. The publications are divided into seven communities. Citeseer is a subset of the Citeseer citation network. It includes 3,312 published articles and 4,732 edges. Each publication is represented by a 3,703-dimensional binary word vector. The publications are divided into six communities. The Terrorist dataset consists of 1,293 terrorist attacks; each attack is assigned one of 6 labels indicating the type of the attack. Each attack is described by a 106-dimensional binary word vector whose entries indicate the absence or presence of a feature. Biology is a real paper citation network drawn from 435 different biological journals. It contains 10,000 papers connected by links. Each paper is described by a 9,944-dimensional 0/1-valued keyword vector; two papers are connected if one cites the other. There are 435 nodes representing the different biological journals in the network; each paper links to the node of the journal in which it is published. So, the network forms a mixture structure that is similar to the synthetic network of 108. All the papers are split into 435 groups; each group contains the papers published in a certain journal. We also use Syn-2000, which includes both community and bipartite structures. The five networks are shown in Table 2.
We compare our GSC model with methods from three categories: (1) models based only on network structures, that is, GSC-link; (2) models based only on node attributes, such as GSC-attr and LDA; (3) models based on both structures and attributes, such as PCL-DC, NMMA, SCI, and NEMBP. The results of these models on three networks are shown in Table 3. Our model can use the information of network structure and node attributes simultaneously to identify communities. GSC outperforms the other models on Cora and achieves larger NMIs than most of the models on Citeseer and Terrorist. The result of GSC is lower than that of NMMA on Citeseer. This is mainly because, on Citeseer, network structures and node attributes are more likely to share the same community memberships. NMMA assumes that attribute clusters and network communities are the same, so it performs better on Citeseer. Sometimes, the community structure is not obvious when considering only the structural information of the network, and the nodes are divided into communities mainly by their attributes. In this situation, our model can effectively use the information of the attributes. The models based on both structure and attributes usually outperform the models that use only links or only attributes.
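The NMI scores compared above measure the agreement between a detected partition and the ground-truth labels. A minimal sketch of one common definition, NMI = 2 I(X; Y) / (H(X) + H(Y)), is given below; the text does not state which normalization is used in Table 3, so the arithmetic-mean normalization and the function name `nmi` are assumptions.

```python
from collections import Counter
from math import log


def nmi(true_labels, pred_labels):
    """Normalized mutual information between two labelings of the same nodes.

    Uses NMI = 2 * I(X; Y) / (H(X) + H(Y)); the result lies in [0, 1] and
    does not depend on how the community labels are named.
    """
    n = len(true_labels)
    if n == 0 or n != len(pred_labels):
        raise ValueError("labelings must be non-empty and of equal length")

    count_true = Counter(true_labels)
    count_pred = Counter(pred_labels)
    joint = Counter(zip(true_labels, pred_labels))

    # Entropy of each labeling.
    h_true = -sum(c / n * log(c / n) for c in count_true.values())
    h_pred = -sum(c / n * log(c / n) for c in count_pred.values())

    # Mutual information from the joint label counts.
    mutual_info = sum(
        c / n * log(c * n / (count_true[a] * count_pred[b]))
        for (a, b), c in joint.items()
    )

    if h_true + h_pred == 0:
        # Both labelings put every node in a single group.
        return 1.0
    return 2 * mutual_info / (h_true + h_pred)


# The same partition with swapped label names scores 1; a partition that is
# independent of the ground truth scores 0.
same_partition = nmi([0, 0, 1, 1], [1, 1, 0, 0])
independent = nmi([0, 0, 1, 1], [0, 1, 0, 1])
```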

Conclusions
In this paper, we propose a novel Bayesian probability model that detects generalized communities and identifies their semantics by combining network structures and node attributes, and we use an efficient Gibbs sampling algorithm to optimize the objective function. Even if the information in the node attributes is of poor quality, our method can draw on the complementary structural information to get better results. The model assumes that the network structure and node attributes have different hidden variables and adopts a transition matrix to explore the hidden correlation between communities and topics. Thus, it can provide semantic descriptions of communities to better reveal their characteristics. We evaluate our method on a number of real and synthetic datasets and in a case study. The new method can detect various types of network structures and outperforms several state-of-the-art algorithms. Like existing methods, it requires the number of communities to be provided in advance. This is a model selection problem, and we will focus on determining the number of communities automatically in future work.

Figure 9: The examples show the word clouds of the main attributes of communities. The size of each word indicates the probability that it belongs to a topic.

Data Availability
The datasets used to support the results of this study are available from the corresponding author upon request.

Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this paper.