A Topic Space Oriented User Group Discovering Scheme in Social Network: A Trust Chain Based Interest Measuring Perspective

User groups have become an effective platform for information sharing and communication among users of social network sites. In this work, we propose a single-topic user group discovering scheme that comprises three phases: topic impact evaluation, interest degree measurement, and trust chain based discovering. The scheme selects influential topics and discovers users for a topic oriented group. Our main work includes (1) an overview of the proposed scheme and its related definitions; (2) a topic space construction method based on topic relatedness clustering and impact (influence degree and popularity degree) evaluation; (3) a trust chain model that takes the topological information of the user relation network into account from a strength classification perspective; (4) an interest degree (explicit and implicit interest degree) evaluation method based on the trust chain among users; and (5) a topic space oriented user group discovering method that groups core users according to their explicit interest degrees and predicts ordinary users from implicit interest and the user trust chain. Finally, experimental results are given to demonstrate the effectiveness and feasibility of our scheme.


Introduction
User groups in social network sites (SNS) have been garnering increased attention in the fields of topic related opinion expression and information sharing [1]. Commonly, a user group is built around a set of related topics that interest all of its members. Individual users who maintain high interest in these topics join the group and interact conveniently with other members. By joining a topic associated group, users can express their attitudes, discuss with other members, and share other information related to the topic. Within a user group, group related information is shared more rapidly, so a user group impacts users more deeply than other organizations do. Generally, a user group that is related to more influential topics obtains more attention and has a larger impact in the social network. Thus, how to discover influential user groups that attract more users is a significant problem for social network analysis.
(1) Theoretical Background and Consideration. Topic is the primary factor for a user group. The topics that users discuss and communicate around should be closely related, so that all members of the group share most of their interests. In addition, the topics must have large impacts in SNS to attract the interest of a mass of users to the user group. From this point of view, it is indispensable to find a set of topics that have close relations and large impacts in SNS.
Interest reflects the sense of concern and curiosity about the topics that have the power of attracting or holding users' attention in SNS. Thereby, a user's interest in specific topics is another significant factor for evaluating whether the user is likely to join the group. That means the more he/she is interested in the topics, the more likely he/she is to be a member of the user group.
Additionally, information is propagated in a fission pattern over a large-scale social relation network formed by users' relationships, such as friend relationships and follow relationships [2]. This propagation pattern also facilitates detecting and organizing user groups, since users who have closer relationships are more likely to maintain similar interests. That means user relationships in a social network provide an important parameter for user group discovering.
Based on the above considerations, the aim of our work is to find the most influential topics that are related to each other in SNS and then organize users who are interested in these topics into groups, so that information is shared through close user relationships.
Currently, scholars cluster topics mainly through topic detection technology to construct a network model for user grouping [3][4][5]. However, user group organization is based not only on users' behaviors, such as sending, forwarding, or accepting, but also on the implicit effects of social relationships in SNS [6]. Information is more likely to be shared among mutually trusted users. Therefore, the effect of social relationships cannot be ignored in user clustering and information propagation. Many existing studies have shown that relationships among people have a key influence on information propagation [7]. The interests of users, which cause people to be clustered in user groups [8], can also be transferred through their trust relationships. Thus, relationships are critical for discovering user groups accurately in SNS.
Trust reflects a user's confidence or faith in others based on his past experiences or other factors. It has been used to measure the closeness of relationships among users and to calculate related reliability in social networks [9,10]. Users can pursue their favorite items, news, and other information related to the topics, and they are also concerned with, or even accept, information that is related to them or sent by trustworthy persons in SNS [11]. In our consideration, a user group is formed by the trustworthiness among users and essentially reflects their confidence towards a specific topic. Such point-to-point trust relationships bring users together and form a group under their common interests. Thereby, we can accurately discover qualified users to organize a topic user group based on the trust chain in SNS.
(2) Main Contributions of Proposed Work. In this paper, we propose a topic space oriented user group discovering scheme based on trust in social networks, which is composed of three phases: topic space detection, interest evaluation, and user grouping based on trust chain. Firstly, we give an overview of our scheme and the related definitions, that is, the graph model of social network, topic space, user interest, and trust chain. Secondly, we propose a detection method for the core topic set (named topic space in this work) through topic impact evaluation and relatedness clustering. Thirdly, we present the user interest evaluation method, including explicit and implicit interest degree. Then, we address a user grouping method for discovering users based on trust chain in SNS. Finally, we perform experimental analysis to verify the effectiveness and feasibility of our method. The main contributions of this work include (1) putting forward a topic space construction method based on topic relatedness clustering and impact evaluation, including influence degree and popularity degree evaluation; (2) setting up a trust chain model that takes the topological information of the user relation network into account from a strength classification perspective; (3) presenting a user interest degree evaluation method that involves explicit and implicit interest degree calculation based on trust chain prediction; (4) proposing a topic space oriented user group discovering method. With this method, core users who have large explicit interest in the topic space are grouped according to explicit interest evaluation, and ordinary users who have implicit interest are further estimated based on the trust chain in the social network.

Community Discovering in Network.
Community discovering in network environments is a traditional research area [12][13][14][15]. In many existing works, networks including SNS, P2P, or distributed systems are characterized by graph theory: machines or users are regarded as a set of vertices, and their relationships or communication interactions are described as a set of edges. On this basis, communities with features of small-world networks [16] or scale-free networks [17] can be defined as induced subgraphs of the network graph. There are dense and tight links among vertices (nodes) inside a community, while links to vertices outside the community are sparse and loose [14]. Essentially, community discovering finds the vertices that have relatively dense links according to the topological structure and graph features of the network. Many efforts have been made in network community discovering. Girvan and Newman proposed a method of detecting communities, called the GN algorithm, which uses the property of community structure. In this structure, network nodes are joined together to form tightly knit groups, while there are only looser connections between groups [12]. Besides, other methods such as kNN [18] and the k-means clustering algorithm [19] are also widely used for community discovering.

Influence Evaluation in Social Network.
In the area of impact evaluation, much research has been done on influence maximization in social networks [20]. Chen and his colleagues proposed a series of works on influence maximization, such as greedy algorithm evaluation [21], influence diffusion dynamics and influence maximization [22], and scalable influence maximization under the linear threshold model [23]. In our consideration, the relations among topics and their popularity are two significant aspects of influence evaluation. Since topics are not independent in SNS, there are inherent relations among topics, and influential topics are likely to link to other, more influential topics. In addition, popularity is another explicit indicator that shows the influence of a topic directly. Therefore, we take these two aspects (the relations among topics and their popularity degrees), which are not addressed in traditional works, into account to evaluate influence in social networks.

User Interest Analysis.
User interest analysis has been used in many fields, such as online user clustering, recommender systems, and service quality evaluation. Zeng et al. [24] proposed a user interest analysis method based on user activities on the web. Li et al. [8] addressed a method based on user interest popularity distribution in recommender systems. Hegde et al. [25] presented an approach that automatically assigns tags to places based on interest profiles and visits or check-ins of users at places. Most of these traditional studies are based on users' explicit data, such as behaviors, profiles, or other related data. That means only explicit interest was analyzed. However, many users keep their interests implicit and do not express them explicitly in SNS. These users would be lost under the traditional measuring methods, so it is insufficient to discover users based only on their explicit interest. Implicit interest is another important criterion for finding potential users, which is why we take it into account for user interest analysis in this paper.

Trust Computation.
There has been a large amount of research on trust and reputation in the past decades [10,26,27]. Many methods, such as summation/average/iteration of past trust ratings [26] and Bayesian models [27], have been proposed to optimize one or more aspects of trust computation performance. In addition, the weighted average of ratings is a typical and widely used method in trust computation [28]. In this method, all trust ratings about the target object are aggregated and a weighted average of the aggregation is calculated as the trust value. In social network research, there are also many works on trust computing. Javier Ortega and colleagues propose a method to compute a ranking of the users in a social network and propagate both positive and negative opinions of the users [29]. The opinions of each user about others can influence their global trust scores. Qureshi et al. propose a decentralized framework and the related algorithms for trusted information exchange and social interaction among users based on the dynamicity aware graph relabeling system [30]. In [31], an extended Advogato trust metric is proposed to facilitate the identification of trustworthy users and diffuse a capacity of a target user throughout the personal network. Golbeck proposed TidalTrust, which derives trust in social networks using numeric trust values [32]. It utilizes the shortest path based on breadth-first search. Furthermore, TidalTrust can be used to retrieve accurate information from the adjacent nodes with the highest trust.
Different from a traditional network community, a topic user group is composed of users interested in the same topic. Members of a topic user group might be dispersed across different locations in the network and might not have had tight and frequent interactions with each other in the past. Therefore, in our view, there are the following considerations for topic user group discovering. (1) A user's interest degree is a significant factor for measuring whether the user should be included in the user group. Those who maintain strong interest in the topic are definitely core members of the user group. (2) There are many users who have high interest in the topic but do not express it explicitly. These potential members should be recognized in the user group. (3) Interests may be transferred through users' relationships based on their trustworthiness. That is, if there is a high level of trust between two users, they are likely to have similar interests in the same topic. For example, one of them may have a positive or negative effect on the other through their trust relationship. Thereby, trust is a crucial linkage through which users share common interests, and it plays an important role in user group discovering.

Overview of Our Scheme
A topic space oriented user group (TUG) is organized in three phases: topic space detection, interest evaluation, and user grouping based on trust chain. The topic space gathers influential and interrelated topics that can attract people's attention and public concern; it is meaningless to detect and organize a user group around inessential and unremarkable topics. User interest reflects how interested a user is in the topic. That is, interest degree is the criterion for evaluating and grouping users into a TUG. The user relationship model, called the trust chain model in this paper, reflects the degree of closeness among users. Through the trust chain, we can measure the probability that users have similar topic interests and then group those with mutual trust into the same TUG.
The overview of our scheme is shown in Figures 1(a)-1(e) as follows. (a) Topics are linked through their relations (black lines) in SNS. (b) Users are linked via their trust chains (blue lines) in SNS. (c) The impacts and relatedness degrees of topics are evaluated according to indicators of topic rank and popularity. Then, influential and closely related topics are clustered into the topic space (marked in red in Figure 1(c)). (d) The interest degrees of users towards the topic are measured based on explicit or implicit interest. (e) Core users of TUG are identified according to their explicit interests (the core users are marked in red with their interest degrees in green dashed lines in Figure 1(d)). Furthermore, ordinary users of TUG are detected based on the selected core users and their trust chains (ordinary users are marked in pale red in Figure 1(e)).
Correspondingly, we address the following definitions in this paper.
Firstly, we introduce the graph theory for modeling the social network formally.
Definition 1 (social network graph model). A social network graph model can be described as SN = ⟨V, E⟩, where V = {v_1, v_2, v_3, ...} is the nonempty set of vertexes, which denote the users in SNS, and E = {e_1, e_2, ...} is the set of edges, which denote the relationships among users.
Through Definition 1, we can describe the trustworthy relationship among users by vertexes (users) and edges (user relationships). That is, if a user v_i keeps a trust relationship with another user v_j, the trust can be described as e(v_i, v_j) = ⟨v_i, v_j⟩.
Definition 2 (topic space). A topic space is a set of topics that have large impacts and close relations. A topic space can be defined as TS = ⟨{Θ_1, Θ_2, ...}, imp(TS)⟩, where {Θ_1, Θ_2, ...} is the set of topics in the topic space; Θ_i denotes a topic in the topic space and can be described as Θ_i = ⟨content, impact⟩; and imp(TS) is the impact degree of the whole topic space.
In the above definition, there are two elements for describing a topic: content and impact. Here, content = ⟨core, Parent, Subtopic⟩ contains the core content of the topic (core), its subtopic set (Subtopic = {st_1, st_2, ...}), and its parent topic set (Parent = {pt_1, pt_2, ...}), while impact ∈ [0, 1] denotes the weight value of the topic's influence. Meanwhile, imp(TS) is the impact degree of the topic space, an integrated value that combines the impact degrees of all its topics.
Definition 3 (interest degree). The interest degree of a user is the quantified value of the user's interest level in a specific topic or a topic space, which can be used to predict the probability that the user joins a topic group.
In this study, we distinguish two kinds of user interest: explicit interest and implicit interest. The interests expressed by users' direct behaviors, such as judgments, browsing time, approving, and forwarding, are defined as explicit interest, while the potential feelings or opinions that users have not expressed are regarded as implicit interest. Explicit interest can be evaluated directly from users' past behaviors. Implicit interest can be extracted through users' relationships, since users are linked through their relationships in the social network and these relationships can reveal possible implicit interests. That is, we can estimate implicit interest through users' trustworthy relationships, which are regarded as trust chains in this paper. For example, if a user who has given no direct evidence of interest in a topic keeps a very high degree of trust in a friend who has a strong interest in the topic, we can predict that this user might have a certain interest in the topic. In this example, explicit interest is delivered through the users' trust relationship and thus generates implicit interest, which is the underlying rationale for implicit prediction in this work. Correspondingly, users in a social network have interest degrees for both topics and topic spaces.
Additionally, relationship is another significant entity connecting users in SNS. Consequently, it can be used for evaluating the degree of closeness among users, predicting implicit interest degrees, and further organizing users to form groups. To achieve that, we use the notion of trust to reveal the relatedness among users, because users who trust each other are more likely to share common interests and then join the same user group. In this work, the trust relationships among users, including direct and indirect relationships, are defined formally as a concept, the trust chain, as follows.
Definition 4 (trust chain).Trust chain is a model for describing the direct and indirect link among users.It reflects the trustworthy relationship and can be defined as Ω = (, , , , ), where  ⊆ SN ⋅  denotes nonempty set of user nodes in trust chain and the user nodes can be divided into three roles as follows: source user nodes   , intermediate user nodes   , and target user nodes   ;  ⊆  ×  ⊆ SN ⋅  denotes the finite set of atomic trust chain;  ⊆  ×  ∪  ‖  denotes the combined trust chains which are composed of atomic trust chains and symbols × and ‖ denote serial trust chain and parallel trust chain, respectively;  denotes the chain classification of trust chain; and  :  → [0, 1] ∪  → [0, 1] denotes trust value of atomic trust chain or combined trust chain.
In addition, trust chains are categorized in two ways: by topological route composition and by strength. Firstly, since indirect trust chains among users can have different route compositions, we divide trust chains into four kinds: atomic trust chain, serial trust chain, parallel trust chain, and combined trust chain. Meanwhile, to signify the strength of a trust chain and define its constraints strictly, we classify trust chains as strong or weak. Details of the trust chain will be discussed later. Through Definition 4, the direct and indirect trust chains can be described formally according to the topological composition of the relationships among users.
Definition 5 (topic space oriented user group).In SNS, topic space oriented user group (TUG) can be defined as a 3-tuple Δ = (TS, , ), where TS denotes the topic space;  = ⟨  ,   ⟩ is nonempty set of users in which   is the core user set and   is the ordinary user set;  = { 1 (TS),  2 (TS), . ..} denotes the set of users' explicit or implicit interest degrees of topic space, respectively.
A TUG contains a topic space and a set of users who maintain strong interest in it. Because many users do not express their interests through explicit behaviors or evidence, their implicit interests are estimated and evaluated through their trust relationships with others. Therefore, a TUG has the following properties in this paper:
(1) For user set  in Δ,   ≠ ∅.
(3) For each  ∈ , if s/he has an explicit interest (EI  (TS)) to topic space TS, his/her interest value satisfies condition EI  (TS) ≥ .
(4) For each  ∈ , if s/he has an implicit interest (MI  (TS)) to topic space TS, s/he must satisfy the condition
To facilitate reading, Table 1 presents the nomenclature used in this paper.

Topic Space Construction and Impact Evaluation Method
We first address the method of structuring the topic space for TUG discovering. As mentioned above, only influential and closely related topics are selected for the topic space, and thus our method of topic space construction is based on evaluating the relatedness degree and influence degree of topics in the social network.
As defined in Definition 2, topic space is composed of a set of topics and each topic can be described as two elements: semantic (content) and impact level (impact).In our consideration, the evaluation of relatedness degree can be measured from semantic perspective, while the impact degree comprises two aspects as influence degree and popularity degree.Accordingly, there are the following three aspects for detecting topic space: relatedness clustering, influence evaluation, and popularity evaluation.

Relatedness Clustering for Topic Space.
Relatedness clustering aims to find topics that have close relations and form them into a strongly related topic set. That is, irrelevant topics should be removed from a topic space because they have little correlation with the other topics in the topic space and contribute little to attracting users to join the TUG. Hence, we provide a method called relatedness clustering for topic space.
We propose a factor, denoted as relatedness degree, to describe the closeness of topics' relations. Assume that there are two topics, Θ_i and Θ_j, each with a corresponding sample set that includes topic related posts, comments, or other items. Then, the relatedness degree of the two topics is calculated as the Jaccard similarity of their sample sets, that is, the size of the intersection of the two sample sets divided by the size of their union. It is noteworthy that the relatedness degree depends on the sample sets of the topics, so different sample sets result in different relatedness degrees. Therefore, we propose an iterative algorithm for stabilizing the relatedness degree as shown in Algorithm 1.
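The Jaccard-based relatedness degree described above can be illustrated with a minimal sketch. The function name `relatedness_degree` and the sample identifiers are hypothetical, and the iterative stabilization of Algorithm 1 is not reproduced here.

```python
def relatedness_degree(samples_a, samples_b):
    """Jaccard similarity of two collections of sample identifiers.

    The function name and the use of post identifiers as samples are
    assumptions made for illustration only.
    """
    set_a, set_b = set(samples_a), set(samples_b)
    union = set_a | set_b
    if not union:
        # Two empty sample sets share nothing; treat them as unrelated.
        return 0.0
    return len(set_a & set_b) / len(union)


# Hypothetical sample sets: identifiers of posts that mention each topic.
pollution_posts = {"post1", "post2", "post3", "post4"}
disease_posts = {"post3", "post4", "post5"}

# Two shared posts out of five distinct posts.
print(relatedness_degree(pollution_posts, disease_posts))  # 0.4
```

Because the value depends entirely on which samples are collected, repeating the computation over different sample sets gives different values, which is the instability that Algorithm 1 is intended to address.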
Based on Algorithm 1, we can get relatedness degree factor of every two topics for measuring their closeness relation.Further, we can detect close related topics and then cluster them based on the relatedness degree factor.
Assume that there is a set of candidate topics, Can_Θ = {can_Θ_1, can_Θ_2, ..., can_Θ_n}, for discovering the topic space. Then, the relatedness degree of every two topics can be calculated based on Algorithm 1, which yields a relatedness degree matrix whose entry in row i and column j is the relatedness degree between the ith and jth candidate topics.
Algorithm 1: Relatedness degree stabilizing algorithm for topics.
According to the relatedness degree matrix, the candidate topic with the maximum sum of column values is the topic that has the highest relevance to all other candidate topics. We denote this topic as the Topic Space Kernel. Then, the topic space can be clustered around the Topic Space Kernel. We propose a relatedness clustering algorithm for topic space as shown in Algorithm 2.
In Algorithm 2, closely related topics are discovered and clustered to form TS. One step of the algorithm applies Algorithm 1 to the candidate topic set to stabilize the relatedness degree values of the topics. The threshold, which takes a value in [0, 1], is given in advance.
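The kernel-based clustering idea can be sketched as follows. The exact steps of Algorithm 2 are not given in the text, so this sketch assumes that a candidate topic joins the topic space when its relatedness degree to the Topic Space Kernel reaches the threshold, and that a topic's relatedness to itself is excluded from the column sums. The function name and the example values are hypothetical.

```python
def cluster_topic_space(topics, relatedness, threshold):
    """Pick the Topic Space Kernel and the topics clustered around it.

    `relatedness` is a square matrix (list of lists) in which entry [i][j]
    is the relatedness degree of topics i and j. The membership rule
    (relatedness to the kernel >= threshold) is an assumption.
    """
    n = len(topics)
    # Column sums, skipping the diagonal (self-relatedness).
    column_sums = [
        sum(relatedness[i][j] for i in range(n) if i != j) for j in range(n)
    ]
    kernel = max(range(n), key=column_sums.__getitem__)
    topic_space = [
        topics[j]
        for j in range(n)
        if j == kernel or relatedness[kernel][j] >= threshold
    ]
    return topics[kernel], topic_space


# Hypothetical candidate topics and relatedness degrees.
topics = ["pollution", "air pollution", "disease", "music"]
matrix = [
    [1.0, 0.6, 0.4, 0.0],
    [0.6, 1.0, 0.3, 0.0],
    [0.4, 0.3, 1.0, 0.1],
    [0.0, 0.0, 0.1, 1.0],
]

kernel, space = cluster_topic_space(topics, matrix, threshold=0.35)
print(kernel)  # pollution
print(space)   # ['pollution', 'air pollution', 'disease']
```

In this example "music" is dropped because its relatedness degree to the kernel is below the threshold, which matches the stated goal of removing irrelevant topics from the topic space.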

Influence Evaluation for Topic Space.
A topic is described by its core content, parent set, and subtopic set, and its influence degree is an integrated value calculated from the importance of these three parts. Here we address a method, called the topic influence rank (TIR) method, which is similar to PageRank [33]. Given the different topics in SNS, the TIR method counts the number and quality of relationships to a topic to produce a rough estimate of how important the topic is. That is, more influential topics are likely to have more relationships with other topics. As shown in Figure 2, there are two kinds of relations between two topics.
(1) The first kind is link relation (solid lines in Figure 2(a)). That is, there are semantic relationships among topics, and the topics are linked through the inner links in topic oriented web pages. Here, a topic oriented web page is a page whose content is mainly about a specific topic. For example, a page including a text about the topic of "music" can be seen as a "music oriented page." Most topic oriented web pages are categorized manually. (More specifically, there are the following types: (1) if a topic is included in the title of the text in a page, the page is marked as a topic oriented page; (2) if a topic is included in the keywords or label words of the text in a page, the page is marked as a topic oriented page; (3) the text in a page is marked as a topic oriented one through semantic analysis technologies (owing to the length limitation of the paper, the detailed semantic analysis technologies are discussed in other works). All the data is prepared through data preprocessing in this work.) For example, the topics "pollution" and "disease" are in semantic relation since there are inner links among their pages.
(2) The second kind is hierarchy relation (dotted lines in Figure 2(b)). Such a relation is also called parent-child relation. In our consideration, the subtopics or parent topics contribute to a topic with which they have semantic containment relations. For example, the topics "pollution" and "air pollution" are in semantic containment relation, and the two topics contribute their impact degrees to each other.
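Since the TIR method is described as similar to PageRank, the underlying iteration can be illustrated with plain PageRank over topic oriented pages. This is a sketch of standard PageRank, not of the paper's TIR formula: it omits the per-page weight factor and the hierarchy relation, and the names `page_rank`, `topic_scores`, and `jump`, the page identifiers, and the aggregation of page scores by summation are all assumptions.

```python
def page_rank(links, jump=0.2, iterations=100):
    """Plain PageRank. `links` maps each page to the pages it links to.

    `jump` is the probability of leaving the current link structure,
    used here as a stand-in for the paper's topic change parameter.
    """
    pages = set(links) | {p for targets in links.values() for p in targets}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new_rank = {p: jump / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # Split this page's rank evenly over its outgoing links.
                share = (1.0 - jump) * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Dangling page: spread its rank evenly over all pages.
                share = (1.0 - jump) * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


def topic_scores(rank, page_topic):
    """Sum page scores per topic (an assumed aggregation rule)."""
    scores = {}
    for page, value in rank.items():
        topic = page_topic[page]
        scores[topic] = scores.get(topic, 0.0) + value
    return scores


# Hypothetical pages: a1 and a2 belong to topic A, b1 to B, c1 to C.
links = {"a1": ["b1"], "a2": ["b1"], "b1": ["c1"], "c1": ["a1"]}
page_topic = {"a1": "A", "a2": "A", "b1": "B", "c1": "C"}

rank = page_rank(links)
print(topic_scores(rank, page_topic))
```

A page that receives links from several pages, such as `b1` here, ends up with a higher score than a page that receives none, which is the intuition behind ranking topics by the number and quality of their relationships.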
We first propose the TIR calculation method for the semantic incompatible relation. In this case, all topics have completely different semantics of core contents. Assume there is a topic set {Θ_1, Θ_2, ...} and each topic Θ_i has a set of pages. Then, for each page of Θ_i, its TIR degree of link relation, TIR_in, is calculated from the pages that link to it. Here, Linkin of a page is the set of pages that have a link to that page; Linkout of a page is the set of pages that it links to; and |Linkout| denotes the number of pages in that set. Similar to the damping factor in PageRank, we set a parameter to describe the probability of topic change. Additionally, we give a weight factor, weight, for each page in the Linkin set to distinguish the importance of pages as follows: (1) if a linking page belongs to the same topic as the linked page, which means that the link is an inner link, it contributes less influence than a page that has an external link to the linked page, and (2) a page belonging to a more influential topic brings more contribution to the pages it links to. An example is shown in Figure 2(a), which contains three topics with their pages and links. TIR_in is calculated iteratively and finally converges to stable values (with the parameter set to 0.2).
In addition, we calculate the TIR degree of hierarchy relation. Likewise, we assume that if a topic's subtopics or parents have higher TIR degrees, the topic gets a higher TIR degree. Let the subtopic set of a topic Θ_i be Subtopic(Θ_i) = {st_1, st_2, ...} and let the parent set be Parent(Θ_i) = {pt_1, pt_2, ...}. The TIR value of Θ_i for hierarchy relation, TIR_con(Θ_i), is calculated from the TIR values of its parents and subtopics, where |Subtopic(pt)| is the number of subtopics of a parent topic pt and |Parent(st)| is the number of parent topics of a subtopic st. Meanwhile, TIR_con(Θ_i) is impacted by the relatedness degree between the topic and its subtopic or parent topic. The value of TIR_con(Θ_i) can be 0 when a topic has no parent or subtopic. Figure 2(b) gives an example of a topic with three parent topics and two subtopics. In summary, the influence degree value of a topic by TIR is calculated from the above two values, TIR_in and TIR_con.

Popularity Evaluation for Topic Space.
Popularity is another significant criterion for topic influence evaluation. The underlying assumption of popularity evaluation is that the more popular a topic is, the more influential it is. Hence, we calculate the popularity of a topic based on its related data, including user number, propagated communities, average browsing time, and lasting time.
We first propose several types of topic related data for popularity evaluation in SNS as follows.
(1) Number of followers: follow(Θ_i) denotes the number of users who follow the topic Θ_i.
(2) Number of communities: community(Θ_i) denotes the number of communities in which the topic Θ_i is propagated.
(3) Browsing time: browse(Θ_i) denotes the average length of time that users spend on topic Θ_i.
(4) Lasting time: last(Θ_i) denotes the length of time that the topic Θ_i stays hot in SNS.
(5) Activity: a topic Θ_i is active in a time slice if and only if it is posted, followed, browsed, or propagated or receives other social behaviors in SNS during that slice. activity(Θ_i) denotes the activity level of topic Θ_i over its time slices.
All the above types of data are available through specific collection methods in SNS. In this paper, the popularity evaluation is the average of five indicators produced by the above five types of topic related data. The number of followers and the number of communities are considered relative to the total numbers of users and communities in SNS, and the browsing time and lasting time are considered relative to the maximum browsing time and lasting time of all topics in SNS, |max browse| and |max last|, respectively. Then, the popularity level, popular(Θ_i), of topic Θ_i is calculated as the average of the five indicators. Here, activity(Θ_i) is the ratio of the number of time slices in which Θ_i is active to the number of time slices in the life cycle of Θ_i.
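The averaging described above can be sketched as follows. The sketch assumes that each of the first four quantities is normalized by its corresponding total or maximum so that every indicator lies in [0, 1]; the function name, parameter names, and numbers are hypothetical.

```python
def popularity(follow, community, browse, last, active_slices, total_slices,
               total_users, total_communities, max_browse, max_last):
    """Average of five normalized popularity indicators.

    The normalization of each indicator by a total or a maximum is an
    assumption inferred from the surrounding description.
    """
    indicators = [
        follow / total_users,            # share of users following the topic
        community / total_communities,   # share of communities reached
        browse / max_browse,             # browsing time relative to the maximum
        last / max_last,                 # lasting time relative to the maximum
        active_slices / total_slices,    # activity level over the life cycle
    ]
    return sum(indicators) / len(indicators)


# Hypothetical values for one topic.
score = popularity(
    follow=200, community=5, browse=30.0, last=10.0,
    active_slices=8, total_slices=10,
    total_users=1000, total_communities=20, max_browse=60.0, max_last=40.0,
)
print(round(score, 3))  # 0.4
```

Averaging gives each indicator the same weight; a weighted combination would be a different design choice that the text does not describe.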
Through the above two aspects of impact evaluation, that is, influence and popularity, we obtain the total impact of a topic in SNS, impact(Θ_i). Furthermore, the impact of the whole topic space is measured based on its included topics. Let there be a topic space TS = {Θ_1, Θ_2, ...}, where the impact of each topic Θ_i is impact(Θ_i). Then, the impact of the topic space, imp(TS), is a comprehensive evaluation of all topics in TS.

Trust Chain Model and Its Computation Method
(1) Atomic Trust Chain. A trust relationship between users is an atomic trust chain if and only if there is a direct link between two nodes and no intermediate node between them.
(2) Serial Trust Chain. A trust relationship between users is a serial trust chain if and only if there is a serial path from the source node to the target node and the path has the following properties: (1) for the source node, its out-degree is 1 and its in-degree is 0; (2) for the target node, its out-degree is 0 and its in-degree is 1; (3) for each intermediate node, its out-degree is 1 and its in-degree is 1.
(3) Parallel Trust Chain. A trust relationship between users is a parallel trust chain if and only if there are two or more trust paths from source node to target node, the paths share no intermediate node, and the paths have the following properties: (1) for the source node, its out-degree is $n$ ($n \ge 2$) and its in-degree is 0; (2) for the target node, its out-degree is 0 and its in-degree is $n$ ($n \ge 2$); (3) for each intermediate node, its out-degree is 1 and its in-degree is 1.
(4) Combined Trust Chain. A trust relationship between users is a combined trust chain if and only if the trust chain is composed of the above three kinds of trust chain.
Furthermore, we classify the trust chain between users into two types according to the mutual trust degrees as follows.
(1) Strong Trust Chain. A trust chain is a strong one if and only if there are two mutually accessible trust chains between two users (one in each direction) and the trust degrees of both trust chains are higher than a given threshold $\varepsilon_1$ ($\varepsilon_1 \in [0, 1]$).
(2) Weak Trust Chain. A trust chain is a weak one if and only if there is a trust chain from source node to target node whose trust degree is higher than a given threshold $\varepsilon_2$ ($\varepsilon_2 \in [0, 1]$ and $\varepsilon_2 \le \varepsilon_1$) and the trust chain is not a strong one.
In our definition, a strong trust chain reveals a mutually high trust relationship between two nodes, while a weak trust chain reflects a unidirectional or bidirectional trust degree with a relatively high value.
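The strength classification can be illustrated with a short Python sketch. The function name and the `None` convention for a missing reverse chain are assumptions made for illustration; the default thresholds 0.6 and 0.5 are the values used in the worked example of this paper, and "higher than" is read as a strict comparison.

```python
def classify_trust_chain(forward, backward,
                         strong_threshold=0.6, weak_threshold=0.5):
    """Return 'strong', 'weak', or 'none' for the chain source -> target.

    forward  : trust degree of the chain from source to target
    backward : trust degree of the reverse chain, or None if there is none
    """
    if not 0.0 <= weak_threshold <= strong_threshold <= 1.0:
        raise ValueError("need 0 <= weak_threshold <= strong_threshold <= 1")
    mutual = backward is not None
    # Strong: chains exist in both directions and both exceed the strong threshold.
    if mutual and forward > strong_threshold and backward > strong_threshold:
        return "strong"
    # Weak: the forward chain exceeds the weak threshold but is not strong.
    if forward > weak_threshold:
        return "weak"
    return "none"
```

For example, `classify_trust_chain(0.8, 0.7)` is strong, while `classify_trust_chain(0.8, None)` is only weak because no reverse chain exists.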

Atomic Trust Chain.
In an atomic trust chain, there is no intermediate node between two nodes. We can therefore calculate the trust degree from their direct trustworthy interactions and their interest similarity. Let there be two nodes $V_1$ and $V_2$, and let the trustworthy opinion expressed by $V_1$ about $V_2$ take a value in [0, 1]. Assume that there is a common topic set, CT = {$\Theta_1, \Theta_2, \ldots$}, which denotes the set of topics of interest to both nodes. For each CT⋅$\Theta_i$ ∈ CT, the number of nodes that maintain explicit interest degrees to it is num(CT⋅$\Theta_i$), and the maximum such number over all topics in the social network is max top. Then, the degree of atomic trust chain from $V_1$ to $V_2$ can be calculated as follows:

where trust interact($V_1$, $V_2$) denotes the trust degree based on the nodes' direct interactions and sim interest($V_1$, $V_2$) is the similarity degree based on the nodes' interests. Here, we use the factors num(CT⋅$\Theta_i$) and max top so that a topic with fewer explicitly interested nodes contributes more to the interest similarity calculation.

Serial Trust Chain.
In serial trust chain, we give a constraint for its composition as follows.
Constraint 1. Each atomic trust chain part in a serial trust chain must be a strong atomic trust chain or a weak atomic trust chain.
That is, an atomic trust chain part with a low trust degree is excluded from the serial trust chain, and thus an indirect path from source node to target node that contains such a part cannot be considered a serial trust chain.
Let the serial trust chain consist of a source node, a target node, and the intermediate nodes between them, and let each atomic trust chain part in the serial trust chain have its own trust value. Then, we can calculate the trust value of the serial trust chain as follows:

Here, the function depth(·) denotes the depth of the serial trust chain, namely, the number of intermediate nodes plus one. The deeper the serial trust chain is, the weaker the trust value between users is. That is, a longer serial trust chain is penalized, since trust is damped as the number of intermediate nodes increases. In addition, a weight parameter on each pair of adjacent nodes distinguishes strong atomic trust chain parts from weak ones in the serial trust chain.

Parallel Trust Chain.
In a parallel trust chain, there are at least two serial paths without intersection from source node to target node. In addition, we present a constraint for ensuring the reliability of the parallel trust chain as follows.
Constraint 2. A serial path can be seen as a serial trust chain in a parallel trust chain if and only if it is a strong or weak serial trust chain.
From Constraint 2, we can divide the serial trust chains into two types as follows: (1) the strong or weak serial trust chains, called active serial trust chains, are taken into consideration in parallel trust chain evaluation; and (2) the serial trust chains with low trust degrees, called inactive serial trust chains, are excluded from the trust degree calculation of the parallel trust chain. However, the number of inactive serial trust chains whose atomic trust chain parts are all strong or weak ones is still used in parallel trust chain evaluation, since their low trust degrees reflect the untrustworthy aspect of the parallel trust chain.
Let there be $n_1$ ($n_1 \ge 2$) serial trust chains in the parallel trust chain from the source node to the target node, of which $n_2$ ($n_2 \le n_1$) are active serial trust chains. Let $s_i$ denote each active serial trust chain, and let ser($s_i$) represent the trust degree value of $s_i$. Then, the trust degree of the parallel trust chain from the source node to the target node can be calculated as follows:

In the above equation, the weight of each active serial trust chain depends on its strength (0.9 for a weak serial trust chain), and the trust degree of the parallel trust chain is calculated as follows: (1) the trust degrees of active serial trust chains are combined by a weighted average; and (2) the number of inactive serial trust chains is considered a negative aspect and is used to weaken the trust degree of the parallel trust chain by exponential weighting as in (33). A higher ratio of active serial trust chains in the parallel trust chain, and a lower ratio of inactive to active serial trust chains, imply a higher exponential weighting.

Combined Trust Chain.
A combined trust chain includes crossing paths composed of the above three kinds of trust chains. We introduce an iterative optimizing approach for combined trust chains, called IOA, which includes the strong or weak trust chain parts and excludes the other parts. The constraint of the proposed approach is as follows.
Constraint 3. There are the following four rules for IOA.
(i) Local trustworthy rule (LTR): each atomic path in a combined trust chain can be seen as an active atomic trust chain if and only if it is a strong or weak trust chain. That is, atomic paths from nodes to their neighbors which are not strong or weak trust chains can be ignored in the combined trust chain, and their successor paths are ignored as well because the chain is broken at that point.
(ii) Serial trustworthy rule (STR): a serial path from the source node to the target node can be seen as an active trust chain if and only if it satisfies Constraint 1.
(iii) Serial merging rule (SMR): if there is a combined trust chain from the source node's indirect neighbor nodes to the target node, it is merged into a serial one iteratively.
(iv) Parallel calculating rule (PCR): if two or more direct neighbor nodes have serial trust chains from the source node to the target node, the combined trust chain can be reconstructed iteratively as a parallel trust chain with these neighbors if and only if the reconstructed parallel trust chain satisfies Constraint 2.
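The pruning step behind the local and serial trustworthy rules can be sketched as follows. This is not the full IOA: it only drops atomic links that are neither strong nor weak and lists the paths that survive, under the assumption that the weak threshold is 0.5 and that a strong link also exceeds it. The function name, the graph encoding, and the trust values are hypothetical.

```python
def active_paths(edges, source, target, weak_threshold=0.5):
    """List simple paths from source to target that use only active links.

    edges maps (u, v) -> trust degree of the atomic link u -> v.
    A link is kept only if its trust degree exceeds weak_threshold,
    which covers both strong and weak atomic trust chains.
    """
    adjacency = {}
    for (u, v), trust in edges.items():
        if trust > weak_threshold:      # local trustworthy rule: drop low-trust links
            adjacency.setdefault(u, []).append(v)

    paths = []

    def walk(node, path):
        if node == target:
            paths.append(path)
            return
        for nxt in adjacency.get(node, []):
            if nxt not in path:         # keep paths simple (no repeated node)
                walk(nxt, path + [nxt])

    walk(source, [source])
    return paths


# Hypothetical trust values, not taken from Figure 3.
edges = {
    ("s", "a"): 0.8, ("a", "t"): 0.7,    # both links active
    ("s", "b"): 0.55, ("b", "t"): 0.65,  # both links active
    ("s", "c"): 0.3, ("c", "t"): 0.9,    # first link too weak, path breaks
}
paths = active_paths(edges, "s", "t")
```

Here the path through `c` disappears because its first link falls below the threshold, which mirrors the breakage described in rule (i); the two remaining paths would then be evaluated as a parallel trust chain.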
An example of our scheme is shown in Figures 3(a)-3(e). Figures 3(a) and 3(b) show the parallel paths and the inverse paths from the source node to the target node through three intermediate nodes. Assume that the thresholds of strong trust chain and weak trust chain are set to 0.6 and 0.5, respectively. Then, three atomic links qualify as strong atomic trust chains and two as weak atomic trust chains, while one path is ignorable due to its low trust degree. The trust degrees of the two remaining serial paths can then be calculated based on (32) and Constraint 1.

User Influence Evaluation through Trust Chain.
Each user has his/her influence in SNS. Here, we evaluate user influence by measuring the users who maintain high trustworthiness in the trust chain: the more users trust the source user through trust chains, the more influential the source user is. For a user V, let there be a trust chain, Ω(V), in which user V is the source node. A valid impacted user, iu, is a node which satisfies the following condition in trust chain Ω(V):

Assume that there are several trust chains, $\Omega(V)_i$, in which user V is the source node, and that the valid impacted user set in each trust chain is denoted as $\Omega(V)_i$⋅IU. Then, the user influence can be evaluated as follows:

User Explicit Interest Degree Evaluation towards Topic.
Explicit interest degree can be measured through users' direct behaviors or other direct witness evidences. In this study, we refer to these direct items for evaluating explicit interest degree as interest evidences. For example, the behaviors of a node, such as "forwarding," "approving," "following," and "commenting," can be seen as interest evidences. We have the following considerations for user explicit interest degree evaluation towards a topic. (1) Explicit interest is measured by the level of each interest evidence and its weight. That is, explicit interest is aggregated from a user's past interest evidences, and different interest evidences have different impacts in the aggregation. (2) Explicit interest is affected by the impact degree of the topic. A more influential topic attracts more users to browse it; therefore, if a user shows explicit interest in a topic with lower influence, he/she might have more interest in that topic.

We now address the calculation method of explicit interest degree towards a topic. Assume that user V has different kinds of past interest evidences $e_1, e_2, \ldots, e_n$, and $P(e_i)$ denotes the appearance probability of $e_i$ among all kinds of interest evidences. Meanwhile, for a topic Θ, assume that user V produced a number of interest evidences, which were recorded as a set Φ, and $P(\Theta \mid e_i)$ denotes the probability of interest evidence category $e_i$ appearing when the user faced the topic Θ. The weight of each $e_i$ is right($e_i$) (right($e_i$) ∈ [0, 1]). Then, user node V would have an explicit interest degree about the topic Θ, when the topic next appears, as

In the above equation, $P_V(e_i \mid \Theta)$ denotes the probability of interest evidence $e_i$ appearing for topic Θ, and it can be calculated as

Next, we give the calculation method of the weight right($e_i$). In our view, there are inherent relations among interest evidences: many interest evidences appear simultaneously or serially. For instance, the interest evidence of browsing for a long time is likely to lead to interest evidences of "approving" or "adding to the favorites list." Their impacts are therefore related to each other, and these relations are strengthened as they appear more often. From this view, the underlying principle of our interest evidence weight calculation is similar to PageRank: more important interest evidences are likely to be related to other more important interest evidences. Here, an interest evidence $e_j$ caused by another interest evidence $e_i$ is written as a link $e_i \to e_j$. From this, we can calculate the weight of an interest evidence as follows.

Let there be an interest evidence set ID = {$e_1, e_2, \ldots, e_n$}, and let the set of interest evidences that can link to $e_i$ be denoted as Link($e_i$) = {$e_j \mid \exists (e_j \to e_i) \wedge (j \ne i)$}. Then, the equation for calculating the link-weight of interest evidence $e_i$ is as follows:

where $P(e_j)$ is the probability that $e_j$ occurred in users' past behaviors, and the number of interest evidences which $e_j$ links out to is also taken into account.

Moreover, a user's interest is dynamic and keeps changing over time. Here, we propose a dynamic predicting method based on the aging algorithm for describing the change of interest. Suppose the original explicit interest degree calculated at the end of time quantum $t_0$ is EI(V | Θ)$_0$, and suppose the next value of explicit interest is EI(V | Θ)$_1$ at the end of the next time quantum $t_1$. Then, we can consider the explicit interest degree of topic Θ to be an integrated value of these two values and renew the dynamic predicting value by taking a weighted sum of the two numbers; that is,

By analogy, the dynamic estimating value of explicit interest degree can be calculated as

In this work, we have the following constraint for interest degree.
Constraint 4. A user has an explicit interest degree toward a topic if and only if his/her explicit interest degree value is larger than a given threshold.
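The aging-based update and the threshold test of Constraint 4 can be sketched as below. The smoothing weight `alpha = 0.5` is an assumption (the classic aging estimate gives equal weight to the previous estimate and the new measurement), and the function names and sample history are hypothetical.

```python
def aged_interest(observations, alpha=0.5):
    """Fold per-quantum explicit interest values into one running estimate.

    observations : explicit interest measured at the end of each time quantum
    alpha        : weight of the previous estimate (assumed value, see lead-in)
    """
    if not observations:
        raise ValueError("need at least one observation")
    estimate = observations[0]
    for value in observations[1:]:
        # Weighted sum of the previous estimate and the newest measurement.
        estimate = alpha * estimate + (1.0 - alpha) * value
    return estimate


def has_explicit_interest(estimate, threshold):
    """Constraint 4: the interest degree must be larger than the threshold."""
    return estimate > threshold


# Hypothetical measurements over three time quanta.
history = [0.2, 0.6, 0.8]
estimate = aged_interest(history)
```

Older measurements lose weight geometrically, so a user whose interest has recently grown is recognized after a few quanta without discarding the earlier history.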

Implicit Interest Degree of Topic.
Implicit interest degree reflects a user's potential opinion toward a topic. Our implicit interest evaluation mainly focuses on predicting interest without direct evidences or data, and it rests on two main considerations. (1) Implicit interest expresses the likelihood of a user's potential interest when he/she has not yet shown any explicit interest or direct evidence, as is the case for a newly registered user. The only direct evidence for predicting the interest of such a user is his/her relationships with others. Not all relationships can be used to reflect and predict potential interest, so in this work we use only relationships with a high trustworthy degree. Such relationships are established from users' past interactions and experience, and therefore reflect the similarity between users at a higher level. (2) Implicit interest based on trustworthy relationships has relatively low computational overhead. The proposed method relies only on the trustworthy relationships among users and their degrees, rather than on other detailed user data. Each evaluation only needs to query the trust relationships and degrees among users, which avoids querying and storing detailed records regardless of whether they are relevant to the implicit interest evaluation.
In this study, there are two factors for implicit interest evaluation: trust and user similarity. Trust degree reflects the probability of a user's approving or opposing attitude as conveyed through his/her trust relationships; users tend to be influenced by those whom they trust. User similarity, on the other hand, reveals the common interests between users; users are easily influenced by those who have similar interests. For example, a user might have potential interest in a topic because it is his/her best friend's favorite topic and, most of the time, the user shares his/her friend's interests in other topics. We therefore give a method for estimating the implicit user interest degree according to the user trust chain model and interest similarity. This approach comprises two steps: first, we calculate a value, called the deliverable implicit interest degree, one-to-one from a source node to the target node based on their trust chain, explicit interest degree, and user interest similarity; then, we integrate the deliverable implicit interest degrees delivered to the target node from all source nodes which have trust chains with the target node. In addition, there is the following constraint for implicit interest degree of a topic.
Constraint 5. The explicit interest degree of a source node can be used for implicit interest evaluation if and only if there is a strong or weak trust chain from the target node to the source node.
Let there be a trust chain Ω from the target node to the source node at time quantum $t_0$, and let the original explicit interest degree of the source node toward a topic Θ be EI(Θ)$_0$. Suppose that the original trust value of Ω is known; then the original deliverable implicit interest degree from the source node to the target node, MI(· | Θ)$_0$, can be calculated as

where sim interest(·, ·) denotes the user interest similarity between the two nodes over all their other common explicitly interested topics, and a weight on Ω distinguishes the impact of strong and weak trust chains. Now suppose the trust value of Ω and the explicit interest degree of the source node are measured again at the end of the next time quantum $t_1$. Then, we can renew the estimate of implicit interest degree by taking a weighted sum of the new values as

By analogy, we can get the equation for the deliverable implicit interest degree after further time quanta as

After finishing the first step of deliverable implicit interest degree estimation, we can calculate the overall implicit interest degree from all the source nodes to the target node. Let there be a set of source nodes.

Assume that there is a topic space TS = {$\Theta_1, \Theta_2, \ldots, \Theta_n$} and that user V has explicit interest degree (or implicit interest degree) of topic $\Theta_i$ as EI(V | $\Theta_i$) (or MI(V | $\Theta_i$)). Then, the user interest degree of the topic space can be calculated as

TUG Discovering Algorithm Based on Trust Chain
In this work, the basic idea of single topic user group organization is as follows: first, an influential topic is selected based on impact evaluation; second, core users, who have strong interests in the specific topic, are discovered; then, ordinary users, who keep certain explicit or implicit interests in the topic, are organized through trust chains. Correspondingly, we maintain two sets, one for core users and one for ordinary users. Based on the above consideration, we first propose the influential topic discovering algorithm as shown in Algorithm 3.
Here, a topic space is selected as influential if its impact value is larger than the average impact over the whole topic space set {TS$_1$, TS$_2$, ...} and its number of follower users is not 0. Then, for each selected topic space, we propose the core user discovering algorithm for TUG, which discovers core users from a candidate set Candidator drawn from the nodes of the social network SN, as shown in Algorithm 4.
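The selection rule just described (impact above the average and a nonzero follower count) can be sketched as follows. This illustrates the rule only and is not a transcription of Algorithm 3; the function name, the dictionary layout, and the sample values are hypothetical.

```python
def select_influential(topic_spaces):
    """Return names of topic spaces that pass the selection rule.

    topic_spaces maps name -> (impact, follower_count).
    A topic space is kept if its impact is above the mean impact of
    all topic spaces and it has at least one follower.
    """
    if not topic_spaces:
        return []
    mean_impact = (sum(impact for impact, _ in topic_spaces.values())
                   / len(topic_spaces))
    return [name for name, (impact, followers) in topic_spaces.items()
            if impact > mean_impact and followers != 0]


# Hypothetical impact values and follower counts.
spaces = {
    "TS1": (0.9, 120),
    "TS2": (0.4, 300),
    "TS3": (0.8, 0),    # high impact but no followers: excluded
    "TS4": (0.3, 10),
}
selected = select_influential(spaces)
```

In this sample the mean impact is 0.6, so only `TS1` is selected; `TS3` exceeds the mean but is dropped because nobody follows it.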

Experiment and Analysis
In this section, we present examinations that illustrate the performance of our proposed scheme. The data comes from a real social network platform, Tencent microblog, which is very popular in China, and the dataset was collected manually. Our data includes about 3,750 IDs (some IDs are located in two or more communities) and more than 457,000 records, including posts, comments, and users' behaviors (browsing, approving, forwarding, and others). The topology of the collected dataset follows the users' real relationships, and the average out-degree of a node is about 9. We select eight kinds of topic, that is, education, entertainment, sporting, technique, financial, food, touring, and history, for the user group discovering experiment. Details about the initial data setting are shown in Table 2. Additionally, our previous data processing method identified about 780 robot nodes in the dataset. We call them invalid nodes; they only follow real users, forward posts, or approve automatically, and they should be excluded from user grouping.

Examination for Topic
Algorithm 5: Ordinary user discovering algorithm for TUG.

To verify the performances of the compared groups, we use a noise-free data set and data sets with 20% and 30% noise, respectively, where the noise consists of topics that do not belong to any of the above eight kinds. The results are shown in Figures 4(a)-4(c). Our proposed method slightly outperforms the other three groups. We further analyze the impact of the two thresholds used in Algorithms 1 and 2. As shown in Figure 4(e), the average accuracy of relatedness between topics based on Algorithm 1 decreases as the value of its threshold increases. The reason is that a higher threshold allows Algorithm 1 to end with less stable relatedness values between topics. In addition, the average accuracy is lower when the threshold of Algorithm 2 is set too low or too high: a value that is too low leads to unrelated topics being grouped into a topic space, while a value that is too high leads to appropriate topics being excluded from it. This test indicates that values around 0.1 and 0.7, respectively, are often a reasonable compromise for the two thresholds.

Performance of Topic Space Impact Evaluation.
In this examination, we evaluate the performance of the proposed topic space impact evaluation method. We use about 240 topic spaces, grouped by the proposed space clustering method, and evaluate their impact degrees. All topics in the topic spaces are contained in the initial data set. We set up six impact evaluation groups for comparison as follows: link relation based impact calculation only (LR), hierarchy relation based impact calculation only (HR), proposed influence degree based impact calculation only (ID), proposed popularity degree based impact calculation only (PD), the linear threshold model based method (LT), and our proposed integrated impact degree method (TP). We then record the average precisions of the six groups as in Figure 5(a). Here, the optimized results of topic impact evaluation are set manually in advance, and an evaluation result is regarded as accurate if it lies within 0.1 of the optimized result. The LT method and our proposed TP method obtain similar performances, and both outperform the other methods. Meanwhile, the precision of TP keeps increasing with the number of topics in the topic space, as shown in Figure 5(b). That is, as the topic space includes more topics, the data for impact evaluation becomes richer, which results in better performance in impact evaluation.
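The accuracy criterion used here (a result counts as accurate if it lies within 0.1 of the manually set reference) can be sketched as follows. The function name and the sample values are hypothetical and are not taken from the experiment.

```python
def precision_within(estimates, references, tolerance=0.1):
    """Fraction of estimates that lie within `tolerance` of their reference."""
    if len(estimates) != len(references) or not estimates:
        raise ValueError("need two non-empty sequences of equal length")
    hits = sum(1 for est, ref in zip(estimates, references)
               if abs(est - ref) <= tolerance)
    return hits / len(estimates)


# Hypothetical impact estimates and manually set reference values.
estimated = [0.52, 0.80, 0.30, 0.95]
reference = [0.50, 0.60, 0.35, 0.90]
precision = precision_within(estimated, reference)
```

In this sample three of the four estimates fall within the tolerance, so the precision is 0.75.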

Performance Evaluation of Proposed Trust Chain Calculation.
In this examination, we measure the accuracy of predicting trustworthiness among users through invalid node detection. We use the trust degrees between real users and invalid users to test the accuracy of invalid node detection: if a real user has a strong or weak trust chain with an invalid node, the trust relationship between them is wrong. For comparison, we introduce the following groups: atomic trust chain (ATC), serial trust chain (STC), parallel trust chain (PTC), hybrid trust chain (HTC), the EigenTrust method (ET) [28], weighted average trust rating (WA) [30], and ultimate trust rating (UT) [31]. Figure 6(a) shows that the UT method gets the best performance, while our trust chain methods, including ATC, STC, PTC, and HTC, get similar performances and are better than ET and WA. We consider that the reasons are as follows: (1) the ultimate trust method calculates trust by dynamically adjusting its factor at a large computational cost, which results in all users maintaining trustworthy knowledge about others for detecting malicious interactions; and (2) the trust chain methods provide a more comprehensive evaluation of relationships among users through the proposed factors at relatively low cost. In addition, we compare the performance of the proposed trust chains (ATC, STC, PTC, and HTC) with the strength perspective (strong and weak trust chains) against trust chains without strength constraints. The results are shown in Figure 6(b); the performances of the proposed trust chains are clearly higher than those of the methods without strength constraints. With respect to the two strength thresholds, values around 0.7 and 0.5 are a reasonable compromise according to our empirical testing.

Examination of User Interest Degree Measurement.
In this examination, we evaluate the performance of the proposed interest degree for topic space. First, we test the effects of explicit interest degree calculation. We calculate users' explicit interest toward selected topic spaces and then compare the explicit interest quality of the following groups: view-time based interest (VT), weighted average of interest evidence based interest degree (WA), the explicit interest without dynamic predicting (ND), and the proposed explicit interest quality of TI, NN, EI, and RC.

As shown in Figure 8, NN, EI, and RC obtain lower performance than TI in all eight kinds of topics. We consider that the reasons are as follows: (1) TI is an interest sensitive approach which adopts both explicit and implicit interests as a key criterion for user grouping, while NN and RC do not take interest into account and EI only considers the explicit interest; (2) in TI, users are grouped through a closely related connection, that is, a strong or weak trust chain, which reflects their close correlations and decreases the error rate in ordinary user discovering.
Furthermore, we examine the effectiveness of user group discovering through online topic recommendation quality. In this examination, we recommend manually preprocessed topics, including both closely related and irrelevant topic spaces, and then record the quality of recommendation. The topics are recommended to users in four kinds of user groups, organized by the methods TI, NN, EI, and RC. If a user performs a behavior of hitting, reading, forwarding, or posting judgments on a received topic, the recommendation is recorded as an accurate one. The precision of a recommendation to a user group is calculated as the average over all members in the user group. There are six user groups for each kind of organizing method, and about 120 recommendations were launched in our experiment. We record the average precision of all kinds of user groups, and the results are shown in Figure 9. The recommendation quality of user groups discovered through our proposed method is better than that of the other three methods. That is, closely related topics are accepted and irrelevant topics are rejected by users in the user groups based on TI, which implies higher accuracy and better efficiency than user groups formed through NN, EI, and RC.
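The per-group precision described above can be sketched as follows. The behavior labels in `ACCEPTING` are hypothetical names for hitting, reading, forwarding, and posting judgments, and the sample responses are illustrative rather than experimental data.

```python
# Hypothetical labels for the four behaviors that count as acceptance.
ACCEPTING = {"hit", "read", "forward", "comment"}


def group_precision(responses):
    """Share of group members who accepted the recommended topic.

    responses maps user id -> list of behaviors the user performed
    on the recommended topic (an empty list means no reaction).
    """
    if not responses:
        raise ValueError("the user group is empty")
    accepted = sum(1 for behaviors in responses.values()
                   if ACCEPTING & set(behaviors))
    return accepted / len(responses)


# Hypothetical reactions of a four-member user group.
reactions = {
    "u1": ["read", "forward"],
    "u2": [],
    "u3": ["hit"],
    "u4": ["ignore"],
}
group_score = group_precision(reactions)
```

A member counts once no matter how many accepting behaviors he/she performs, so `u1` contributes the same as `u3`.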

Conclusion
For users in SNS, joining a user group to facilitate communication is very common. Commonly, user groups are organized around a set of topics with close internal correlations. However, most existing research focuses on user clustering based on relation closeness and common explicit interests, while little effort has been devoted to the interaction and expansion of users' interests. Establishing user groups based on influential topics therefore has practical significance. First, our work provides a guideline for describing the topic oriented user group and a computation methodology for organizing user groups formally and specifically. Each user can inquire about the details of a topic space, including its influence degree, the members of the group, and estimated user interests, for his/her further purposes of information communication. In practice, a machine driven mechanism can achieve higher efficiency than manual methods and reduces the burden of user group discovering and organizing. In addition, our proposed formal definitions and conceptions are suitable for machine reading and for understanding the calculation methods. Second, applying both influence and popularity helps us aggregate all aspects into a more comprehensive impact degree for topics and topic spaces. This is because the proposed influence degree of a topic directly reflects structural closeness, which also indicates the integrated impact of the topic to some extent, while the proposed popularity degree reflects how influential a topic is from direct data through impartial views. Thereby, our proposed method provides sound usability for influential topic space discovering. Third, the proposed trust chain model acquires direct and indirect trust among users, as well as the strength of indirect trust (strong or weak trust chains). Meanwhile, the topological information of the trust chain is fully considered in this work. That is, we calculate trust degrees of trust chain in different route compositions corresponding to the trustworthy

Figure 1: Overview of topic user group discovering.

Figure 2: Example of topic influence rank evaluation for topics.

Algorithm 2: Relatedness clustering algorithm for topic space.

Trust Chain Model. Here, we propose the model of trust chains in detail based on their different network topologies and their trust value calculation methods. We divide trust chains into four types based on the topology.

Figure 3: An example for trust chain calculation.

Figure 5: Effects of topic space impact evaluation.
Topic space oriented user group: Δ = (TS, , ). User group organized according to a topic space (Definition 5).
Topic related terms
Topic space: TS = ⟨{$\Theta_1, \Theta_2, \ldots$}, imp(TS)⟩. A set of topics which have large impacts and close relations (Definition 2).
User interest: the quantified value of a user's interest level about a specific topic or a topic space (Definition 3).

... which satisfy Constraint 4, and each node in the set has a trust chain from the target node. Then the overall implicit interest degree can

User Interest Degree of Topic Space.
User interest degree of topic space is comprised of the user interest degrees of the topics which are included in the topic space. Here, we use the weighted average method for calculating the interest degree of a topic space.

Table 2: Characteristics of five communities in examination prototype.