
Computational Intelligence and Neuroscience (CIN)

ISSN: 1687-5273, 1687-5265

Hindawi Publishing Corporation

DOI: 10.1155/2016/5403105

Article ID 5403105

Research Article

Exploring the Combination of Dempster-Shafer Theory and Neural Network for Predicting Trust and Distrust

Xin Wang¹,²,³,⁴, Ying Wang²,³, Hongbin Sun¹

Academic Editor: Manuel Graña

¹ School of Computer Technology and Engineering, Changchun Institute of Technology, Changchun 130012, China (ccit.edu.cn)

² College of Computer Science and Technology, Jilin University, Changchun 130012, China (jlu.edu.cn)

³ Key Laboratory of Symbolic Computation and Knowledge Engineering, Ministry of Education, Changchun 130012, China (moe.edu.cn)

⁴ Guangxi Key Laboratory of Trusted Software, Guilin University of Electronic Technology, Guilin 541004, China (gliet.edu.cn)

Received 18 October 2015; Revised 29 December 2015; Accepted 29 December 2015; Published 28 January 2016

This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

In social media, trust and distrust among users are important factors in helping users make decisions, dissect information, and receive recommendations. However, the sparsity and imbalance of social relations make trust and distrust difficult to predict. Meanwhile, numerous inducing factors determine trust and distrust relations, and these factors may be dependent, independent, or conflicting. Dempster-Shafer theory and neural networks are effective and efficient strategies for dealing with these difficulties. In this paper, we study trust and distrust prediction based on the combination of Dempster-Shafer theory and neural network. We first analyze the inducing factors of trust and distrust, namely, homophily, status theory, and emotion tendency. Then, we quantify these inducing factors, take the resulting features as evidences, and construct evidence prototypes as input nodes of a multilayer neural network. Finally, we propose a framework for predicting trust and distrust that uses a multilayer neural network to model the implementation of Dempster-Shafer theory in different hidden layers, aiming to overcome the lack of an optimization method in Dempster-Shafer theory. Experimental results on a real-world dataset demonstrate the effectiveness of the proposed framework.

1. Introduction

With the pervasiveness of social media, more and more users participate in various online activities at an unprecedented rate. Trust plays an important role in helping online users make decisions, dissect relevant and reliable information, and receive recommendations. Distrust represents strong negative feelings and insecurity about a user’s motivation, intention, and behavior. Fortunately, online social media have made it easy for users to indicate whom they trust and whom they do not. However, trust and distrust relations are often too sparse to be accurately predicted [1]. In addition, the finding in [2] shows that the number of negative links is much smaller than the number of positive links. The imbalance and sparsity of social relations make predicting trust and distrust difficult and challenging.

In essence, constructing trust and distrust relations is a complex cognitive and psychological process that involves many theories, such as sociological theory, cognitive theory, and psychological theory. Meanwhile, numerous inducing factors determine trust and distrust relations. However, classical classifiers do not fully take into account the relationships among inducing factors, and their performance degrades dramatically as a result. These relationships may be dependency, independence, or conflict. Dempster-Shafer theory [3] is an effective and efficient strategy for reasoning with uncertainty, with understood connections to other frameworks such as probability, possibility, and imprecise probability theories. The theory allows one to combine evidences from different sources and arrive at a degree of belief that takes into account all the available evidences from multiple information sources. A neural network, in turn, is generally presented as a system of interconnected “neurons” that exchange messages with each other; the connections have numeric weights that can be tuned based on experience, making neural networks adaptive to inputs and capable of learning. Therefore, exploring the combination of Dempster-Shafer theory and neural network can potentially improve the performance and bring new opportunities for trust and distrust prediction.

In this paper, we study predicting trust and distrust based on Dempster-Shafer theory and neural network. In essence, we investigate how to model Dempster-Shafer theory and neural network mathematically and how to incorporate them into predicting trust and distrust. Our contributions are summarized as follows:

(i) Analyze and quantify the inducing factors of trust and distrust, namely, homophily, status theory, and emotion tendency, and verify the motivating observation that these inducing factors contribute to predicting trust and distrust.

(ii) Construct evidence prototypes based on these inducing factors as input nodes of a multilayer neural network, which improves the reliability of evidence features and simplifies the establishment of the Basic Belief Assignment (BBA) in Dempster-Shafer theory.

(iii) Investigate a multilayer neural network that simulates the process of combining evidences from different inducing factors in Dempster-Shafer theory, which not only realizes a complex decision-making process, but also overcomes the lack of an optimization method in Dempster-Shafer theory.

The rest of the paper is organized as follows: Section 2 introduces the related works. Section 3 depicts the inducing factors about trust and distrust. Section 4 explores the novel approach for trust and distrust prediction. Section 5 reports experimental results and performance evaluation. Finally, we summarize our conclusion and suggest possible future directions in Section 6.

2. Related Works

2.1. Trust Prediction

Trust prediction, which explores unknown relations between online users, is an important research topic in social network analysis. Existing approaches can be grouped into two categories: unsupervised methods and supervised methods.

Most unsupervised methods are based on trust propagation. Trust propagation models focus on developing a trust prediction model that propagates trust values through the web of trust. Guha et al. [4] propose a framework of trust propagation schemes in which several atomic propagations, such as direct propagation, co-citation propagation, transpose propagation, and trust coupling propagation, are introduced to reduce the sparsity of a web of trust. Kim and Song [5] propose a trust prediction model that discovers reliable trust paths between two users based on reinforcement learning. Oh et al. [6] propose a probability-based trust prediction model based on trust-message passing, which takes advantage of two kinds of information: explicit information and implicit information. However, because these approaches rely only on the web of trust, which is often too sparse, they cannot predict trust with high accuracy.

Supervised methods first extract features from available sources and then construct a classifier based on the labeled trust relations. Zolfaghar and Aghaie [7] first develop a framework of social trust inducing factors, map these factors to corresponding measurable features, and then apply a C5.0 tree and a neural net, respectively, to predicting trust relations. Ma et al. [8] extract various features based on writer-reviewer interaction and apply these features to personalized and cluster-based classification methods. Liu et al. [9] propose an SVM-based prediction model to infer the trust relation between two users solely based on their individual actions and interactions in an online community.

Most research on trust prediction has not paid sufficient attention to the importance of distrust relations [10]. Therefore, link prediction in signed networks is becoming an emerging research field.

2.2. Links Prediction in Signed Network

A signed network is a special network that contains positive and negative links between nodes. Link prediction in a signed network infers new positive and negative links from the existing positive and negative links. Leskovec et al. [2, 11] first apply status theory to better explain the differences between positive and negative links in online social networks and also use contrasts between balance theory and status theory to draw inferences about how links are being used in particular social computing applications. Yang et al. [12] investigate both unsupervised and semisupervised models to infer signed social ties by capturing the interplay between social relations and users’ decision-making behavior and extend the models to encode general principles from social psychology. Hsieh et al. [13] explore the low-rank matrix completion problem to infer signed networks based on both balance theory and practical points of view. Agrawal et al. [14] propose a more efficient matrix factorization approach for sign inference that recovers the missing links by matrix completion algorithms under certain conditions.

2.3. Dempster-Shafer Theory

The Dempster-Shafer theory [3] is a mathematical theory of evidence, which allows one to combine evidences from different sources and arrive at a belief function by taking into account all the available evidences. As a more flexible mathematical tool, Dempster-Shafer theory not only combines with other mathematical frameworks [15–18], but also combines with classifiers [19–21] for dealing with imprecise and uncertain data. In [18], the authors propose an approach that addresses the issue of managing imprecise and vague information in evidential reasoning by combining the Dempster-Shafer theory with the fuzzy set theory. Xiao et al. [15] extend the research of multiple predictions, which use rough set to determine the weight of each single prediction method and utilize Dempster-Shafer theory as the combination method. Basir et al. [19] propose a systematic framework of DSET based on supervised learning for data fusion. They use Dempster-Shafer theory to address the issues of constructing evidence structures and handling dependence between information sources based on neural networks. Denoeux et al. [20, 21] propose an evidential neural network method for pattern classification problems. The assignment of a pattern to a class is made by the degree of support which is defined as a function of the distance between two vectors.

3. Data Analysis and Motivating Observations

3.1. Data Collection

We use Epinions, a well-known online social network, as the experimental dataset. Users in Epinions not only write reviews and rate items with 1–5 stars, but also rate other users' reviews with tags that range from “not helpful” to “most helpful.” In addition to users' behavioral data, Epinions lets users express trust and distrust relations through their web of trust and block list. Therefore, we first collect the available dataset for this research and pay particular attention to direct or indirect social correlations, such as rating the same items, rating other users’ reviews, and common neighbours. Then, we delete users with fewer than two in-degrees and further filter out users with fewer than two reviews and ratings, aiming to obtain a dataset that is large enough and has sufficient information for the purpose of evaluation. Table 1 shows some statistics of the collected dataset.

Table 1

Statistics of the dataset.

	Epinions
# of users	9718
# of reviews	646201
# of ratings	9160113
# of relations	394595
# of trust relations	330671
# of distrust relations	63924

3.2. Inducing Factors of Trust and Distrust

In view of sociology and psychology, we divide the inducing factors of trust and distrust into different types of features which include homophily, status theory, and emotion tendency. These features refer to network structures and interaction behaviors in Epinions.

3.2.1. Homophily

Homophily is one of the most important social theories that a contact between similar people occurs at a higher rate than among dissimilar people [22]. We study homophily via the correlation between social relations and users similarity. In this paper, we investigate the following four widely used similarity measures for homophily [23].

Rating Similarity (RS). Rating similarity is equivalent to user’s behavior similarity, which can be measured by the cosine similarity of two rating vectors $R_i$ and $R_j$. We define rating similarity as

$$\mathrm{RS}(u_i, u_j) = \frac{\sum_k R_{ik} \cdot R_{jk}}{\sqrt{\sum_k R_{ik}^2}\,\sqrt{\sum_k R_{jk}^2}}. \tag{1}$$

Pearson Correlation Coefficient (PCC). Different users have different rating styles: some users tend to give high ratings to all items, while others tend to rate low. To account for this, the Pearson Correlation Coefficient similarity is defined as

$$\mathrm{PCC}(u_i, u_j) = \frac{\sum_{k \in I(u_i) \cap I(u_j)} (R_{ik} - \bar{R}_i) \cdot (R_{jk} - \bar{R}_j)}{\sqrt{\sum_k (R_{ik} - \bar{R}_i)^2}\,\sqrt{\sum_k (R_{jk} - \bar{R}_j)^2}}, \tag{2}$$

where $R_{ij}$ is the rating of the $j$th item by $u_i$, $I(u_i)$ is the set of items that $u_i$ rates, $\bar{R}_i$ is the average rating of $u_i$, and $k$ ranges over the items rated by both $u_i$ and $u_j$.

Jaccard’s Coefficient (JC). To calculate preference similarity between $u_i$ and $u_j$, we model the items rated by $u_i$ and $u_j$ as two sets $I(u_i)$ and $I(u_j)$, respectively. Then, we utilize Jaccard’s Coefficient [24] to measure the similarity of user preference:

$$\mathrm{JC}(u_i, u_j) = \frac{|I(u_i) \cap I(u_j)|}{|I(u_i) \cup I(u_j)|}, \tag{3}$$

where $|\cdot|$ denotes the size of a set.

Common Neighbours (CN). We exploit user’s in-degree and out-degree to measure a similarity feature based on common neighbours:

$$\mathrm{CN}(u_i, u_j) = \alpha \frac{|Out(u_i) \cap Out(u_j)|}{|Out(u_i) \cup Out(u_j)|} + (1 - \alpha) \frac{|In(u_i) \cap In(u_j)|}{|In(u_i) \cup In(u_j)|}, \tag{4}$$

where $In(u_i)$ denotes the set of incoming links from other users to $u_i$ and $Out(u_i)$ denotes the set of outgoing links from $u_i$ to other users. $\alpha$ denotes the weight coefficient that balances the contributions of outgoing links and incoming links.
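The four homophily measures can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: ratings are assumed to be stored as hypothetical `{item: rating}` dictionaries, link sets as Python sets, unrated items are treated as zeros in the cosine denominator, and the default `alpha=0.5` is an arbitrary choice because the paper does not report its value.

```python
import math


def rating_similarity(ri, rj):
    """Cosine similarity of two rating dicts {item: rating}, as in Eq. (1)."""
    common = set(ri) & set(rj)
    num = sum(ri[k] * rj[k] for k in common)
    den = math.sqrt(sum(v * v for v in ri.values())) * math.sqrt(
        sum(v * v for v in rj.values())
    )
    return num / den if den else 0.0


def pearson_similarity(ri, rj):
    """Pearson correlation over co-rated items, as in Eq. (2)."""
    common = set(ri) & set(rj)
    if not common:
        return 0.0
    mean_i = sum(ri.values()) / len(ri)  # average rating of u_i
    mean_j = sum(rj.values()) / len(rj)  # average rating of u_j
    num = sum((ri[k] - mean_i) * (rj[k] - mean_j) for k in common)
    den = math.sqrt(sum((ri[k] - mean_i) ** 2 for k in common)) * math.sqrt(
        sum((rj[k] - mean_j) ** 2 for k in common)
    )
    return num / den if den else 0.0


def jaccard(a, b):
    """Jaccard's coefficient of two sets, as in Eq. (3)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def common_neighbours(out_i, out_j, in_i, in_j, alpha=0.5):
    """Weighted Jaccard overlap of out-links and in-links, as in Eq. (4).

    alpha=0.5 is an assumed default, not a value taken from the paper.
    """
    return alpha * jaccard(out_i, out_j) + (1 - alpha) * jaccard(in_i, in_j)
```

For example, two users whose rated-item sets are `{1, 2, 3}` and `{2, 3, 4}` share two of four distinct items, so `jaccard` returns 0.5.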

3.2.2. Status Theory

Status theory is developed to help us understand the important role of social status in the formation of trust and distrust relations. In the context of social networks with graph-based representation, we exploit some measures, such as PageRank, PolarityRank [25], and PolarityIndegree, which are suitable for calculating user social status.

PolarityRank is defined as

$$\mathrm{PolarityRank}(u_i) = \frac{PR^{+}(u_i) - PR^{-}(u_i)}{PR^{+}(u_i) + PR^{-}(u_i)}, \tag{5}$$

where $PR^{+}(u_i)$ and $PR^{-}(u_i)$ denote the positive PolarityRank and negative PolarityRank of a node $u_i$, respectively; both are calculated by

$$\begin{aligned}
PR^{+}(u_i) &= \frac{1 - \alpha}{n} + \alpha \left( \sum_{j \in In^{+}(u_i)} \frac{PR^{+}(u_j)}{|Out(u_j)|} - \sum_{j \in In^{-}(u_i)} \frac{PR^{-}(u_j)}{|Out(u_j)|} \right), \\
PR^{-}(u_i) &= \frac{1 - \alpha}{n} + \alpha \left( \sum_{j \in In^{+}(u_i)} \frac{PR^{-}(u_j)}{|Out(u_j)|} - \sum_{j \in In^{-}(u_i)} \frac{PR^{+}(u_j)}{|Out(u_j)|} \right).
\end{aligned} \tag{6}$$

PolarityIndegree. In the context of trust and distrust, a large positive in-degree is not only a form of popularity, but also a sign that the user is likely a reliable information source. Conversely, a large negative in-degree has a negative impact. Therefore, we employ the sigmoid function to measure PolarityIndegree based on positive and negative in-degree:

$$PI^{+}(u_i) = \frac{1}{1 + e^{-\alpha (|In^{+}(u_i)| - \mu)}}, \qquad PI^{-}(u_i) = \frac{1}{1 + e^{-\alpha (|In^{-}(u_i)| - \mu)}}. \tag{7}$$

Finally, we normalize $PI^{+}(u_i)$ and $PI^{-}(u_i)$ to obtain the final strength of the user’s social status:

$$\mathrm{PolarityIndegree}(u_i) = \frac{PI^{+}(u_i) - PI^{-}(u_i)}{PI^{+}(u_i) + PI^{-}(u_i)}. \tag{8}$$
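A minimal Python sketch of the PolarityIndegree computation follows, assuming the exponent in (7) reads $-\alpha(|In^{\pm}(u_i)| - \mu)$. The paper does not report values for the sigmoid slope $\alpha$ or the offset $\mu$, so the defaults below are placeholders chosen only for illustration.

```python
import math


def polarity_indegree(pos_in_degree, neg_in_degree, alpha=1.0, mu=0.0):
    """Normalized difference of sigmoid-squashed in-degrees, Eqs. (7)-(8).

    alpha and mu are assumed placeholder values, not taken from the paper.
    """
    pi_pos = 1.0 / (1.0 + math.exp(-alpha * (pos_in_degree - mu)))
    pi_neg = 1.0 / (1.0 + math.exp(-alpha * (neg_in_degree - mu)))
    return (pi_pos - pi_neg) / (pi_pos + pi_neg)
```

Because both sigmoid values lie strictly between 0 and 1, the result lies strictly between -1 and 1: it is positive when the user receives more trust links than distrust links and zero when the two counts are equal.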

3.2.3. Emotion Tendency

Social interaction with emotion tendency (ET) is an effective and important way to overcome the imbalance and sparsity of social relations. We explore two features, namely, ET of $\mathrm{avg}R_{\mathrm{trustor}}$ and ET of $\mathrm{avg}R_{\mathrm{trustee}}$, which quantify emotion tendency by comparing a user's helpfulness ratings with the average score of the trustor’s ratings and of the trustee’s ratings. The helpfulness rating set is represented as $R_{ij} = \{r_{ij1}, r_{ij2}, \ldots, r_{ijm}\}$. We define the positive emotion tendency $ET^{+}$ and the negative emotion tendency $ET^{-}$ as the fractions of ratings that are higher and lower, respectively, than the average rating score $\bar{R}_{\mathrm{trustor}}$ or $\bar{R}_{\mathrm{trustee}}$:

$$ET^{+}(u_i, u_j) = \frac{|\{r_{ijk} : r_{ijk} - \bar{R} > 0\}|}{|R_{ij}|}, \qquad ET^{-}(u_i, u_j) = \frac{|\{r_{ijk} : r_{ijk} - \bar{R} < 0\}|}{|R_{ij}|}, \tag{9}$$

where $r_{ijk}$ denotes the helpfulness rating that $u_i$ gives to the $k$th review of $u_j$ and $\bar{R}$ denotes either $\bar{R}_{\mathrm{trustor}}$ or $\bar{R}_{\mathrm{trustee}}$.
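As an illustration of (9), the sketch below counts the helpfulness ratings above and below a reference average. The function name and the list-based input are assumptions made for the example; the reference average can be either the trustor's or the trustee's mean rating.

```python
def emotion_tendency(helpfulness_ratings, reference_average):
    """Return (ET+, ET-): fractions of ratings above and below the average.

    helpfulness_ratings: ratings u_i gave to the reviews of u_j.
    reference_average: mean rating of the trustor or of the trustee.
    """
    total = len(helpfulness_ratings)
    if total == 0:
        return 0.0, 0.0  # no interaction, so no tendency can be measured
    above = sum(1 for r in helpfulness_ratings if r - reference_average > 0)
    below = sum(1 for r in helpfulness_ratings if r - reference_average < 0)
    return above / total, below / total
```

With ratings `[5, 4, 2, 3]` and a reference average of 3.0, two ratings lie above and one below, giving `(0.5, 0.25)`; the rating equal to the average contributes to neither tendency.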

4. Framework of Predicting Trust and Distrust

4.1. Problem Statement

The problem we study in this paper is to predict trust and distrust based on the binary classification through combining neural network and Dempster-Shafer theory in a supervised way. In this subsection, we first present the notations and then formally define the problem of classification on trust and distrust and depict the general framework in Figure 1.

Figure 1

The general framework of predicting trust and distrust.

We use the uppercase letter $X \in \mathbb{R}^n$ to denote an input set $X = \{x_1, x_2, \ldots, x_n\}$, the lowercase letter $x$ to denote an inducing factor of trust and distrust, and the uppercase letter $C$ to denote an output set $C = \{c_1, c_2, \ldots, c_m\}$. The final predicted trust and distrust relations are two class labels, denoted as class $c_1$ and class $c_2$, respectively. In Dempster-Shafer theory, we first define a frame of discernment for trust and distrust, denoted $\Theta = \{\mathit{trust}, \mathit{distrust}\}$. Then, each inducing factor $x$ in the input set $X$ is mapped to an evidence prototype, denoted $EP$. Finally, we construct the Basic Belief Assignment (BBA, also called mass function) between the evidence prototype $EP$ and the target classes $c$, denoted $m$. To overcome the lack of an optimization method in Dempster-Shafer theory, we introduce a multilayer neural network that dynamically adjusts weights through supervised learning; all weights are denoted $\omega$.

With the notations above, we formally define classification problem of predicting trust and distrust as follows: given a dataset of all messages with input set X of inducing factors and corresponding class labels C , on the basis of original features of inducing factors, a series of basic units in our framework are constructed, namely, input unit, evidence processing unit, mass combining unit, fusing unit, and decision unit. Evidence processing unit and fusing unit are, respectively, two smaller multilayer neural networks for handling potential dependence and conflict among evidences and fusing multisource evidences. Finally, all units constitute a larger multilayer neural network as classifier. We aim to learn a classifier to automatically assign class labels for unknown social relations (i.e., test data).

4.2. Basic Units of Framework

4.2.1. Input Unit

Input unit (IU) maps the initial input set to the evidence prototype, where the evidence prototype is a representative feature set of quantitative intervals of all inducing factors, denoted as $EP = \{ep_1, ep_2, \ldots, ep_{m^n}\}$, where $ep_i$ denotes the quantitative interval of the $i$th feature. $ep_i$ is determined by the prediction accuracy of trust and distrust in Section 5.2. With $n$ inducing factors and $m$ quantitative intervals for each inducing factor, there are $m^n$ evidence prototypes.

4.2.2. Evidence Processing Unit

Evidence processing unit (EPU) is a smaller multilayer neural network of the form $n$-$h$-$1$ that handles potential dependence and conflict among evidences by dynamically adjusting weights through supervised learning, where $n$ denotes the number of input nodes and $h$ denotes the number of hidden nodes. Each input node corresponds to a focal element of one evidence source. The output node is the processing result of $n$ evidence sources. $\omega_{\cdot,j}$ and $\omega_{j,\cdot}$ are the connection weight sets from the input nodes to hidden node $j$ and from hidden node $j$ to the output node, respectively. $f(\cdot)$ is the activation function associated with the hidden nodes and the output node. At both the input-to-hidden and the hidden-to-output layers, the activation function is chosen as the logistic sigmoid function, aiming to deal with dependence and conflict between evidence sources through the nonlinear approximation capability of the multilayer neural network. Thus, the output of the activation function at the hidden-to-output layer is the processed result of the evidence processing unit.

4.2.3. Mass Combining Unit

Mass combining unit (MCU) is composed of $2^c - 1$ sum nodes and normalization nodes, where each input node corresponds to the output result of an evidence processing unit; the output node is the normalized combined mass. Due to the requirement of independent sources, we exploit the operation of sum and normalization to achieve Dempster’s combination rule on the basis of all evidence processing units. The combination rule of multiple evidences is defined as

$$\bigoplus_{i=1}^{n} m_i(A) = \frac{\sum_{A_1 \cap \cdots \cap A_n = A} \prod_{i=1}^{n} m_i(A_i)}{\sum_{A_1 \cap \cdots \cap A_n \neq \emptyset} \prod_{i=1}^{n} m_i(A_i)}. \tag{10}$$
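Dempster's combination rule (10) can be illustrated directly on the two-element frame used in this paper. The sketch below is a plain implementation of the rule itself, not of the neural MCU; representing a mass function as a dictionary from `frozenset` focal elements to masses is an assumption made for the example.

```python
from itertools import product

TRUST = frozenset({"trust"})
DISTRUST = frozenset({"distrust"})
THETA = TRUST | DISTRUST  # the whole frame of discernment


def dempster_combine(*mass_functions):
    """Combine mass functions {focal_set: mass} with Dempster's rule, Eq. (10)."""
    combined = {}
    # Enumerate every choice of one focal element per evidence source.
    for choice in product(*(m.items() for m in mass_functions)):
        intersection = THETA
        weight = 1.0
        for focal_set, mass in choice:
            intersection = intersection & focal_set
            weight *= mass
        if intersection:  # empty intersections are conflict and are dropped
            combined[intersection] = combined.get(intersection, 0.0) + weight
    total = sum(combined.values())
    if total == 0.0:
        raise ValueError("sources are in total conflict; the rule is undefined")
    return {focal_set: mass / total for focal_set, mass in combined.items()}
```

For instance, combining `{TRUST: 0.6, THETA: 0.4}` with `{TRUST: 0.5, DISTRUST: 0.3, THETA: 0.2}` discards the conflicting product 0.6 × 0.3 = 0.18 and renormalizes by 0.82, so the combined mass on trust is 0.62 / 0.82.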

4.2.4. Fusing Unit

Fusing unit (FU) combines multiple evidences of the same kind (or level) by using a multilayer neural network, which is composed of $2^{n+1} - 1$ evidence processing units and a mass combining unit, where $n$ is the number of evidences. In practice, fusing units can iteratively combine multiple evidences in a hierarchical form, where the local or low-level evidences are fused at the earlier stages and the global or top-level evidences are obtained at the final stages. Each fusing unit is trained separately using the outputs from its previous stages.

4.2.5. Decision Unit

Decision unit (DU) is used to make the final decision based on the maximum of pignistic probability, where input nodes are the normalized masses of the fusing unit, and output nodes are associated with target class labels. In our framework, DU is applied into the training process of local fusing and the final decision of global fusing.

4.3. Framework Architecture

The framework architecture, shown in Figure 2, consists of an input layer, a fusing layer, and a decision layer.

Figure 2

The framework architecture.

Layer 1: Input Layer. We first choose the representative features from each type of inducing factor as evidences. Then, we calculate the distance between the input set $X$ and every evidence prototype and select the prototype $ep_i$ with the minimum distance to stand in for the initial input set $X$:

$$EP_i = \min_{j = 1, \ldots, m^n} \lVert X - ep_j \rVert. \tag{11}$$

Finally, we establish the Basic Belief Assignment (BBA, or mass function) $m_i$, which summarizes the information provided by a decreasing function of distance $\psi(\cdot)$ and the degree of class membership $\mu$. The monotonically decreasing function $\psi(d_i)$ of the distance $d(x_i, I_{ep_i})$ between the $i$th element $x_i$ of the input set and the quantitative interval $I_{ep_i}$ of the $i$th feature is defined in exponential form as

$$\psi(d_i) = \exp\bigl(-d(x_i, I_{ep_i})\bigr), \tag{12}$$

where $\psi(0) = 1$ and $\lim_{d \to \infty} \psi(d) = 0$. The distance function $d(x_i, I_{ep_i})$ is defined as $(a_{\max} - x_i)/(a_{\max} - a_{\min})$, where $a_{\max}$ and $a_{\min}$ are the upper bound and lower bound of the quantitative interval of the evidence prototype $ep_i$, respectively.

In addition, we focus on the correlation between evidence prototype and class membership. We regard class membership as the discriminant degree for pattern classification and assume that each evidence prototype is assigned a value $\mu_{q,i}$, which represents the degree to which feature prototype $ep_i$ belongs to class $c_q$. The final $m_i$ can be written as

$$m_i(\{c_q\}) = \mu_{q,i} \cdot \psi(d_i), \qquad m_i(\Theta) = 1 - \mu_{q,i} \cdot \psi(d_i), \tag{13}$$

where the class memberships satisfy the constraint $\sum_{q=1}^{M} \mu_{q,i} = 1$. As a special case, if an evidence prototype fully belongs to a class $c_q$, then $\mu_{q,i} = 1$ and $\mu_{k,i} = 0$ for every other class $k \neq q$.
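The construction in (12) and (13) can be sketched as follows. Two points are assumptions of this example rather than statements from the paper: the distance is taken as $(a_{\max} - x_i)/(a_{\max} - a_{\min})$, and when memberships are spread over several classes the mass left on $\Theta$ is computed as one minus the sum of all class masses, which reduces to (13) when a prototype fully belongs to one class. The label `"Theta"` and the function names are hypothetical.

```python
import math

THETA = "Theta"  # label standing for the whole frame {trust, distrust}


def interval_distance(x, a_min, a_max):
    """Distance of x from the upper bound of the interval, scaled by its width."""
    return (a_max - x) / (a_max - a_min)


def basic_belief_assignment(x, a_min, a_max, memberships):
    """Build a mass function from a feature value and class memberships.

    memberships: {class_label: mu}, expected to sum to 1.
    """
    psi = math.exp(-interval_distance(x, a_min, a_max))  # Eq. (12)
    masses = {label: mu * psi for label, mu in memberships.items()}  # Eq. (13)
    masses[THETA] = 1.0 - sum(masses.values())  # remaining mass is ignorance
    return masses
```

A feature value at the upper bound of its interval has distance 0, so all mass goes to the classes; as the value moves away, more mass shifts to $\Theta$, that is, to ignorance.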

Thus, each mass function is regarded as the output node $I_i$, which corresponds to a focal element of one evidence source. There are $n \times (2^c - 1)$ input nodes in the input layer, where $n$ is the number of evidences and $c$ is the number of classes.

Layer 2: Fusing Layer. We utilize some fusing units to combine multiple evidences in a hierarchical form that begins from local and low-level evidences and then fuses global and top-level evidences. The advantage of fusing unit lies in flexible fusing strategies according to different types of evidences. Therefore, we adopt two strategies of fusing evidences: one is based on the different types of inducing factors; another is according to different types of data attributes, which include network-based and rating-based. The difference of two strategies is that all initial evidences are combined in the local or low-level fusing stage, while both are the same in the global fusing. For the first fusing strategy, three two-source FU modules are applied into combining the same type of inducing factors in local fusing stage. For the second fusing strategy, evidences are fed into two two-source FU for fusing the same type of data attributes in local or low-level fusing stage. The evidence types of two different fusing strategies are shown in Table 2. In global or top-level fusing stage, all evidences obtained from local or low-level fusing stage are combined into the global fusing unit FU simultaneously.

Table 2

The evidence types of two different fusing strategies.

Fusing strategies	Evidence type	Feature name
Inducing factors	Homophily	RS, PCC
Inducing factors	Social status	PolarityRank, PolarityIndegree
Inducing factors	Emotion tendency	ET of avgR_trustor, ET of avgR_trustee
Data attributes	Network-based	CN, PolarityRank
Data attributes	Rating-based	RS, ET of avgR_trustor

The number of units for two different fusing ways is given in Table 3.

Table 3

The number of units for two different fusing strategies.

Fusing strategies	Local FU	Local EPU	Local MCU	Global FU	Global EPU	Global MCU
Inducing factors	3	21	3	1	15	1
Data attributes	2	14	2	1	7	1
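As a quick consistency check on Table 3, assume that a fusing unit combining $n$ evidences contains $2^{n+1} - 1$ evidence processing units (the reading of the count in Section 4.2.4 adopted here) and one mass combining unit. Under that assumption the EPU entries follow by arithmetic:

```latex
\begin{aligned}
\text{two-source FU } (n = 2):\quad & 2^{3} - 1 = 7 \text{ EPUs},\\
\text{inducing factors, local (3 FUs)}:\quad & 3 \times 7 = 21,\\
\text{data attributes, local (2 FUs)}:\quad & 2 \times 7 = 14,\\
\text{inducing factors, global } (n = 3):\quad & 2^{4} - 1 = 15,\\
\text{data attributes, global } (n = 2):\quad & 2^{3} - 1 = 7.
\end{aligned}
```

The global fusing unit receives one fused evidence per local FU, which is why $n = 3$ for the inducing-factor strategy and $n = 2$ for the data-attribute strategy.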

Layer 3: Decision Layer. After fusing the local and global evidences, we obtain three mass functions $m(\{\mathit{trust}\})$, $m(\{\mathit{distrust}\})$, and $m(\{\mathit{trust}, \mathit{distrust}\})$. Each output node in this layer is associated with a class. Each node $D$ receives two incoming masses from the fusing layer. The final decision is made by selecting the class with the maximum probability:

$$\omega = \arg\max_{\omega_n \in \Theta} D_n. \tag{14}$$
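The decision step can be sketched with the standard pignistic transformation, which splits the mass of each focal element equally among its members; this matches the description above, where each decision node receives the mass of its own singleton plus a share of $m(\{\mathit{trust}, \mathit{distrust}\})$. The dictionary representation and function names are assumptions of this example.

```python
def pignistic_probability(mass_function):
    """Spread each focal element's mass evenly over the classes it contains."""
    betp = {}
    for focal_set, mass in mass_function.items():
        for label in focal_set:
            betp[label] = betp.get(label, 0.0) + mass / len(focal_set)
    return betp


def decide(mass_function):
    """Return the class with the maximum pignistic probability, as in Eq. (14)."""
    betp = pignistic_probability(mass_function)
    return max(betp, key=betp.get)
```

With $m(\{\mathit{trust}\}) = 0.5$, $m(\{\mathit{distrust}\}) = 0.2$, and $m(\{\mathit{trust}, \mathit{distrust}\}) = 0.3$, the pignistic probabilities are 0.65 for trust and 0.35 for distrust, so the decision is trust.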

4.4. Parameter Learning

To apply the proposed framework, the parameter learning is required in each FU. The learning process is achieved in a cascading way. In the local or low-level fusing stage, the FU combined with DU is trained by using corresponding training dataset. In the global or top-level fusing stage, each FU is trained based on the estimated masses, which are transferred from the best outputs of FUs in the local or low-level stage.

Each FU in the fusing layer has similar parameters and a similar learning procedure, so we take an arbitrary FU as an example. $\omega_{\cdot,j}$ and $\omega_{j,\cdot}$ are the connection weight sets from the input nodes to hidden node $j$ and from hidden node $j$ to the output node, respectively. Each hidden node performs the weighted sum of its inputs to form its net activation $net_j$, which is defined as the inner product of the inputs with the weights at the hidden node:

$$net_j = \sum_{i=1}^{d} m_i \omega_{ji} + \omega_{j0} = \sum_{i=0}^{d} m_i \omega_{ji} = \mathbf{w}_j^{t} \mathbf{m}, \tag{15}$$

where $m_i$ denotes the $i$th mass function on the input layer, with $m_0 = 1$ absorbing the bias $\omega_{j0}$, and $\omega_{ji}$ denotes the weight from input node $i$ to hidden node $j$ in the input-to-hidden layer. $\mathbf{w}_j$ and $\mathbf{m}$ are the corresponding vector forms. The weight settings help to improve prediction accuracy and to speed up the convergence of the iterative algorithm. The initial weights $\mathbf{w}$ are set according to the feature effectiveness of inducing factors in our experiments. Each hidden node emits an output that is a nonlinear function $f(\cdot)$ of its activation:

$$y_j = f(net_j), \tag{16}$$

where we choose the logistic sigmoid function as the nonlinear activation function.

Each output node also calculates its net activation in the hidden-to-output layer:

$$net_k = \sum_{j=1}^{h} y_j \omega_{kj} + \omega_{k0} = \sum_{j=0}^{h} y_j \omega_{kj} = \mathbf{w}_k^{t} \mathbf{y}, \tag{17}$$

where $k$ denotes the index of the node in the output layer, $h$ denotes the number of hidden nodes, $y_j = f(net_j)$, and $y_0 = 1$ absorbs the bias $\omega_{k0}$.

Similar to the hidden node, each output node is calculated by using the nonlinear activation function:

$$z_k = f(net_k). \tag{18}$$

The learning parameters are estimated and adjusted by using a multilayer neural network with the least squared error function. The error function is based on minimizing the sum of squared differences between the actual output $z_k$ and the desired output $t_k$, which is defined as

$$\min_{\omega} J(\omega) = \frac{1}{2} \sum_{k=1}^{c} (t_k - z_k)^2, \tag{19}$$

where $\omega$ denotes all the weights in the multilayer neural network. To avoid overfitting, we add two smoothness regularizations on the weights $\omega_{\cdot,j}$ and $\omega_{j,\cdot}$; then we have

$$\min_{\omega_{\cdot,j},\, \omega_{j,\cdot}} J(\omega_{\cdot,j}, \omega_{j,\cdot}) = \frac{1}{2} \sum_{k=1}^{c} (t_k - z_k)^2 + \frac{\alpha}{2} \sum_{i} \omega_{\cdot,j}^2 + \frac{\beta}{2} \sum_{k} \omega_{j,\cdot}^2, \tag{20}$$

where $\alpha$ and $\beta$ are the regularization parameters, which trade off the error cost against the penalty on large weights.

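Then, the back propagation algorithm, which is based on gradient descent of the error function with respect to the weights $\omega_{\cdot,j}$ and $\omega_{j,\cdot}$, is given by

$$\omega_{\cdot,j} \longleftarrow \omega_{\cdot,j} - \eta \frac{\partial J}{\partial \omega_{\cdot,j}} - \eta \alpha\, \omega_{\cdot,j}, \qquad \omega_{j,\cdot} \longleftarrow \omega_{j,\cdot} - \eta \frac{\partial J}{\partial \omega_{j,\cdot}} - \eta \beta\, \omega_{j,\cdot}, \tag{21}$$

where $J$ is the squared-error term of (19) and $\eta$ is the learning rate, which determines how much an updating step influences the current weights. The terms $\eta \alpha\, \omega_{\cdot,j}$ and $\eta \beta\, \omega_{j,\cdot}$ come from the regularizers in (20) and cause each weight to decay in proportion to its size at every iteration.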

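Next, we turn to the problem of evaluating $\partial J / \partial \omega_{\cdot,j}$ and $\partial J / \partial \omega_{j,\cdot}$ in (21). As the operations in MCU and DU are fixed, weight changes are limited to the EPUs with their multilayer neural networks. Thus, we can apply the chain rule to obtain the partial derivatives of the error function (19) with respect to the hidden-to-output weights $\omega_{j,\cdot}$ and the input-to-hidden weights $\omega_{\cdot,j}$:

$$\begin{aligned}
\frac{\partial J}{\partial \omega_{j,\cdot}} &= \frac{\partial J}{\partial z_k} \frac{\partial z_k}{\partial net_k} \frac{\partial net_k}{\partial \omega_{j,\cdot}} = -(t_k - z_k)\, f'(net_k)\, y_j, \\
\frac{\partial J}{\partial \omega_{\cdot,j}} &= \frac{\partial J}{\partial y_j} \frac{\partial y_j}{\partial net_j} \frac{\partial net_j}{\partial \omega_{\cdot,j}} = -m_i\, f'(net_j) \sum_{k} \omega_{kj} (t_k - z_k)\, f'(net_k).
\end{aligned} \tag{22}$$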
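The learning procedure of one evidence processing unit can be sketched as a small $n$-$h$-$1$ sigmoid network trained by gradient descent with weight decay. This is an illustrative sketch under stated assumptions, not the authors' code: the gradient of the squared error is taken as $-(t_k - z_k) f'(net_k) y_j$, biases are stored as the first entry of each weight list and are decayed like the other weights for simplicity, and the learning rate and initial weights in the usage note are arbitrary.

```python
import math


def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))


def forward(masses, w_hidden, w_out):
    """Forward pass: w_hidden[j] = [bias, w_j1..w_jd], w_out = [bias, w_1..w_h]."""
    y = [sigmoid(w[0] + sum(wi * mi for wi, mi in zip(w[1:], masses)))
         for w in w_hidden]
    z = sigmoid(w_out[0] + sum(wk * yj for wk, yj in zip(w_out[1:], y)))
    return y, z


def train_step(masses, target, w_hidden, w_out, eta=0.5, alpha=0.0, beta=0.0):
    """One gradient-descent update with weight decay; returns new weights."""
    y, z = forward(masses, w_hidden, w_out)
    delta_k = -(target - z) * z * (1.0 - z)  # dJ/dnet_k, using f' = f(1 - f)
    grad_out = [delta_k] + [delta_k * yj for yj in y]
    new_hidden = []
    for j, w in enumerate(w_hidden):
        # Back-propagate through the hidden-to-output weight of hidden node j.
        delta_j = delta_k * w_out[j + 1] * y[j] * (1.0 - y[j])  # dJ/dnet_j
        grad = [delta_j] + [delta_j * mi for mi in masses]
        new_hidden.append([wi - eta * g - eta * alpha * wi
                           for wi, g in zip(w, grad)])
    new_out = [wk - eta * g - eta * beta * wk for wk, g in zip(w_out, grad_out)]
    return new_hidden, new_out
```

Repeatedly calling `train_step` on a single example, for instance masses `[0.6, 0.1, 0.3]` with target 1.0, moves the network output toward the target, which is the behaviour the update rule is meant to produce.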

5. Experiments

5.1. Experimental Settings and Evaluation Metric

The experimental settings of trust and distrust prediction are as follows. We first filter out some user pairs, such as pairs whose homophily similarity is equal to zero and users without any helpfulness ratings for emotion tendency. Then, we adopt the methodology of Guha et al. [4] and construct a balanced dataset with equal numbers of trust and distrust relations. Finally, let $A = \{\langle u_i, u_j \rangle \mid G(i, j) = 1 \text{ or } -1\}$ be the set of user pairs with trust and distrust relations. We randomly divide the set $A$ into two parts $L$ and $N$: $x\%$ of $A$, denoted as $L$, is chosen as the training dataset, and the remaining $(100 - x)\%$ of $A$, denoted as $N$, is designated as the testing dataset, whose relation signs are hidden by setting $G(i, j) = 0$.

We follow the common metric to evaluate the effectiveness of inducing factors and the proposed framework. In detail, we take the signs of relations obtained by the learning model as the set of predicted results, denoted as $R$. Then, the prediction accuracy (PA) [2] is defined as $|N \cap R| / |N|$, where $|\cdot|$ denotes the size of a set. We conduct all experiments 5 times to ensure that our results are reliable and average the results on the evaluation metric.
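The split and the metric can be sketched as below. The function names, the fixed random seed, and the dictionary representation of signed pairs are assumptions made for the example; $|N \cap R|$ is interpreted as the number of test pairs whose predicted sign matches the hidden sign.

```python
import random


def split_pairs(pairs, x, seed=0):
    """Randomly split pairs into x% training (L) and the rest testing (N)."""
    shuffled = list(pairs)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * x / 100)
    return shuffled[:cut], shuffled[cut:]


def prediction_accuracy(true_signs, predicted_signs):
    """Fraction of test pairs whose predicted sign equals the hidden sign.

    Both arguments map a user pair (i, j) to +1 (trust) or -1 (distrust).
    """
    correct = sum(1 for pair, sign in true_signs.items()
                  if predicted_signs.get(pair) == sign)
    return correct / len(true_signs)
```

Averaging `prediction_accuracy` over five calls to `split_pairs` with different seeds would correspond to the five repetitions described above.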

5.2. Feature Effectiveness of Inducing Factors

We rank all user pairs in $A$ in descending order of the feature weights of inducing factors, take the first $x\%$ of user pairs as trust relations, and take the last $x\%$ of user pairs as distrust relations, where $x$ is varied over $\{5, 10, 15, 20, 25, 30, 35, 40, 45, 50\}$. We compare the predicted results with the original signs of relations to measure the prediction accuracy. For comparison, we use random guessing as a baseline method. All the features of inducing factors help to construct mass functions and initialize the weights of hidden nodes in the local fusing stage. The evaluated results are illustrated in Figures 3–5, respectively. We draw the following observations.

Figure 3

Prediction accuracy of trust and distrust about homophily. (a) Trust relations. (b) Distrust relations.

Figure 4

Prediction accuracy of trust and distrust about social status. (a) Trust relations. (b) Distrust relations.

Figure 5

Prediction accuracy of trust and distrust relations about emotion tendency. (a) Trust relations. (b) Distrust relations.

(1) Homophily. In Figures 3(a) and 3(b), RS obtains the best performance among the four measures of the homophily coefficient. We also note that RS and PCC always perform better than CN, JC, and random guessing. This suggests that helpfulness ratings are more effective than network topology in predicting social relations. Meanwhile, since user pairs with distrust relations also have a certain degree of similarity, trust prediction outperforms distrust prediction for homophily.

(2) Social Status. In Figures 4(a) and 4(b), social status predicts trust relations less effectively than distrust relations. The main reason is that users with high status are more active: they establish far more trust and distrust relations than users with low status. We also note that PolarityRank and PolarityIndegree perform very similarly, and both perform much better than PageRank.

(3) Emotion Tendency. In Figures 5(a) and 5(b), since users are likely to give positive ratings (the majority of ratings are 4 or 5 stars), predicting trust relations based on emotion tendency is somewhat ambiguous and incurs a certain error rate. However, negative emotion tendency is clearly more discriminative than positive emotion tendency, so distrust relations are predicted comparatively reliably. In addition, ET of avgRtrustor performs better than ET of avgRtrustee in both trust and distrust prediction.

On the whole, the feature effectiveness of the inducing factors decreases as $x$ increases but remains much better than that of random guessing. This is consistent with the previous observation that trust and distrust are easy to distinguish in the head and tail regions of the ranking, whereas they are confused and difficult to separate in the middle region. In addition to comparing the effectiveness of different inducing factors, our other objective is to discretize the features of inducing factors and to determine the quantitative intervals of the evidence prototype in our proposed framework. We stratify the features of inducing factors into $m$ levels according to the acceptable prediction accuracy. In our experiments, we choose $m = 3$ for all cases and divide feature strength into three levels: high, medium, and low. Figure 6 shows the discretization result of the feature prototype when the acceptable prediction accuracy is set to 80%, from which we obtain the order of feature effectiveness. For trust relations, the order of feature effectiveness is ET of avgRtrustor, ET of avgRtrustee, RS, PCC, PolarityRank, PolarityIndegree, and so on. For distrust relations, ET of avgRtrustor, ET of avgRtrustee, PolarityRank, and PolarityIndegree are the more effective features. In the local fusing state, the weight settings of hidden nodes in the EPU are based on this ranking of feature effectiveness.
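The ranking-based evaluation used throughout this subsection can be sketched as follows. The code is a simplified illustration, not the authors' implementation: `scores` stands for the feature weight of one inducing factor per user pair, `signs` holds the original relation signs, and both the function name and the toy values are hypothetical.

```python
def top_tail_accuracy(scores, signs, x_percent):
    """Label the first x% of ranked pairs as trust and the last x% as distrust.

    Returns (trust accuracy, distrust accuracy) against the original signs.
    """
    ranked = sorted(scores, key=scores.get, reverse=True)  # descending feature weight
    n = max(1, round(len(ranked) * x_percent / 100))
    head, tail = ranked[:n], ranked[-n:]
    trust_acc = sum(signs[p] == 1 for p in head) / n
    distrust_acc = sum(signs[p] == -1 for p in tail) / n
    return trust_acc, distrust_acc

# Toy data: ten user pairs, balanced between trust (+1) and distrust (-1)
scores = {"p%d" % i: 0.9 - 0.1 * i for i in range(10)}
signs = {"p0": 1, "p1": 1, "p2": -1, "p3": 1, "p4": 1,
         "p5": -1, "p6": -1, "p7": 1, "p8": -1, "p9": -1}

for x in (20, 30, 50):
    print(x, top_tail_accuracy(scores, signs, x))
```

In this toy example the accuracy is highest for small `x` and drops as pairs from the middle of the ranking are included, which is the same qualitative pattern reported above.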

Figure 6

The discretization result of evidence prototype.

5.3. Impact in Different Types of Evidences

Having analyzed the effectiveness of all features, we use the logistic regression method of LIBLINEAR (http://www.csie.ntu.edu.tw/~cjlin/liblinear/) to evaluate the impact of different types of evidence. The results help to initialize the weight settings of hidden nodes in the global fusing state. In detail, we compare two ways of fusing evidence: one groups evidence by type of inducing factor; the other groups evidence by type of data source, such as network structure and interaction behaviors. In both groups of experiments, we vary the percentage of labeled relations used for training from 10% to 50%.

Table 4 lists the classification performance for the different inducing factors. Emotion tendency achieves the highest performance, reaching 0.845 with 30% labeled relations and stabilizing thereafter. Homophily performs slightly worse than the other two inducing factors at every percentage of labeled relations. This is consistent with our previous observation that homophily does not discriminate well between trust and distrust.

Table 4

Classification performance of different types of inducing factors.

Type of inducing factor	Percentage of labeled relations
Type of inducing factor	10%	20%	30%	40%	50%
Homophily	0.542	0.652	0.705	0.746	0.745
Social status	0.585	0.702	0.726	0.766	0.764
Emotion tendency	0.679	0.785	0.845	0.840	0.843

We integrate the features of CN and PolarityRank into the network structure group and the features of RS and emotion tendency into the interaction behaviors group. Table 5 shows the classification performance of the two types of data sources. The classifier achieves higher prediction accuracy with interaction behaviors than with network structure; specifically, interaction behaviors improve the prediction accuracy by 0.05–0.12 over network structure. In addition, comparing Tables 4 and 5, fusing by inducing factor outperforms fusing by data source.

Table 5

Classification performance of different types of data sources.

Type of data source	Percentage of labeled relations
Type of data source	10%	20%	30%	40%	50%
Network structure	0.537	0.650	0.721	0.763	0.760
Interaction behavior	0.652	0.724	0.813	0.810	0.812
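The improvement range quoted for Table 5 can be recomputed directly from the table. The short check below simply transcribes the two rows; the variable names are illustrative.

```python
# Prediction accuracy from Table 5 at 10%, 20%, 30%, 40%, and 50% labeled relations
network_structure = [0.537, 0.650, 0.721, 0.763, 0.760]
interaction_behavior = [0.652, 0.724, 0.813, 0.810, 0.812]

# Per-column gain of interaction behaviors over network structure
gains = [round(b - a, 3) for a, b in zip(network_structure, interaction_behavior)]
print(gains, min(gains), max(gains))
```

The smallest and largest gains are 0.047 and 0.115, which round to the 0.05–0.12 range stated in the text.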

5.4. Performance of the Proposed Framework

To demonstrate the effectiveness of the proposed framework, we compare it with several baseline methods:

(i) Random: random guessing yields 50% prediction accuracy, since trust and distrust relations in the dataset are balanced.

(ii) SVM: we apply an SVM classifier to the features of inducing factors, in the same way as the logistic regression model.

(iii) DT: a C5.0 decision tree is constructed by exploiting information gain on the features of inducing factors.

(iv) NN: we use a multilayer neural network as the multisource data fusion tool, where the input nodes correspond to the features of inducing factors.

(v) DS: previous work [26] applies Dempster-Shafer theory to Epinions to predict trust and distrust, using only rating data without a web of trust.

(vi) CDN: our proposed approach, which combines Dempster-Shafer theory and neural network to predict trust and distrust.

Because social relations and interaction data are sparse, we use only 40% of the labeled data for training. Table 6 compares the performance of the different classification methods. Our approach, CDN, improves the prediction accuracy by 0.067–0.113 over the DT and SVM methods. When all features of inducing factors are integrated into the SVM classifier, performance does not improve significantly over the previous logistic regression. In addition, since a feature ranking technique is applied in the C5.0 decision tree, DT performs better than SVM (+0.046).

Table 6

Performance comparison of different classification methods.

Evaluation metric	Classifier technique
Evaluation metric	Random	SVM	DT	CDN
PA	0.50	0.801	0.847	0.914

Table 7 compares the performance of the different data fusion methods. Our proposed framework CDN clearly outperforms the other baseline methods: it improves the prediction accuracy by 0.045–0.079 over the NN and DS methods. From Tables 6 and 7, we conclude that our approach clearly improves the prediction of trust and distrust.

Table 7

Performance comparison of different fusion methods.

Evaluation metric	Data fusion methods
Evaluation metric	NN	DS	CDN
PA	0.869	0.835	0.914
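The improvements quoted for Tables 6 and 7 follow directly from the reported PA values, as the short check below shows. The dictionary transcribes the two tables; the variable names are illustrative.

```python
# PA values reported in Tables 6 and 7 (40% of labeled data used for training)
pa = {"Random": 0.50, "SVM": 0.801, "DT": 0.847, "NN": 0.869, "DS": 0.835, "CDN": 0.914}

# Gain of CDN over each baseline
cdn_gain = {name: round(pa["CDN"] - value, 3) for name, value in pa.items() if name != "CDN"}
dt_over_svm = round(pa["DT"] - pa["SVM"], 3)

print(cdn_gain)
print(dt_over_svm)
```

The gains over DT and SVM (0.067 and 0.113) and over NN and DS (0.045 and 0.079) match the ranges given in the text, as does the +0.046 difference between DT and SVM.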

6. Conclusion and Future Work

In this paper, we first analyze and quantify the inducing factors of trust and distrust and verify that these inducing factors contribute to predicting trust and distrust relations. Then, we devise the evidence prototype based on these inducing factors, which improves the reliability of evidence features and simplifies the establishment of the Basic Belief Assignment in Dempster-Shafer theory. Finally, we propose a framework for predicting trust and distrust that combines Dempster-Shafer theory and neural network: a multilayer neural network models the implementing process of the Dempster-Shafer theory of evidence in different hidden layers, aiming to overcome the lack of an optimization method in Dempster-Shafer theory. Experimental results on a real-world dataset demonstrate the effectiveness of the proposed framework.

Several directions need further investigation. First, we will apply our framework to other social media applications, such as tie strength prediction, sign prediction, and recommender systems. Second, we plan to explore new models and algorithms for predicting trust and distrust, such as sparse learning and deep learning.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This work is supported by the National Natural Science Foundation of China under Grant no. 61300148, the Science and Technology Development Program of Jilin Province of China under Grant no. 20130522112JH, Guangxi Key Laboratory of Trusted Software no. kx201533, Scientific and Technological Planning Project of Jilin Province (20140414064GH), Scientific Research Fund of Jilin Provincial Education (20140307), and Jilin Province Development and Reform Commission Projects (2013C040).

References

[1] Dubois, Golbeck, Srinivasan, "Predicting trust and distrust in social networks," in Proceedings of the IEEE 3rd International Conference on Social Computing (SocialCom '11), pp. 418–424, Boston, Mass, USA, October 2011.

[2] Leskovec, Huttenlocher D. P., Kleinberg J. M., "Predicting positive and negative links in online social networks," in Proceedings of the 19th International World Wide Web Conference (WWW '10), pp. 641–650, Raleigh, NC, USA, April 2010, doi:10.1145/1772690.1772756.

[3] Shafer, "A mathematical theory of evidence," Technometrics, vol. 20, no. 1, pp. 1–242, 1978.

[4] Guha, Kumar, Raghavan, Tomkins, "Propagation of trust and distrust," in Proceedings of the 13th International Conference on World Wide Web (WWW '04), pp. 403–412, New York, NY, USA, May 2004, doi:10.1145/988672.988727.

[5] Kim Y. A., Song H. S., "Strategies for predicting local trust based on trust propagation in social networks," Knowledge-Based Systems, vol. 24, no. 8, pp. 1360–1371, 2011, doi:10.1016/j.knosys.2011.06.009.

[6] H.-K., Kim J.-W., Kim S.-W., Lee, "A probability-based trust prediction model using trust-message passing," in Proceedings of the 22nd International Conference on World Wide Web (WWW '13), pp. 161–162, Rio de Janeiro, Brazil, May 2013.

[7] Zolfaghar, Aghaie, "A syntactical approach for interpersonal trust prediction in social web applications: combining contextual and structural data," Knowledge-Based Systems, vol. 26, pp. 93–102, 2012, doi:10.1016/j.knosys.2010.10.007.

[8] Lim E.-P., Nguyen V.-A., Sun, Liu, "Trust relationship prediction using online product review data," in Proceedings of the 1st ACM International Workshop on Complex Networks Meet Information & Knowledge Management (CNIKM '09), pp. 47–54, ACM, Hong Kong, November 2009, doi:10.1145/1651274.1651284.

[9] Liu, Lim E.-P., Lauw H. W., M.-T., Sun, Srivastava, Kim Y. A., "Predicting trusts among users of online communities: an epinions case study," in Proceedings of the 9th ACM Conference on Electronic Commerce (EC '08), pp. 310–319, ACM, Chicago, Ill, USA, June 2008, doi:10.1145/1386790.1386838.

[10] Tang, Chang, Aggarwal, Liu, "Negative link prediction in social media," in Proceedings of the 8th ACM International Conference on Web Search and Data Mining (WSDM '15), pp. 87–96, Shanghai, China, February 2015, doi:10.1145/2684822.2685295.

[11] Leskovec, Huttenlocher D. P., Kleinberg J. M., "Signed networks in social media," in Proceedings of the 28th International Conference on Human Factors in Computing Systems (CHI '10), pp. 1361–1370, Atlanta, Ga, USA, April 2010, doi:10.1145/1753326.1753532.

[12] Yang S.-H., Smola A. J., Long, Zha, Chang, "Friend or frenemy?: predicting signed ties in social networks," in Proceedings of the 35th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '12), pp. 555–564, ACM, Portland, Ore, USA, August 2012, doi:10.1145/2348283.2348359.

[13] Hsieh C.-J., Chiang K.-Y., Dhillon I. S., "Low rank modeling of signed networks," in Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '12), pp. 507–515, ACM, Beijing, China, August 2012, doi:10.1145/2339530.2339612.

[14] Agrawal, Garg V. K., Narayanam, "Link label prediction in signed social networks," in Proceedings of the 23rd International Joint Conference on Artificial Intelligence (IJCAI '13), pp. 2591–2597, Beijing, China, August 2013.

[15] Xiao, Yang, Pang, Dang, "The prediction for listed companies' financial distress by using multiple prediction methods with rough set and Dempster-Shafer evidence theory," Knowledge-Based Systems, vol. 26, pp. 196–206, 2012, doi:10.1016/j.knosys.2011.08.001.

[16] Dymova, Sevastjanov, "An interpretation of intuitionistic fuzzy sets in terms of evidence theory: decision making aspect," Knowledge-Based Systems, vol. 23, no. 8, pp. 772–782, 2010, doi:10.1016/j.knosys.2010.04.014.

[17] Sikder I. U., Gangopadhyay, "Managing uncertainty in location services using rough set and evidence theory," Expert Systems with Applications, vol. 32, no. 2, pp. 386–396, 2007, doi:10.1016/j.eswa.2005.12.015.

[18] Yen, "Generalizing the Dempster-Shafer theory to fuzzy sets," IEEE Transactions on Systems, Man, and Cybernetics, vol. 20, no. 3, pp. 559–570, 1990, doi:10.1109/21.57269.

[19] Basir O. A., Karray, Zhu, "Connectionist-based Dempster-Shafer evidential reasoning for data fusion," IEEE Transactions on Neural Networks, vol. 16, no. 6, pp. 1513–1530, 2005, doi:10.1109/tnn.2005.853337.

[20] Denoeux, "A neural network classifier based on Dempster-Shafer theory," IEEE Transactions on Systems, Man, and Cybernetics Part A: Systems and Humans, vol. 30, no. 2, pp. 131–150, 2000, doi:10.1109/3468.833094.

[21] Denoeux, "A k-nearest neighbor classification rule based on Dempster-Shafer theory," IEEE Transactions on Systems, Man and Cybernetics, vol. 25, no. 5, pp. 804–813, 1995, doi:10.1109/21.376493.

[22] McPherson, Smith-Lovin, Cook J. M., "Birds of a feather: homophily in social networks," Annual Review of Sociology, vol. 27, pp. 415–444, 2001, doi:10.1146/annurev.soc.27.1.415.

[23] Tang, Gao, Liu, "Exploiting homophily effect for trust prediction," in Proceedings of the 6th ACM International Conference on Web Search and Data Mining (WSDM '13), pp. 53–62, ACM, Rome, Italy, February 2013, doi:10.1145/2433396.2433405.

[24] Salton, McGill M. J., Introduction to Modern Information Retrieval, McGraw-Hill Computer Series, McGraw-Hill, New York, NY, USA, 1983.

[25] Cruz F. L., Vallejo C. G., Enríquez, Troyano J. A., "PolarityRank: finding an equilibrium between followers and contraries in a network," Information Processing and Management, vol. 48, no. 2, pp. 271–282, 2012, doi:10.1016/j.ipm.2011.08.003.

[26] Kim Y. A., Ahmad M. A., "Trust, distrust and lack of confidence of users in online social media-sharing communities," Knowledge-Based Systems, vol. 37, pp. 438–450, 2013, doi:10.1016/j.knosys.2012.09.002.