A Hybrid Recommender System Based on Multiple Interests Using a Gaussian Mixture Model and Enhanced Social Matrix Factorization

Recommender systems have become increasingly significant in the age of rapidly developing information technology and pervasive computing, as they provide e-commerce users with appropriate items. In recent years, various model-based and neighbor-based approaches have been proposed, improving recommendation accuracy to some extent. However, these approaches are less accurate than expected when users' ratings on items are very sparse compared with the huge number of users and items in the user-item rating matrix. Data sparsity and high dimensionality in recommender systems have negatively affected recommendation performance. To solve these problems, we propose a hybrid recommendation approach and framework using a Gaussian mixture model and matrix factorization technology. Specifically, an improved cosine similarity formula is first used to find users' neighbors, and initial ratings on unrated items are predicted. Second, users' ratings on items are converted into users' preferences on items' attributes to reduce the problem of data sparsity. Next, the obtained user-item-attribute preference data is trained through the Gaussian mixture model to classify users with the same interests into the same group. Finally, an enhanced social matrix factorization method fusing users' and items' social relationships is proposed to predict the remaining unseen ratings. Extensive experiments on two real-world datasets are conducted, and the results are compared with existing major recommendation models. Experimental results demonstrate that the proposed method achieves better accuracy than the other techniques.


Introduction
Collaborative filtering (CF) is one of the most mature and widely used recommendation algorithms, and it can be divided into model-based methods and memory-based methods [1,2]. The former uses machine learning methods to model users' preferences from training data and predicts unknown ratings through the trained model. The latter is a neighbor-based approach, which calculates the similarities between users/items to find neighbor users with interests similar to an active user, or neighbor items with characteristics similar to an active item, and then predicts and recommends items of interest to the active user.
According to recommendation strategies, recommendation algorithms can be divided into CF, content-based filtering, knowledge-based filtering, association rule-based and graph-based recommendation algorithms, etc. [3,4]. Due to the shortcomings of single algorithms in recommendation performance, such as the cold-start problem of CF and the lack of diversity of recommended resources in content-based recommender systems (RS) [5,6], researchers have begun to merge multiple algorithms to improve recommendation performance. Hernando et al. [7] introduce a reliability measure into CF-based prediction to improve the prediction information provided to users. Chen et al. [8] propose a hybrid model, which combines a Gaussian mixture model with an item-based CF recommendation algorithm and predicts users' ratings on items to improve recommendation accuracy. Luo et al. [9] propose a hierarchical Bayesian model-based CF and a related inference algorithm, which reduces prediction errors. Hofmann et al. [10] propose a latent semantic model for CF, a user-centric technique that introduces probabilistic latent semantic analysis (pLSA) into the user modeling process and achieves higher prediction accuracy.
In recent years, some researchers have introduced context information into RS to improve the accuracy of recommendation algorithms. Massa et al. [11] propose a trust-aware collaborative filtering algorithm, which increases the coverage of recommender systems while preserving the quality of predictions. Azadjalal et al. [12] employ trust relationships in recommendation algorithms and propose a method to identify implicit trust statements by applying a specific reliability measure; the proposed algorithm outperforms traditional recommenders in accuracy and coverage. Gao et al. [4] propose a method to obtain context-aware preferences based on cognitive behaviors, which elicits user preferences in multidimensional context environments by establishing a mutual-effect model of cognitive factors, achieving better prediction accuracy.
The above methods improve recommendation accuracy to some extent. However, the user-item rating matrix, namely, one of the inputs to the recommendation algorithm for large amounts of data, is often highly sparse, which leads to unreliable predictions [13-17]. The main reasons are as follows: (1) excessive sparsity leads to a lack of commonly rated items, so neighbors cannot be selected to make predictions; (2) with too few neighbors and commonly rated items, similarity is not accurate, so the quality of recommendation is difficult to guarantee; (3) many of the proposed recommendation algorithms cannot predict ratings or make recommendations for all items; (4) matrix factorization algorithms can predict ratings for all items, but under data sparsity their accuracy does not reach the desired level.
To overcome the above problems, a hybrid recommendation method that combines the Gaussian mixture model with an enhanced social matrix factorization algorithm is proposed in this paper. To reduce sparsity and predict unknown ratings more accurately and completely, we first fill in the rating matrix from the perspectives of users, items, and users' interests so that accurate recommendations can be made for users. Specifically, CF and the Gaussian mixture model are used to fill in the user-item rating matrix, and then the matrix factorization technique is used to predict all remaining unknown ratings. Firstly, we make a preliminary prediction using the proposed improved similarity method to reduce data sparsity. Secondly, in order to classify users with the same interests into the same group, users' ratings on items are converted into users' preferences on items' attributes. Based on the assumption that each user has multiple interests, users are divided into different groups according to their interests by using the Gaussian mixture model. Then some of the unknown ratings are predicted using the probability that each user belongs to each group. Finally, all remaining unrated items are predicted using the proposed enhanced social matrix factorization.
The remainder of this paper is organized as follows: related studies are reviewed in Section 2. A hybrid recommendation model and framework are proposed in Section 3. In Section 4, several experiments are conducted and evaluated. In Section 5, we draw our conclusions.

Related Work
CF-based RS recommends items to an active user based on the opinions of his/her like-minded neighbors [24]. According to the modeling methods and recommendation strategies of CF methods, they can be classified into two categories: neighbor-based and model-based methods. For neighbor-based CF, all the ratings provided by users are kept in memory and used for prediction; to calculate the similarity between users/items, all previously rated items are considered. For model-based CF, a model is trained on training data and then used to predict unknown ratings for real data.

Neighbor-Based Collaborative Filtering.
The neighbor-based CF algorithm can be divided into two subcategories: user-based CF and item-based CF [25,26]. The execution process of the neighbor-based CF recommendation algorithm can be divided into the following three steps [27,28]: (a) calculate the similarity between an active user (or item) and other users (or items) through users' ratings on items; (b) select the nearest neighbors for the active user (or item) according to the obtained similarity; (c) predict the active user's ratings on candidate items according to the historical preference information of the nearest neighbors, so as to produce recommendation results. The rationales of user-based and item-based CF algorithms are shown in Figure 1.

User-Based CF Recommendation Algorithm.
In general, the users and items are denoted as the vectors u = {u_1, u_2, ..., u_N} and i = {i_1, i_2, ..., i_M}, respectively. Users' ratings on items are usually expressed as a rating matrix, and a user's ratings can be denoted as a rating vector R_u = {R_u1, R_u2, ..., R_uM}. For instance, the rating on item i_3 from user u_1 is 4 in Figure 1, which is denoted as R_13 = 4. The similarity between two users is obtained by comparing their rating vectors. In RS, one of the most frequently used measures to calculate the similarity between users is cosine similarity.
Cosine similarity: the ratings on items from user u and user v are described as n-dimensional vectors, and the similarity between user u and user v is obtained by comparing the angle between the two rating vectors. The smaller the angle is, the higher the similarity value is. The cosine similarity measure is calculated as follows [24] (see (1)):

sim_{uv} = \frac{\sum_{i} R_{ui} R_{vi}}{\sqrt{\sum_{i} R_{ui}^2} \sqrt{\sum_{i} R_{vi}^2}}   (1)

According to [2,24,28], the predicted rating on item i for the active user u using the mean-centered weighted average is described as follows [2] (see (2)):

\hat{R}_{ui} = \bar{R}_u + \frac{\sum_{v \in N_u} sim_{uv} (R_{vi} - \bar{R}_v)}{\sum_{v \in N_u} |sim_{uv}|}   (2)

where \hat{R}_{ui} represents the predicted rating on item i from user u, N_u is the set of users most similar to the active user u, \bar{R}_u and \bar{R}_v represent the average ratings on all rated items from users u and v, respectively, and sim_{uv} indicates the similarity between the active user u and neighbor user v according to (1).
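To make (1) and (2) concrete, the following sketch computes both on a small made-up rating matrix (0 marks an unrated item; the data is invented for illustration, not taken from the paper):

```python
import numpy as np

def cosine_sim(ru, rv):
    """Cosine similarity between two users' rating vectors, Eq. (1)."""
    den = np.linalg.norm(ru) * np.linalg.norm(rv)
    return float(ru @ rv) / den if den else 0.0

def predict_user_based(R, u, i, neighbors):
    """Mean-centered weighted average over neighbors, Eq. (2).
    R holds ratings with 0 = unrated; neighbors plays the role of N_u."""
    r_u_mean = R[u][R[u] > 0].mean()
    num = den = 0.0
    for v in neighbors:
        if R[v, i] > 0:                      # neighbor must have rated item i
            s = cosine_sim(R[u], R[v])
            r_v_mean = R[v][R[v] > 0].mean()
            num += s * (R[v, i] - r_v_mean)
            den += abs(s)
    return (r_u_mean + num / den) if den else r_u_mean

R = np.array([[5, 3, 4, 0],
              [4, 3, 5, 3],
              [1, 5, 1, 4]], dtype=float)
pred = predict_user_based(R, u=0, i=3, neighbors=[1, 2])
```

Here `pred` fills the missing rating R_14 from user u_1's two neighbors; the mean-centering keeps the prediction near the active user's own average.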

Item-Based CF Recommendation Algorithm.
Similar to the user-based CF recommendation algorithm, the item-based CF recommendation algorithm predicts unknown ratings by using the active user's existing ratings on neighbor items.
The Pearson correlation coefficient (PCC) is one of the most frequently used measures to calculate the similarity between items, which is described as follows [14] (see (3)):

sim_{ij} = \frac{\sum_{u \in C_{ij}} (R_{ui} - \bar{R}_i)(R_{uj} - \bar{R}_j)}{\sqrt{\sum_{u \in C_{ij}} (R_{ui} - \bar{R}_i)^2} \sqrt{\sum_{u \in C_{ij}} (R_{uj} - \bar{R}_j)^2}}   (3)

where sim_{ij} denotes the similarity between items i and j, R_{ui} and R_{uj} represent user u's ratings on items i and j, respectively, and \bar{R}_i and \bar{R}_j represent the average ratings on items i and j over C_{ij}, respectively. C_{ij} denotes the set of users who rated both items i and j. Ratings are predicted as follows [2] (see (4)):

\hat{R}_{ui} = \frac{\sum_{j \in N_i} sim_{ij} R_{uj}}{\sum_{j \in N_i} |sim_{ij}|}   (4)

where N_i is the set of items most similar to item i.

Model-Based CF Recommendation Technology.
A recommendation algorithm should scale gracefully with increasing data. Conventional neighbor-based CF methods consider all previously rated items when computing the similarity between users/items, which is time-consuming, so these methods fail to achieve good scalability [18]. In order to reduce computational complexity and improve recommendation efficiency without reducing accuracy, clustering-based and classification-based techniques have been introduced into RS, yielding a variety of model-based recommendation algorithms, such as clustering models [29-33], singular value decomposition- (SVD-) based models [15,18], and probabilistic matrix factorization- (PMF-) based models [34-37]. Model-based methods first train a model on training data to find patterns and then make predictions for real data [38,39].

Gaussian Mixture Model.
Clustering analysis is an unsupervised pattern-recognition approach based on grouping similar objects together. In RS, similar users/items are classified into the same category by clustering techniques, and then only the ratings of neighbor users/items in the same category are used to predict unrated items, which greatly reduces the computational effort. Generally, clustering is divided into hard and soft clustering. K-means is an important and well-known hard clustering technique. In hard clustering, each object belongs to exactly one category, and there is no uncertainty in the category membership of an object [29,30]. In soft clustering, each object belongs to two or more categories with different degrees of membership, rather than fully belonging to one category [31]. The Gaussian mixture model (GMM) is a well-known soft clustering technique. GMM is used in RS to more accurately discover a user's multiple interests and degree of preference for an item, so as to better recommend items of interest to the user [8]. GMM assumes that the data obeys a mixture of Gaussian distributions [40]; in other words, the data can be thought of as being generated from several Gaussian distributions. Each GMM consists of k Gaussian distributions. Each distribution is called a "component", and these components are added together linearly to form the probability density function of the GMM.
For a random vector x in an n-dimensional sample space, if x obeys a Gaussian distribution, the probability density function [34] is as follows (see (5)):

p(x) = \frac{1}{(2\pi)^{n/2} |\Sigma|^{1/2}} \exp\left(-\frac{1}{2}(x - \mu)^T \Sigma^{-1} (x - \mu)\right)   (5)

where \mu is an n-dimensional mean vector and \Sigma is an n × n covariance matrix. It can be seen that the Gaussian distribution is completely determined by the two parameters \mu and \Sigma.
The Gaussian mixture distribution consists of k mixing components, each of which corresponds to a Gaussian distribution [37]. The mixture distribution is defined as follows (see (6)):

p_M(x) = \sum_{i=1}^{k} \pi_i \, p(x \mid \mu_i, \Sigma_i)   (6)

where \mu_i and \Sigma_i are the parameters of the ith Gaussian component and \pi_i is the corresponding mixing coefficient, with \sum_{i=1}^{k} \pi_i = 1 and \pi_i > 0.
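Densities (5) and (6) can be transcribed directly; the 1-D mixture parameters below are illustrative only:

```python
import numpy as np

def gaussian_pdf(x, mu, cov):
    """n-dimensional Gaussian density, Eq. (5)."""
    n = len(mu)
    diff = x - mu
    norm = 1.0 / np.sqrt((2.0 * np.pi) ** n * np.linalg.det(cov))
    return float(norm * np.exp(-0.5 * diff @ np.linalg.inv(cov) @ diff))

def mixture_pdf(x, weights, mus, covs):
    """Gaussian mixture density, Eq. (6): sum_i pi_i * N(x | mu_i, Sigma_i)."""
    return sum(w * gaussian_pdf(x, m, c) for w, m, c in zip(weights, mus, covs))

# A 1-D mixture with two components; the weights sum to 1, as Eq. (6) requires.
weights = [0.3, 0.7]
mus = [np.array([0.0]), np.array([4.0])]
covs = [np.array([[1.0]]), np.array([[2.0]])]
density_at_zero = mixture_pdf(np.array([0.0]), weights, mus, covs)
```

The linear combination is exactly the "components added together" described above: each point's density is a weighted sum of the component densities.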
Latent Factor Model.
The essence of matrix factorization is decomposing a very sparse rating matrix into two matrices: one represents the characteristics of the users and the other represents the features of the items [2]. The inner product of each row of the first matrix and each column of the second yields the corresponding rating. A user-item rating matrix denoted by R_{N×M} can be decomposed into the product of two matrices U_{N×f} and V_{f×M}, which is expressed as follows (see (7)):

R_{N \times M} \approx U_{N \times f} \times V_{f \times M}   (7)

where U_{N×f} represents the relationships between N users and f topics, V_{f×M} represents the relationships between f topics and M items, and f indicates the dimension of the latent factors.
In RS, the recommendation algorithm based on matrix factorization is implemented in two steps: (1) The original matrix is decomposed into two low-rank matrices using the known ratings in (7). The decomposition of the original matrix is the model training process, and the model parameters are usually obtained by applying stochastic gradient descent to an established loss function.
(2) Predict the unknown ratings by using the inner product of the obtained low-rank matrices U and V, i.e., \hat{R}_{N \times M} = U_{N \times f} \times V_{f \times M}.
Therefore, the key task of a matrix factorization recommendation algorithm is to solve for U_{N×f} and V_{f×M}, which can be transformed into a regression problem. A loss function L, representing the sum of squared errors between the original ratings and the predicted ratings, is defined as follows [1,2,21] (see (8)):

L = \sum_{(i,j)} e_{ij}^2 = \sum_{(i,j)} \left( R_{ij} - \sum_{k=1}^{f} U_{ik} V_{kj} \right)^2   (8)

where e_{ij} = R_{ij} - \hat{R}_{ij}, U_{ik} represents the kth feature of user i, and V_{kj} represents the kth feature of item j. The loss function after adding the regularization terms is defined as follows (see (9)):

L = \sum_{(i,j)} e_{ij}^2 + \lambda_U \|U\|^2 + \lambda_V \|V\|^2   (9)

In general, the two regularization coefficients \lambda_U and \lambda_V are set to be the same for ease of calculation, i.e., \lambda = \lambda_U = \lambda_V. Then the stochastic gradient descent method is used to solve (9) (see (10)-(11)):

\frac{\partial L}{\partial U_{ik}} = -2 e_{ij} V_{kj} + 2 \lambda U_{ik}   (10)

\frac{\partial L}{\partial V_{kj}} = -2 e_{ij} U_{ik} + 2 \lambda V_{kj}   (11)

The variables U_{ik} and V_{kj} are then updated along the negative gradient as follows (see (12)-(13)):

U_{ik} \leftarrow U_{ik} + \eta \left( 2 e_{ij} V_{kj} - 2 \lambda U_{ik} \right)   (12)

V_{kj} \leftarrow V_{kj} + \eta \left( 2 e_{ij} U_{ik} - 2 \lambda V_{kj} \right)   (13)

where \eta denotes the learning rate, \lambda is the regularization parameter, and e_{ij} represents the difference between the real rating and the predicted rating.
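The training loop defined by (8)-(13) fits in a few lines; the toy matrix and hyperparameters below are chosen only for illustration:

```python
import numpy as np

def mf_sgd(R, f=2, eta=0.01, lam=0.02, epochs=2000, seed=0):
    """Factorize R into U (N x f) and V (f x M) by stochastic gradient
    descent over the observed (nonzero) entries, per Eqs. (12)-(13)."""
    rng = np.random.default_rng(seed)
    N, M = R.shape
    U = 0.1 * rng.standard_normal((N, f))
    V = 0.1 * rng.standard_normal((f, M))
    observed = [(i, j) for i in range(N) for j in range(M) if R[i, j] > 0]
    for _ in range(epochs):
        for i, j in observed:
            e = R[i, j] - U[i] @ V[:, j]              # e_ij = R_ij - Rhat_ij
            u_old = U[i].copy()                        # use old U in V's update
            U[i]    += eta * (2 * e * V[:, j] - 2 * lam * U[i])      # Eq. (12)
            V[:, j] += eta * (2 * e * u_old   - 2 * lam * V[:, j])   # Eq. (13)
    return U, V

R = np.array([[5, 3, 0, 1],
              [4, 0, 0, 1],
              [1, 1, 0, 5],
              [1, 0, 0, 4]], dtype=float)
U, V = mf_sgd(R)
R_hat = U @ V      # step (2): inner products give all predicted ratings
```

After training, `R_hat` reproduces the observed entries closely while also filling in the zeros, which is exactly the prediction step described above.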
Singular Value Decomposition.
SVD is a powerful computational tool in latent semantic analysis, which reduces the number of relevant variables and finds a good approximation of R [15]. The purpose of SVD in RS is to compress very high-dimensional data into a low-dimensional space while preserving the major information content. A matrix R_{N×M} with rank r ≤ min(N, M) can be decomposed as follows [2,15] (see (14)):

R = P \Sigma Q^T   (14)

where P (N × r) and Q (M × r) represent users' features and items' features, namely, the left-singular and right-singular vectors, respectively; their columns are the eigenvectors of R R^T and R^T R. \Sigma is an r × r diagonal matrix, which represents the degree of association between P and Q. The diagonal values, namely, the singular values, are arranged from largest to smallest, and each is the square root of an eigenvalue of R R^T (or R^T R). Moreover, the eigenvectors of R R^T and R^T R, i.e., the columns of P and Q, are arranged according to these eigenvalues.
To benefit from dimensionality reduction, we set f < r, keep only the f largest singular values in \Sigma, and replace the others by zero. Therefore, R is approximated by \tilde{R} as follows (see (15)):

\tilde{R} = P_f \Sigma_f Q_f^T   (15)

where \Sigma_f is the f-rank approximation of \Sigma. The matrices P_{N×f} and Q_{M×f} contain the eigenvectors of R R^T and R^T R corresponding to the f largest eigenvalues, whereas the diagonal matrix \Sigma_{f×f} contains the nonnegative square roots of the f largest eigenvalues of either matrix along its diagonal [20,46]. To find the positions of users and items, we can map the raw data into the f-dimensional space as follows [17] (see (16)-(17)):

P_{Trans} = R \, Q_f \, \Sigma_f^{-1}   (16)

Q_{Trans} = R^T \, P_f \, \Sigma_f^{-1}   (17)

where P_{Trans} and Q_{Trans} are the new positions of users and items in the f-dimensional space, respectively. For instance, the matrix in Figure 1(a), denoted as A, can be decomposed into P, \Sigma, and Q. An approximation of A can be obtained by taking the first two dimensions, i.e., the first two columns of P and Q together with \Sigma' = diag(11.65, 5.767). P' and Q' are projected into the 2-dimensional space and plotted in Figure 2.
When a new user p_new with a rating vector of [5, 3, 4, 0, 1, 2] arrives, the following calculation is performed to find the position of the new user in the 2-dimensional space:

p_{new}^{Trans} = p_{new} \, Q_f \, \Sigma_f^{-1}

As can be seen in Figure 2, user p_1 is close to the new user. Therefore, p_1 is considered the nearest neighbor of the new user, and the similarity in the approximate low-dimensional space can be calculated to make a recommendation.
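The truncation in (15) and the fold-in of a new user can be reproduced with NumPy. The rating matrix here is a made-up stand-in for Figure 1(a), so its singular values differ from the 11.65 and 5.767 quoted above:

```python
import numpy as np

R = np.array([[5, 3, 4, 4, 1, 2],
              [4, 3, 5, 3, 1, 1],
              [1, 1, 2, 1, 5, 4],
              [2, 1, 1, 2, 4, 5]], dtype=float)

# Eq. (14): R = P Sigma Q^T; keep the f largest singular values (Eq. (15)).
P, s, Qt = np.linalg.svd(R, full_matrices=False)
f = 2
Pf, Sf, Qf = P[:, :f], np.diag(s[:f]), Qt[:f, :].T    # Qf is M x f

# Eq. (16): users' positions in the f-dimensional space (equals Pf exactly).
users_2d = R @ Qf @ np.linalg.inv(Sf)

# Fold a new user into the same space: p_new' = p_new * Q_f * Sigma_f^{-1}.
p_new = np.array([5, 3, 4, 0, 1, 2], dtype=float)
p_new_2d = p_new @ Qf @ np.linalg.inv(Sf)

def cos(a, b):
    return float(a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))

# The new user lands near the first two users, who share its taste profile.
sims = [cos(p_new_2d, users_2d[k]) for k in range(4)]
```

Because the fold-in uses the same Q_f and \Sigma_f^{-1} as (16), existing and new users are directly comparable in the reduced space.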
Classic matrix factorization methods [19,41] can provide very accurate predictions and have the advantages of high computational efficiency and scalability. However, classic matrix factorization algorithms make unreliable predictions when encountering new users/items and extremely sparse data. To solve these problems, social relationships between users have been introduced into RS, and trust-based CF (TCF) algorithms [23,37,45,49,50] have arisen in recent years. TCF algorithms integrate trust relationships between users into recommendation algorithms to improve the quality of recommendation [11,13,18,24,29,30]. These methods are also divided into neighbor-based TCF (NTCF) and model-based TCF (MTCF). NTCF is not scalable, so its computational cost is extremely high when faced with large amounts of data; therefore, NTCF is not suitable for a great number of users and items. MTCF incorporates social network information into the matrix factorization model and optimizes the parameters using the user-item ratings and individual trust among users. Thus it is scalable and also makes accurate recommendations [15,18,37,41,49,50]. Therefore, MTCF has been drawing considerable attention recently [19,51], and many current recommendation methods are based on the idea of MTCF.

The Proposed Hybrid Recommendation Method and Framework
In this section, an overview of the proposed hybrid recommendation method is given in Section 3.1. The improved CF recommendation method is described in Section 3.2, and predicting the initial ratings using the EM algorithm is discussed in Section 3.3. The calculation of a user's social status in trust networks is discussed and an enhanced social matrix factorization model is proposed in Section 3.4. Finally, the computational complexity of the hybrid method is analyzed in Section 3.5.

Overview.
In the case of sparse data, in order to improve the effect of user clustering and the accuracy of predicted ratings, we present a hybrid recommendation method based on the Gaussian mixture model and an enhanced social matrix factorization algorithm (GMMESMF), as shown in Figure 3. The prediction process is as follows: (1) The similarities of users and items are calculated using the improved user-based and item-based measures. Here, the user similarity calculation takes into account factors such as the trust relationships between users, the number of users' common interactions with the same item, and the users' rating times for the items, to reduce the influence of data sparsity on the performance of the CF algorithm. Then some unrated items are predicted using the user-based and item-based CF methods. (2) The users' preferences on items are converted into the users' preferences on items' attributes. Then the users are clustered using the Gaussian mixture model so that users in the same cluster have the same interests. This speeds up the search for nearest neighbors and increases the speed of recommendation generation. (3) We then predict some unrated items using the expectation maximization (EM) algorithm to calculate the probability that each user belongs to each cluster. (4) After filling the user-item rating matrix in the first two steps, we obtain a matrix that is less sparse and more reliable. The trust relationships between users and trust propagation are introduced into the matrix factorization model, and an enhanced social matrix factorization technique is proposed to learn more accurate user and item characteristics in each cluster, so that the prediction of the remaining unrated items is completed and prediction accuracy is greatly improved. The user-item rating matrix of each cluster is usually denser than the original user-item rating matrix when the number of users in each cluster is relatively small. (5) According to the results of the prediction, the ratings are sorted in descending order, and the candidate items are recommended to the active user.
Through the above step-by-step filling method, some unrated items can be filled in according to the user-based and item-based collaborative filtering algorithms using their neighbor relationships. If the unknown ratings were predicted directly by matrix factorization techniques, the predicted ratings would be inaccurate because too few rated items are available for reference. To obtain reliable ratings, we estimate the reliability of the filled-in ratings using the reliability measure and then predict all remaining unknown ratings using the matrix factorization algorithm.

Fill the Rating Matrix by the Improved Collaborative Filtering Recommendation Algorithm.
The CF algorithm collects users' preferences from the user-item rating matrix and produces recommendations based only on the opinions of users whose interests are similar to the current active user. However, the rating matrix is usually very sparse; when there are few neighbor users highly related to the active user, the quality of the predicted ratings is seriously affected, resulting in poor recommendations [28]. Nevertheless, CF is one of the most successful and effective recommendation algorithms. Therefore, we use the CF algorithm to predict some unrated items to reduce the sparsity of the rating matrix.
Cosine similarity and PCC are common methods of calculating the similarity between two users and between two items, respectively, and they directly affect prediction accuracy. When two rating vectors given by two users are in equal proportion, such as (4, 4, 4) and (5, 5, 5), or (1, 2, 1) and (2.5, 5, 2.5), as in Table 1, the similarities calculated by cosine and PCC will both be 1.0, although these rating pairs are quite different.
In order to avoid the above problem and improve the similarity calculation between users or items, we introduce the following definitions.

Definition 1 (Mean User Reference Center (MURC)). The mean user reference center is the average of all users' ratings over all items (see (18)):

MURC = \frac{\sum_{u \in C} \sum_{i \in O} R_{ui}}{Card(C) \cdot Card(O)}   (18)

where MURC represents the center point of user preference, C and O represent the sets of all users and all items, respectively, and Card(C) and Card(O) represent the cardinalities of C and O, respectively.

Definition 2 (Average Similarity Standard Deviation (ASSD)). The average similarity standard deviation is the standard deviation of users' average ratings around the mean user reference center. It reflects the average distribution of each user's average rating on all items (see (19)):

ASSD = \sqrt{\frac{\sum_{u \in C} (\bar{R}_u - MURC)^2}{Card(C)}}   (19)

where \bar{R}_u denotes the average rating on all items from user u.
Definition 3 (Basic Similarity Region (BSR)). The basic similarity region is the user preference similarity distribution range whose center is the mean user reference center and whose radius is the average similarity standard deviation. If sim_uv ∈ [MURC − ASSD, MURC + ASSD], then sim_uv ∈ BSR; otherwise, sim_uv ∉ BSR. sim_uv is calculated in the reduced-dimensionality coordinates given by (16).
Inspired by [4,14,26,27,53], we incorporate Salton factors and trust relationships into the cosine similarity calculation between users. The improved similarity calculation is as follows (see (21)):

sim'_{uv} = w_1 \cdot w_2 \cdot sim_{uv}   (21)

This determines whether the users' preference behaviors need to be revised. Here sim_{uv} represents the similarity between u and v after noisy data has been filtered out: when the similarity between u and v is not in BSR, it is considered noisy. The cosine similarity in (1) only considers the angle between two vectors and not their lengths, so it is revised according to (21).
Here, w_1 and w_2 represent the Salton and time attenuation factors, respectively. Specifically, w_1 = (|O_{uv}| + \varepsilon)/(|O_{\tilde{u}\tilde{v}}| + \varepsilon), where O_{uv} denotes the set of common items rated by both u and v, O_{\tilde{u}\tilde{v}} denotes the set of items rated by any other two users, and \varepsilon is a constant that prevents the denominator or w_1 from being zero. A difference between u's and v's rating times for the same item means that their interest changes are not synchronized; therefore, it is necessary to introduce the time attenuation factor w_2 to weight the similarity between u and v, i.e., to reduce the similarity when the rating times of the two users are far apart. w_2 = \frac{1}{|O_{uv}|} \sum_{i \in O_{uv}} \frac{1}{1 + \exp(\beta |t_{ui} - t_{vi}|)}, where \beta denotes the time attenuation parameter and t_{ui} and t_{vi} denote the times at which item i was rated by users u and v, respectively.
The improved similarity between user u and user v, fusing the trust relationship, is described as follows (see (22)): where T_uv denotes the trust between users u and v, which can be calculated as follows [11,30] (see (23)):

T_{uv} = \frac{d_{max} - d_{uv} + 1}{d_{max}}   (23)

where d_max denotes the maximum allowable propagation distance among users and d_uv denotes the trust propagation distance between users u and v. In this section, the dimensionality of the data is first reduced according to (16) and (17) before calculating the similarity. To ensure the accuracy of the predictions, a reliability measure from [7,24,55] is employed to evaluate quality by providing feedback on the quality of the predicted ratings. The reliability measure is calculated as follows [7,24] (see (24)): where R_ui is the reliability of a prediction \hat{R}_{ui}, K_ui is the number of neighbors of u who have rated item i, and D_ui is the variance of the ratings made by the neighbors of u on item i. The value range of R_ui is [0, 1]. The larger the value of R_ui, the higher the accuracy of the prediction, and vice versa. If the reliability value R_ui is less than a threshold θ (θ = 0.85 in this paper), the predicted rating \hat{R}_{ui} is set to 0.
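The three ingredients of the improved similarity can be sketched as follows. The paper's exact forms of (21)-(22) are not fully recoverable here, so the Salton normalization below (by the union of the two users' rated items) is an assumption, while the time-decay factor w_2 and the trust propagation T_uv in (23) follow the definitions above:

```python
import math

def salton_factor(items_u, items_v, eps=1.0):
    """w1: rewards many co-rated items. NOTE: the denominator here uses the
    union of rated items as an assumed stand-in for the paper's O_~uv."""
    o_uv = len(items_u & items_v)
    return (o_uv + eps) / (len(items_u | items_v) + eps)

def time_decay_factor(times_u, times_v, beta=1.0):
    """w2: down-weights user pairs whose ratings on common items are far
    apart in time; times_u maps item -> rating time t_ui."""
    common = set(times_u) & set(times_v)
    if not common:
        return 0.0
    return sum(1.0 / (1.0 + math.exp(beta * abs(times_u[i] - times_v[i])))
               for i in common) / len(common)

def propagated_trust(d_uv, d_max):
    """T_uv, Eq. (23): trust decays linearly with distance in the trust network."""
    return (d_max - d_uv + 1) / d_max if d_uv <= d_max else 0.0

w1 = salton_factor({1, 2, 3}, {2, 3, 4})            # two co-rated items
w2 = time_decay_factor({2: 10, 3: 10}, {2: 10, 3: 40})
t_direct = propagated_trust(1, 4)                   # directly trusted user
t_far = propagated_trust(4, 4)                      # user at the horizon
```

Note that w_2 peaks at 0.5 (synchronized rating times), so it acts purely as a dampening weight, and T_uv falls from 1 at distance 1 to 1/d_max at the propagation horizon.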

Convert Users' Ratings on Items into Users' Preferences on Items' Attributes.
In the case of a very sparse user-item rating matrix, in order to form groups of users with the same interests, users' ratings on items are converted into users' preferences on items' attributes; the attributes are then clustered to obtain the users' interest groups using the Gaussian mixture model, and finally unknown ratings are predicted using the neighbor-based method.

Definition 4. A user's preferences on an item's attributes reflect the user's interest in the item. The relation matrix between the users and the attributes of the items can be expressed as the product of the user-item rating matrix and the item-attribute matrix (see (25)):

R^{(ua)} = R^{(ui)} \cdot R^{(ia)}   (25)
where R^{(ua)}, R^{(ui)}, and R^{(ia)} denote the user-attribute preference matrix, the user-item rating matrix, and the item-attribute matrix, respectively. Here ⋅ denotes the product of the two matrices, but it is not simply a standard matrix multiplication.
If a user likes an item, it is because the user is interested in the attributes of the item. Under this assumption, we can obtain users' preferences from the ratings in the user-item rating matrix and the attribute information in the item-attribute matrix. Specifically, the preference of user u for attribute a of item i can be described as follows (see (26)):

p_{ua} = R_{ui} \cdot A_{ia}   (26)

where p_{ua} denotes the preference of user u for attribute a, R_{ui} denotes the user's rating on item i, and A_{ia} indicates whether item i has attribute a. Here, A_{ia} is defined as follows (see (27)):

A_{ia} = 1 if item i has attribute a, and A_{ia} = 0 otherwise.   (27)
The transition from user-item ratings to user-attribute preferences is shown in Figure 4. For instance, the rating R_11 on i_1 from u_1 is 3, and item i_1 has the attributes a_2, a_3, and a_6. Thus, the preference values for a_2, a_3, and a_6 from u_1 are all 3.
In general, the more ratings a user has given that involve an item attribute, the more reliable his/her evaluation of that attribute is; the lower the preference value, the less reliable the user's attribute rating, and such ratings can be removed. Considering the reliability of the user's preferences on items' attributes, the preference values are averaged over the user's rated items as follows (see (28)):

p'_{ua} = \frac{1}{|O_u^a|} \sum_{k \in O_u^a} p^{(k)}_{ua}   (28)

where p'_{ua} denotes the reliable preference of user u for attribute a, O_u denotes the set of items rated by user u, O_u^a ⊆ O_u denotes the rated items that carry attribute a, and p^{(k)}_{ua} denotes the preference value for attribute a derived from user u's rating on the kth item. When the preference values of items' attributes are obtained, the very sparse matrix is mapped to a low-dimensional space, so the sparsity is also alleviated.
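The conversion in (26)-(28) vectorizes naturally; the matrices below mirror the idea of Figure 4, but the specific numbers are invented for the sketch:

```python
import numpy as np

# User-item ratings (0 = unrated) and binary item-attribute matrix A, Eq. (27):
# item i1 has attributes a2, a3, a6; i2 has a1, a4; i3 has a3, a5.
R = np.array([[3, 0, 5],
              [0, 4, 0]], dtype=float)
A = np.array([[0, 1, 1, 0, 0, 1],
              [1, 0, 0, 1, 0, 0],
              [0, 0, 1, 0, 1, 0]], dtype=float)

# Eq. (26): each rating flows to the attributes of the rated item.
# Eq. (28): average over the user's rated items that carry the attribute.
counts = (R > 0).astype(float) @ A    # rated items carrying each attribute
sums = R @ A                          # summed ratings per attribute
P = np.divide(sums, counts, out=np.zeros_like(sums), where=counts > 0)
```

For the first user, attribute a_3 is touched by both rated items (ratings 3 and 5), so its averaged preference is 4, while attributes touched by a single rated item inherit that rating directly.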

Predict Partial Ratings Using the Gaussian Mixture Model.
According to [8,10,56], whether a user is interested in an item depends on the nature of the item. For example, whether a user likes a movie may depend on three factors: whether it is an entertainment or a literary film, whether it is a foreign-language or a Chinese-language film, and whether the actors are famous. This article uses the Gaussian mixture model proposed by Hofmann as the basis of clustering and supposes that the conditional probability of the rating on an item's attribute a by group z obeys a Gaussian distribution [2,40] (see (29)):

p(r \mid u, a) = \sum_{k=1}^{K} p(z_k \mid u) \, p(r \mid \mu_{a,z_k}, \Sigma_{a,z_k})   (29)
where p(r | u, a) denotes the conditional probability of rating r given user u and item attribute a, and z_k is a latent group that is not observed. p(z_k | u) denotes the probability that user u belongs to group z_k, with \sum_{k=1}^{K} p(z_k \mid u) = 1. p(r | \mu_{a,z_k}, \Sigma_{a,z_k}) indicates that the ratings of the users in group z_k obey a Gaussian distribution; it represents the conditional probability of the ratings on attribute a from group z_k.
According to the Bayesian formula, the joint probability density can be obtained as follows (see (30)):

p(u, a, r) = p(u) \, p(a) \, p(r \mid u, a)   (30)

where p(u) and p(a) are both constants. Therefore, the joint probability density function can be written in the following form (see (31)):

p(u, a, r \mid \theta) \propto \sum_{k=1}^{K} p(z_k \mid u) \, p(r \mid \mu_{a,z_k}, \Sigma_{a,z_k})   (31)

where \theta represents all of the model parameters. The log-likelihood function is described as follows (see (32)):

L(\theta) = \sum_{(u,a,r)} \log \left( \sum_{k=1}^{K} p(z_k \mid u) \, p(r \mid \mu_{a,z_k}, \Sigma_{a,z_k}) \right)   (32)

Then we optimize this function with respect to the parameters \theta.
To obtain the values of each parameter (p(z_k | u), μ_{a,z}, Σ_{a,z}), we use the EM algorithm, alternately executing the E-step and the M-step as follows [10,56] (see (33)-(36)).

E-Step: compute the posterior probability p(z_k | u, a, r) for each observed rating (see (33)).

M-Step: re-estimate the parameters; for example, the group-attribute mean is updated as μ_{a,z} = (∑_{(u,a',r): a'=a} r · p(z_k | u, a, r)) / (∑_{(u,a',r): a'=a} p(z_k | u, a, r)) (see (34)-(36)). The E- and M-steps are executed alternately until the change in the log-likelihood, L(Θ^{t+1}) − L(Θ^{t}), is less than a given threshold ε; the iteration then ends and the model parameters are obtained.
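The E/M alternation above can be sketched for a one-dimensional mixture as follows; this is an illustrative simplification of (33)-(36) (the paper works with per-attribute preference vectors and covariance matrices), and the deterministic spread-out initialization is an assumption.

```python
import math

# Compact EM loop for a 1-D Gaussian mixture, sketching the E-step (33)
# and M-step (34)-(36) alternation.
def em_gmm(xs, k=2, iters=50):
    srt = sorted(xs)
    mu = [srt[i * (len(xs) - 1) // (k - 1)] for i in range(k)]  # spread initial means
    var = [1.0] * k                      # component variances
    pi = [1.0 / k] * k                   # mixing weights p(z_k)
    for _ in range(iters):
        # E-step: responsibility p(z_k | x) for every observation
        resp = []
        for x in xs:
            w = [pi[j] * math.exp(-(x - mu[j]) ** 2 / (2 * var[j]))
                 / math.sqrt(2 * math.pi * var[j]) for j in range(k)]
            s = sum(w)
            resp.append([wj / s for wj in w])
        # M-step: re-estimate weights, means, and variances
        for j in range(k):
            nj = sum(r[j] for r in resp)
            pi[j] = nj / len(xs)
            mu[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            var[j] = max(sum(r[j] * (x - mu[j]) ** 2
                             for r, x in zip(resp, xs)) / nj, 1e-6)
    return pi, mu, var

pi, mu, var = em_gmm([1.0, 1.2, 0.9, 4.8, 5.1, 5.0], k=2)
```

With two clearly separated rating clusters, the two component means settle near the cluster averages, mirroring how users with similar interests are gathered into the same group.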
Finally, the predicted rating on attribute a from user u is computed as the mixture-weighted group average, r̂_{ua} = ∑_{k=1}^{K} p(z_k | u) μ_{ka} [10] (see (37)), where μ_{ka} denotes the group-attribute average value. In a similar way, we can predict the unrated items by using the mixture model.
Next, an example illustrates how to predict user ratings using a Gaussian mixture model, as shown in Figure 5. A user-item rating matrix contains 80 ratings on a scale from 1 to 5 of 6 movies by 20 users, and each movie has several genres, e.g., comedy, crime, action, and romance. The ratings on movies from users are converted to users' preferences on movies' attributes according to (28). Then three attributes are selected as features to train the data; suppose that each user has multiple different interests and three groups are formed on the basis of the Gaussian mixture model. In Figure 5, different colors represent different groups. Suppose we want to predict the rating R_{11,2}, i.e., the rating on item i_2 from user u_{11} in the user-item rating matrix. According to (33), we can obtain the probabilities that user u_{11} belongs to group one, group two, and group three, denoted as z_1, z_2, and z_3; the probabilities are 1.169e-72, 0.133, and 0.867. The first probability is close to zero and is ignored for ease of calculation. Therefore, R̂_{11,2} = p_1 μ_{1,2} + p_2 μ_{2,2} + p_3 μ_{3,2} = 0 + 0.133 × 3.67 + 0.867 × 4.75 ≈ 4.61.

Figure 5: The transfer from the user-item rating matrix to preferences for items' attributes.

Reliability of Ratings Prediction.
Until now, the initial unknown ratings have been predicted with the Gaussian mixture model. However, the ratings predicted by the above method are sometimes not very accurate, especially when an item is rarely rated by users. Such predicted rating data is unreliable, and we therefore remove the unreliable data from the user-item rating matrix.
Definition 5 (reliability of rating prediction (RRP)). In each component of the Gaussian mixture model, when the proportion of users who have rated an item among the members of the current component is less than ε (ε ≤ 0.1 in this paper), the predicted ratings on these items are unreliable. The definition is given in (38). For R_ui ∉ RRP, we set the ratings to zero; in other words, they are treated as unknown ratings, so that they are predicted by the matrix factorization method in Section 3.4.
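The RRP filter of Definition 5 can be sketched as follows, under the assumption that reliability is the fraction of a component's members who actually rated the item; the function name and sample data are hypothetical, while the ε = 0.1 threshold follows the text.

```python
# Sketch of the RRP rule in Definition 5 / Eq. (38).
EPSILON = 0.1

def filter_unreliable(predicted, observed_counts, component_size):
    """Zero out predicted ratings for items rated by fewer than
    EPSILON of the component's members (0 = treat as unknown)."""
    out = {}
    for item, r in predicted.items():
        ratio = observed_counts.get(item, 0) / component_size
        out[item] = r if ratio >= EPSILON else 0.0
    return out

# i1 was rated by 12 of 100 members (kept); i2 by only 1 (reset to unknown).
preds = filter_unreliable({"i1": 4.2, "i2": 3.9}, {"i1": 12, "i2": 1}, 100)
```

The zeroed entries are then handed to the matrix factorization stage of Section 3.4 instead of being trusted as GMM predictions.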
The EM algorithm of initial ratings prediction is described in Algorithm 1.

Complete Matrix with Matrix Factorization Algorithm.
In Section 3.3.2, the Gaussian mixture model is used to predict unknown ratings from the user's perspective. However, when some users have few or no ratings on items, it is difficult to predict unknown ratings according to nearest neighbors, and the obtained ratings are inaccurate. In this section, inspired by [13,18,19,21,22,46,51], an enhanced social matrix factorization model fusing user's social status and item's similarity, called ESMF, is proposed to predict all unrated items. The graphical model is shown in Figure 6. The ESMF model uses the rated items and the trust relationships between users to train model parameters and predicts unknown ratings using the trained parameters. It avoids inaccurate predictions caused by users with few ratings and items rated by only a few users.

Motivation.
To precisely capture user and item latent features, a novel enhanced social matrix factorization method is proposed. The method uses individual trust with social status and the item's social relationships to optimize the solution in both the user latent feature space and the user-item rating space, which is our main contribution in this section. Firstly, to accurately reflect the impact of users with different social status on user decisions, a regularization term is added in (9) to minimize the differences among the latent features of trusted users. In addition, inspired by the user's social relationships, an item social relationship matrix S is constructed and used to improve the item's latent feature matrix V. This is because there are also relationships between items: in real life, when people buy products, they will in many cases consider similar or substitute products. Based on this consideration, the item's social relationships are introduced into the matrix factorization model. Therefore, the proposed approach accurately and realistically models real-world recommendation processes.

Algorithm 1: The EM algorithm of initial ratings prediction.
…
5  Calculate the posterior probability that user u_j belongs to group z_k based on Eq. (32).
6  end for
7  for i = 1, 2, ..., k do
8    Calculate new posterior probabilities based on Eq. (33).
9    Calculate new mean vectors based on Eq. (34).
10   Calculate new covariance matrices based on Eq. (35).
11 end for
12 update model parameters p, μ, Σ.
13 until convergence
14 Mark cluster label probability according to …
16 Reset the unreliable ratings to zeroes based on Eq. (38).
17 return R.

Calculate the Social Status of Users in Trust Networks.
The user's social status is an important concept, which reflects the importance of a user in social networks and the degree of an individual's attachment to other individuals. The social status theory is used to explain how a user's social level influences the establishment of trust relationships between users. Usually, in a social network, high-level users belong to the authority users, and low-level users are more likely to establish trust relationships with higher-level users. For instance, user v may be an authoritative scholar in terms of historical research but a beginner in terms of computer technology. Therefore, user u will accept the suggestions of user v when selecting history books but not when choosing computer books.
In social networks, users with higher social status usually provide valuable information to users with lower social status, and therefore they have many in-degrees. In fact, users with lower social status usually refer to the suggestions of users with higher social status, and thus they have more out-degrees [56]. In this study, we employ the PageRank algorithm to calculate the social status of each user in social networks as follows [17,56,57] (see (39)): where PR_u denotes the value of user u's PageRank and T(u) denotes the set of users who trust user u. N represents the number of users, and ε is the probability of jumping out of the current trust network, which ranges over [0, 1].

Definition 6 (the trust relationships with social status between users u and v). The higher a person's social status in a field is, the greater his influence is, and the more likely others are to accept his suggestions. In addition, the number of common ratings is considered as an interaction factor between two users: the more common rated items two users have, the more similar their ratings are, and the closer their interests get. A trust network can be constructed based on the combination of the trust statements and the similarities between users. Therefore, the trust statement with social status between users and common interests of users is defined in (40), where I_u and I_v denote the sets of items rated by users u and v, respectively, I_uv denotes the set of items rated in common by users u and v, and sim_uv denotes the similarity between users u and v calculated using (21).
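The social-status computation of (39) can be sketched with a plain PageRank iteration over the trust graph; the damping constant, iteration count, and the tiny trust graph below are illustrative assumptions, and for brevity dangling users (who trust nobody) simply leak rank mass.

```python
# Sketch of computing each user's social status with PageRank, per Eq. (39).
def pagerank(trusts, damping=0.85, iters=50):
    """trusts: user -> list of users they trust (out-links)."""
    users = list(trusts)
    n = len(users)
    pr = {u: 1.0 / n for u in users}
    for _ in range(iters):
        nxt = {u: (1 - damping) / n for u in users}   # random-jump term
        for u in users:
            out = trusts[u]
            if not out:
                continue                               # dangling user: mass leaks
            share = damping * pr[u] / len(out)
            for v in out:
                nxt[v] += share                        # distribute trust mass
        pr = nxt
    return pr

# "b" is trusted by both "a" and "c", so it ends up with the highest status.
pr = pagerank({"a": ["b"], "b": ["c"], "c": ["b"]})
```

The resulting PR values can then be normalized and plugged into the trust statement of Definition 6.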
The higher the status of user v in the social network is, the higher the credibility of user v is. For instance, if the values of PR_u, PR_v, and PR_w are 0.8, 1, and 0.6, respectively, user u is likely to accept the recommendation of v rather than w. Conversely, the possibility that user v accepts the suggestion of u is less than the possibility that user u accepts the suggestion of v. Note that PR_u and PR_v are normalized social status values.

The Proposed Enhanced Recommendation Model Based on Social Networks. Similar to other matrix factorization methods [18,21], zero-mean spherical Gaussian priors are placed on the user and item feature vectors (see (41)-(42)). Hence, through Bayesian inference, the posterior probability of the latent feature vectors U and V can be obtained as in (43), where I^R_ui is the indicator function that equals 1 if u has rated i and 0 otherwise. For the user latent features, there are two influence factors: the zero-mean Gaussian prior to avoid overfitting, and the conditional distribution of user latent features given the latent features of the user's trusted neighbors (see (44)).
Similar to (42), through Bayesian inference, the posterior probability of the latent feature vectors given the rating and social trust matrices can be obtained as in (45). Similarly, the item's similarity is introduced into the matrix factorization model. The idea is to obtain a high-quality f-dimensional feature matrix V of items by analyzing the item similarity matrix S. Let V ∈ R^{f×M} and Z ∈ R^{f×M} be the latent item and auxiliary feature matrices. The conditional distribution over the observed item social network relationships is described in (46), where S_ij denotes the similarity between items i and j. According to Figure 6, a zero-mean spherical Gaussian prior is placed on the auxiliary feature vectors (see (47)). Hence, through Bayesian inference, the posterior probability of the latent feature vectors given the item similarity matrix can be obtained as in (48). Based on Figure 6, using Bayesian inference, the posterior probability of the latent feature vectors of ESMF given the rating, item similarity, and social trust matrices is described in (49).

Taking the log of the above posterior probability and keeping the parameters (observation noise variance and prior variance) fixed, maximizing the log-posterior over the latent features of users and items is equivalent to minimizing the objective function in (50). Here, T*_uv represents the degree to which user u trusts v, whose formula is given in (40). N and M are the numbers of users and items, respectively. g(x) is the logistic function g(x) = 1/(1 + e^{−x}), which bounds the range of U^T V within [0, 1]. The value of R_ui is also normalized to the range [0, 1] using the function f(x) = (x − 1)/(max − 1), where max is the maximum rating in the user-item rating matrix. U_u and U_v are the latent features of users u and v, respectively, and ‖U‖²_F, ‖V‖²_F, and ‖Z‖²_F are the Frobenius norms of matrices U, V, and Z, respectively.
Then the stochastic gradient descent method is used to optimize the aforementioned objective function (see (51)-(53)), where g′(x) is the derivative of the logistic function, g′(x) = e^{−x}/(1 + e^{−x})². The proposed hybrid recommendation algorithm is summarized in Algorithm 2.
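A stripped-down version of the gradient updates (51)-(53) can be sketched as follows, keeping only the rating term of (50) with the logistic link g(x); the social-trust and item-similarity regularizers are omitted for brevity, and the latent dimension, learning rate, and sample ratings are illustrative assumptions.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def predict(U, V, u, i):
    """g(U_u^T V_i): predicted rating, normalized to [0, 1]."""
    return sigmoid(sum(uk * vk for uk, vk in zip(U[u], V[i])))

def sgd_mf(ratings, n_users, n_items, f=4, lr=0.1, reg=0.01, epochs=1000, seed=0):
    rng = random.Random(seed)
    U = [[rng.gauss(0, 0.1) for _ in range(f)] for _ in range(n_users)]
    V = [[rng.gauss(0, 0.1) for _ in range(f)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:                     # r pre-normalized to [0, 1]
            pred = predict(U, V, u, i)
            grad = (pred - r) * pred * (1 - pred)   # chain rule through g(x)
            for k in range(f):
                gu = grad * V[i][k] + reg * U[u][k]
                gv = grad * U[u][k] + reg * V[i][k]
                U[u][k] -= lr * gu
                V[i][k] -= lr * gv
    return U, V

data = [(0, 0, 0.9), (0, 1, 0.2), (1, 0, 0.8)]      # (user, item, rating)
U, V = sgd_mf(data, n_users=2, n_items=2)
```

In the full ESMF objective, each update would carry additional gradient terms from the trust regularizer (weighted by λ_T) and the item-similarity regularizer (weighted by λ_S).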

Complexity Analysis.
The amount of data available in many practical applications of RS can be enormous, and the scalability of recommendation algorithms is a crucial factor for successful system deployment. Therefore, considering execution efficiency, one has to distinguish between the offline and online computational complexity of an algorithm. In this paper, filling the rating matrix using the improved CF recommendation algorithm, the calculation of the E-step and M-step, and the model training of matrix factorization are all executed in the offline stage, while the rating predictions of the EM and matrix factorization algorithms are executed in the online stage.
The original user-item rating matrix is very sparse, and the CF algorithm is used offline to calculate the user's and item's similarities to fill the user-item rating matrix. The computational complexity is O(r * r), where r is the average number of ratings per user, which is a very small value. Analyzing the offline complexity of the EM algorithm requires calculating the E-step and M-step separately. For a single E-step, the computational complexity is O(k * N). In the M-step, the posterior probabilities for each rating are accumulated to form the new estimates of p(z_k | u), and the M-step also requires O(k * N) operations. Here, k is the number of clusters, and N is the number of users. For the enhanced social recommendation model, the main computation is to evaluate the objective function L and its gradients with respect to the variables. Because of the sparsity of R and T, the computational complexity of evaluating the objective function L is O(ρ_R f + ρ_T f + ρ_S f), where ρ_R, ρ_T, and ρ_S are the numbers of nonzero entries in matrices R, T, and S, respectively, and f is the dimension of the latent feature vectors. The computational complexities for the gradients ∂L/∂U_u, ∂L/∂V_i, and ∂L/∂Z_j in (51)-(53) are likewise linear in the numbers of nonzero entries. Therefore, the total computational complexity of one iteration is O(ρ_R f + ρ_T f + ρ_S f), which is linear with respect to the number of observations in the sparse matrices.

The opening steps of Algorithm 2, recovered from the text, are as follows.
Input: ratings in training set R, user social matrix T, k, λ_T, λ_S, learning rate η.
Output: user latent feature matrix U and item latent feature matrix V.
1 Initialize λ, λ_T, λ_S, λ_Z, k, U, and V.
2 Fill the rating matrix based on Eq. (2) and Eq. (22).
3 Obtain the preferences on item attributes based on Eq. (28).
4 Predict partial ratings based on Eq. (37).
5 Reset the unreliable ratings to zeroes using Eq. (38).
…
More important for many applications is the online complexity of computing predictions in a dynamic environment. For a prediction r̂_ua in (37), p(z_k | u) and μ_ka are assumed to be explicitly available as part of the statistical model; this requires 2k arithmetic operations, so the computational complexity is O(k). For the enhanced social matrix factorization algorithm, the computational complexity is O(f * M) for one prediction pass, where M is the number of items. Thus, the total online computational complexity of the proposed hybrid system is O(k + f * M).
Compared with the existing recommendation algorithms, the online computational complexity of our method is no worse than that of the algorithms in [18,19,41,46]; they are on the same order of magnitude. The offline computational complexity of the proposed hybrid system exceeds that of most traditional methods, such as user-based CF, item-based CF, SVD, and k-means CF, but only moderately, because our method fills only a small amount of data using CF and EM. Compared with most social matrix factorization algorithms, the offline computational complexity is on the same order of magnitude. The results of the complexity analysis show that the hybrid algorithm proposed in this paper is efficient and can be extended to larger datasets.

Experiment
In order to evaluate the performance of the proposed method in this paper, several experiments are performed to show the effectiveness of our proposed method.In particular, the proposed method is compared to other major existing recommendation approaches in terms of their recommendation performances on Epinions and Tencent datasets.
4.1. Datasets. In this paper, two publicly available real-world datasets, namely, the Epinions and Tencent datasets, are used to conduct the experiments. Both datasets contain trust statements among users, so that this information can be integrated into the improved similarity calculation and the enhanced social matrix factorization model to improve the accuracy of recommendations.
Epinions.com is a consumer opinion site that was established to facilitate knowledge sharing about products.Users on Epinions can write reviews about items (e.g., foods, books, and electronics) and assign numeric ratings, which range from 1 to 5, to these items [30].Moreover, these users can also express their trust statements with the other users.The values of the trust statements in this dataset are 0 or 1.The extracted dataset from the Epinions website consists of 1,261,218 ratings rated by 12,630 users on 3,620 different items.The sparsity is 97.24%.
The Tencent dataset is from the Track 1 task of KDD Cup 2012, provided by Tencent Weibo. Tencent Weibo, an online social networking site similar to Facebook and Twitter, has become an important communication platform for making friends and sharing information [17]. This dataset is sampled from 50 days of behavioral data of about 200 million registered users, including about 2 million active users, 6,000 items, and 300 million records of historical activity. A smaller dataset is extracted, which contains 326,560 ratings from 9,650 users on 1,650 items. The sparsity is 97.95%.

Evaluation Measures.
In this paper, the Mean Absolute Error (MAE) is used for evaluating the performance of the proposed methods. The MAE measure is calculated as follows [19] (see (54)): where R_ui and R̂_ui are the real and predicted ratings of item i for user u, respectively, and N is the total number of ratings predicted by the recommendation method.
In addition, precision and recall are widely used metrics in recommender systems; both are averaged over all users. The items are sorted by their ratings from largest to smallest, the top N items are recommended to the current active user, and these N items are compared with the relevant items in the test set. The larger the value of precision@N (P@N) is, the higher the accuracy is. Recall@N (R@N) describes what percentage of the relevant items is included in the recommendation list. The two metrics are depicted as follows [54] (see (55)-(56)):

P@N = |relevant items ∩ top@N items| / N (55)

R@N = |relevant items ∩ top@N items| / |relevant items| (56)

where the top@N items and the relevant items are the recommended list and the list of items the user actually liked, respectively.

Table 2 shows the details of the parameters used in all methods, including their meanings and default values.

We first evaluate the effect of N_u (i.e., N_u = 10, 50, 100, 150, 200, 250, 300, 350, and 400) on the mentioned measures for the Epinions and Tencent datasets. Figure 7 shows the results of the MAE measure based on different values of the parameter N_u for both datasets. As the number of nearest neighbors grows, the MAE measure gradually decreases; however, when the number of nearest neighbors exceeds 200 for the Epinions dataset and 100 for the Tencent dataset, respectively, the MAE measure begins to increase, indicating that weakly similar neighbors introduce noisy data and performance starts to degrade seriously.
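The three measures (54)-(56) can be computed directly; the following is a minimal sketch in which the sample ratings and recommendation lists are hypothetical.

```python
# Sketch of MAE (54) and precision/recall at N (55)-(56).
def mae(pairs):
    """pairs: list of (true_rating, predicted_rating)."""
    return sum(abs(r - p) for r, p in pairs) / len(pairs)

def precision_recall_at_n(ranked, relevant, n):
    """ranked: items sorted by predicted rating, best first."""
    top = set(ranked[:n])
    hits = len(top & set(relevant))
    return hits / n, hits / len(relevant)

err = mae([(4, 3.5), (2, 2.5), (5, 5)])
p, r = precision_recall_at_n(["i1", "i2", "i3", "i4"], ["i1", "i4", "i9"], n=2)
```

As in the paper's protocol, P@N and R@N would be computed per user on the test set and then averaged over all users.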
Figure 8 shows the performance of the CF method on the two datasets using the different similarities in (1), (3), and (19), i.e., cosine, PCC, and the improved cosine similarity. The numbers of nearest neighbors N_u are set to 200 and 100 on the Epinions and Tencent datasets, respectively. It can be seen that the accuracy of the recommendations is significantly improved, indicating that the proposed improved cosine similarity is superior to the traditional cosine similarity and PCC on both datasets.
Moreover, γ, ω, and φ are important parameters which control the influence of the common item set rated by users, the time attenuation factor, and the tradeoff between the similarities and the trust relationships among users, respectively. We set γ from 0.5 to 5, ω from 0.01 to 1, and φ from 0 to 1 to evaluate the performance of the experiment. As shown in Figure 11, the recommendation accuracy achieves its best performance when φ = 0.4, i.e., when the trust relationship and the similarity reach a balance, on both the Epinions and Tencent datasets.

Parameter | Description | Default value
λ_U | The regularization constant of user features in [18-22]. | λ_U = 0.1
λ_Z | The regularization constant of social relationships between items. | λ_Z = 0.1
λ_C | The tradeoff parameter adjusting the effects of interpersonal trust between users in [21,22]. |
λ_S | The tradeoff parameter adjusting the effect of social relationships between items. |
α | The tradeoff parameter adjusting the effects from recommendations of neighbors and trusted friends in [18]. |
N_u | The number of nearest neighbors for the active user. | N_u = 100
γ | The tradeoff parameter controlling the influence of the common item set rated by users. |
ω | The tradeoff parameter controlling the influence of the time attenuation factor. |
φ | The tradeoff parameter controlling the influences from the similarities and trust relationships among users. |
μ | The tradeoff parameter controlling the effects of similarity between users and the time-weighted similarity between users. | μ = 0.5
δ | The tradeoff parameter controlling the effects of the error caused by the estimated values in [23]. |
In order to obtain a more optimized combination of parameters, we choose appropriate values for each parameter from its individually optimal interval. The experimental results for the combined parameter settings are presented in Table 3.
Secondly, in order to evaluate the influence of the number of clusters on recommendation accuracy, several experiments are conducted on the Epinions and Tencent datasets with the value of k set from 1 to 35. As shown in Figure 12, the predicted error is low for relatively small values of k, reaches its minimum when k gets close to 15, and the value of MAE grows quickly beyond that point.
Thirdly, in order to evaluate the influence of the parameters λ_T and λ_S on recommendation accuracy, experiments are conducted with different values of λ_T and λ_S. Among them, λ_T balances the information from the user-item rating matrix and the user social trust network: if λ_T = 0, only the user-item rating matrix is mined for matrix factorization, and if λ_T = ∞, only the social network information is extracted to predict users' preferences. λ_S controls the influence of the item's similarity on the item latent feature space V. Figure 13 shows the impact of λ_T when λ_S = 10 and λ_S = 20 on the MAE measure for the Epinions and Tencent datasets, respectively. As shown in Figure 13, our model obtains the lowest MAE of 0.616 when λ_T = 10 on the Epinions dataset and 0.585 when λ_T = 7 on the Tencent dataset.
Figure 14 shows the impact of λ_S when λ_T = 10 and λ_T = 7 on the MAE measure for the Epinions and Tencent datasets, respectively. It can be observed from Figures 13 and 14 that the values of λ_T and λ_S affect the recommendation results significantly, which demonstrates that fusing the item's social relationships and the user's trust relationships with social status into the user-item rating matrix greatly improves the recommendation accuracy. As λ_S increases, the prediction accuracy follows a similar trend.

In addition, the number of hidden features f is another important parameter affecting the recommendation performance of the proposed algorithm. f is varied from 5 to 50 with a step of 5, with the other parameters set to λ_T = 10, λ_S = 15 and λ_T = 7, λ_S = 20 on the Epinions and Tencent datasets, respectively. Figure 15 shows the MAE results based on different values of the parameter f for both datasets. It can be observed that the value of MAE decreases at first, then gradually increases, and finally tends to be stable; beyond the optimum, the overall recommendation performance decreases as f grows. This observation shows that although increasing f allows the matrix factorization model to capture more hidden features, some noise is introduced at the same time, reducing the accuracy of the recommendation algorithm. It verifies the basic assumption of the matrix factorization model: only a small number of hidden factors affect the user's preferences and characterize the item.

Performance Comparison and Analysis.
In the experiments, the proposed method (GMMESMF) is compared to probabilistic matrix factorization (PMF) [34], reliability-based trust-aware collaborative filtering (RTCF) [24], the matrix factorization-based model for recommendation in social rating networks (SocialMF) [18], the time- and community-aware RS (TCARS) [49], the context-aware recommender system via individual trust among users (CSIT) [19], imputation-based matrix factorization (IMF) [23], and implicit social recommendation (ISRec) [22] in terms of the MAE, P@N, and R@N measures on the Epinions and Tencent datasets. PMF, proposed by Salakhutdinov, only uses the user-item rating matrix for recommendations based on probabilistic matrix factorization. RTCF, proposed by Moradi, improves the accuracy of trust-aware RS through a novel trust-based reliability measure that evaluates the quality of the predicted ratings. SocialMF is a recommendation algorithm based on social networks proposed by Jamali, which adds a trust propagation mechanism to PMF to improve the accuracy of recommendations. TCARS is a novel recommendation method that efficiently uses the time of ratings and an improved overlapping community detection algorithm. … In addition, our model alleviates data sparsity and ensures the reliability of the predictions by filling unknown ratings with the improved CF method. The experimental results demonstrate that GMMESMF is effective. Similarly, to verify the effectiveness of our model for cold-start users, an experiment is conducted on cold-start users using the Epinions and Tencent datasets, and comparative results for different models are shown in Table 5.
Here, we define the users who have rated no more than 3 items as cold-start users [11,13].
This is because the GMMESMF model is able to decrease the recommendation error by combining the improved CF filling method based on trust relationships with matrix factorization.

Conclusions
In this paper, a novel hybrid method is proposed to improve the accuracy of RS. A constrained similarity measure is proposed based on cosine similarity, Salton factors, and trust relationships. In addition, a novel multi-step filling method is proposed to improve prediction, based on the assumptions that a user has multiple interests, similar users have the same preferences, and similar items are liked by users with the same interests. The proposed method first uses user-based CF and item-based CF to fill in the user-item rating matrix, and then uses the Gaussian mixture model to predict ratings to reduce the sparsity of the rating matrix. Finally, an enhanced social matrix factorization method is proposed to predict the ratings of unrated items, which fuses the user's trust relationships with social status and the item's social relationships into the matrix factorization algorithm, aiming to improve the accuracy of recommendation by mining the intrinsic connections in the user-item rating matrix and users' interactions with items. Extensive experiments are conducted on two real-world datasets, and the results show that the proposed method achieves higher accuracy than the existing major methods considered in this paper. Although it has advantages in recommendation effectiveness, our algorithm still has some limitations and there is room for further improvement. The limitations of our approach are twofold: first, GMMESMF needs to be prepopulated, and we have to fill in some unrated items in the user-item rating matrix before predicting the ratings for all of the unrated items. Second, our GMMESMF model faces increased computational complexity when the similarities of too many users and items are calculated.
There will be several interesting directions to explore for our future work.We would like to develop a novel k-NN graph construction algorithm that reduces computational complexity and extend the model to make recommendation based on social networks integrating multiple context information.Furthermore, our future study focuses on constructing recommendation models from the perspective of users, such as social relationships between users, social tags, and item's attributes which will be considered and further investigated.

Figure 1: The rationales of user-based and item-based CF algorithms.

Figure 3: A hybrid framework of RS.

Figure 4: The transition from the user-item ratings to user-item attribute preferences.

Figure 7: The effect of parameter N_u on the MAE.

Figures 9, 10, and 11 show the MAE measures based on different values of the parameters γ, ω, and φ in the improved CF recommendation algorithm, while N_u is set to 250 and 50 for the Epinions and Tencent datasets, respectively. As shown in Figures 9, 10, and 11, the accuracies reach the highest level when γ = 2.5, ω = 0.05, φ = 0.4 and γ = 3.5, ω = 0.08, φ = 0.4 on the Epinions and Tencent datasets, respectively.

Figure 9: Impact of parameter γ on the MAE.

Figure 10: Impact of parameter ω on the MAE.

Table 1: An example of user-item rating matrix.

…
8 update U_u ← U_u − η(∂L/∂U_u) based on Eq. (51)
9 update V_i ← V_i − η(∂L/∂V_i) based on Eq. (52)
10 update Z_j ← Z_j − η(∂L/∂Z_j) based on Eq. (53)
Algorithm 2: A hybrid recommendation algorithm based on Gaussian mixture model and enhanced social matrix factorization technology.

Table 2: Description and default values of parameters.

Table 3: Parameters combination performance on Epinions and Tencent datasets.

Table 4: MAE and P@N results on Epinions and Tencent datasets.

Table 5: MAE on Epinions and Tencent datasets for cold-start users.