A Novel Hybrid Similarity Calculation Model

This paper addresses the problems of similarity calculation in traditional nearest-neighbor collaborative filtering recommendation algorithms, especially their failure to describe dynamic user preferences. To address user interest drift, a new hybrid similarity calculation model is proposed. The model consists of two parts: on the one hand, it uses function fitting to describe users' rating behaviors and rating preferences; on the other hand, it employs the Random Forest algorithm to take user attribute features into account. The two parts are then combined to build a new hybrid similarity calculation model for user recommendation. Experimental results show that, for data sets of different sizes, the model's prediction precision is higher than that of traditional recommendation algorithms.


Introduction
Traditional collaborative filtering (CF) algorithms usually calculate the similarity between users or items from the user-item rating matrix; based on the calculated similarity, they choose the nearest neighbors and construct prediction scores to generate recommendation lists. The similarity calculation therefore determines the precision and quality of the recommendations produced by heuristic CF algorithms. However, traditional heuristic CF recommendation algorithms suffer from a range of problems in similarity calculation, such as the failure to detect changes in user interest. Because they compute similarity directly from rating statistics, they consider only user ratings and center ratings and ignore other factors that influence a rating, such as user attributes, time weight, and user rating habits.
To solve the problems of similarity calculation in traditional heuristic CF recommendation and improve its performance, Luo et al. [1], Anand and Bharadwaj [2], and Lopes et al. [3] proposed the global similarity measure. Building on the traditional similarity algorithms, the global similarity measure takes the transitive relationships among users into account to calculate the global similarity and build the user's nearest neighborhood set. The results of Lopes et al.'s experiments indicated that, for extremely sparse data sets, combining traditional similarity algorithms with the global similarity measure can improve recommendation accuracy. Li et al. [4] proposed the concept of the fluctuation factor; they considered the influence of fluctuation factors and removed it with the z-score method when computing the similarity between users. Shen et al. [5] proposed a two-stage similarity learning algorithm: the first stage uses the Pearson correlation coefficient (PCC) to calculate the similarity and obtain the nearest neighbors, and the second stage uses the reduced gradient method to learn the similarity, which improves the recommendation accuracy. Gao and Huang [6] proposed an approach based on the model of item gravity attribute. Its similarity calculation contains two parts: the first is the similarity obtained by the traditional calculation; the second defines the weight value of the item attribute and then calculates the initial similarity with the model of item gravity attribute. After the two similarities are weighted, the effect of the rating time is taken into account to calculate the final similarity value.
Starting from different perspectives, the studies above aimed at strengthening the association between users and items to improve the similarity between users or items and obtain the optimal nearest neighbor set, thereby improving the recommendation accuracy and quality. However, when strengthening the association between users and items, other factors that affect the association can also be taken into account, such as the demographic characteristics of users and the time decay caused by the time-effect of ratings. Considering user attribute features is also very effective when dealing with the user cold start problem.
Therefore, this paper proposes a new similarity calculation method, the RIT-UA algorithm. The RIT-UA algorithm consists of two parts. The first is the similarity of user rating-interest, which considers the similarities of user rating and interest as well as how the two change under the constraints of rating time and the confidence coefficient between users. The second is the similarity of user attributes, which takes into account the influence of user attribute features on the recommendation and calculates the similarity of the user attributes after obtaining the weight of each attribute feature. Finally, the RIT-UA algorithm combines the two parts linearly. The experimental results show that, compared with the traditional methods, the proposed algorithm obtains better prediction accuracy.

Related Work
Although recommender systems have been studied frequently and developed substantially in recent years, some common problems remain, such as data sparsity, cold start, and user interest drift. To deal with these problems and improve recommendation precision and accuracy, researchers have taken many aspects into account, including basic user attribute features and the time and place at which user behavior occurred, and corresponding lines of research have emerged.
Demographic Recommender System (DRS) is an important part of recommender systems. Demographic characteristics can be used to identify a user's type and preferences, and the system can sort users according to their attribute features and generate recommendations based on the sorting results. DRS plays a great supporting role in dealing with the problems of user cold start and data sparsity. Many studies have shown that user attribute features can improve recommendation accuracy. Luo et al. [7] used an improved quantized kernel least mean square (EQ-KLMS) algorithm, which improved the efficiency of machine learning and the accuracy of weather forecasting. Beel et al. [8] elaborated the role of user attribute features in the recommendation process and demonstrated that users' attribute characteristics have a significant impact on click-through rates in recommender systems. From the perspective of tourism recommendation, the experiments of Wang et al. [9] showed that combining machine learning methods (Naive Bayes, Bayesian network, and SVM) with demographic characteristics can improve the prediction accuracy of tourism recommendations. Zhao et al. [10] used visual tracking sensors to acquire biometric information and then used machine-learning-based biometrics to improve recognition accuracy. Combining user attribute features with a user preference modeling method, Al-Shamri [11] constructed five similarity measures, and the experimental results showed that incorporating user attribute features improves the recommendation accuracy of recommender systems. Santos et al. [12] applied user attribute features in a real recommendation environment to mine and analyze the context constraints in the scene. Chen and He [13] constructed a user demographic vector from the user information and, on this basis, took the corated items and item frequency into account to derive a new similarity. The experimental results showed that this approach can solve the cold start problem effectively and improve the recommendation accuracy. Luo et al. [14] achieved QoS prediction with automatic parameter tuning by using approximate dynamic programming, through online learning and optimization, without the need for prior knowledge or prediction model identification. A kernel least mean square algorithm [15] was then used to forecast missing Web service QoS values. The experimental results show that the method can effectively solve the cold start problem and improve the prediction accuracy.
With the intensive development of recommender systems research, many researchers began to incorporate contextual information into recommender systems to obtain better recommendations and improve recommendation quality. Among contextual information, time information is relatively easy to collect, and it provides significant value for research on improving the temporal diversity of recommender systems, which has become a hot topic in current studies [16]. Koren [17] used matrix factorization (SVD), regarding time as an important feature and adding it to the user-item feature data set, and solved the problem of user interest drift effectively. Karatzoglou et al. [18] and Xiong et al. [19] regarded time information as the third dimension of the data and employed tensor factorization to capture dynamic changes over time. Rong et al. [20] divided the user's rating history into several periods, analyzed the user's preference distribution in each period, and quantified their preferences. Li et al. [21] split user preferences into stages over time and proposed a cross-domain CF framework. The experiments showed that the algorithm not only improves the recommendation prediction accuracy but also solves the problem of user interest drift.

Description of RIT-UA Algorithm
In the context of relatively sparse data, and from the perspective of solving the problem of user interest drift, this paper proposes the RIT-UA algorithm on the basis of the traditional similarity calculation, introducing factors that influence users' rating behaviors, such as user attribute characteristics and the time decay of ratings. The RIT-UA algorithm consists of two parts: one is the similarity of rating-interest, and the other is the similarity of user attributes.

The Similarities of Rating-Interest.
The similarity of rating-interest is composed of rating similarity and interest similarity, and mainly considers two aspects: users' preferences for items and users' rating habits. On top of these two aspects, the effect of the time decay of ratings is introduced, and the confidence coefficient between users is introduced in combination with the fluctuation factor proposed in [4]. Finally, the similarity of rating-interest between users is obtained. The whole process is described as follows.
3.1.1. Rating Similarity. In e-commerce systems, rating or voting is generally used to obtain a user's direct preference for items. Assume that the degree of a user's preference for items is classified into 5 levels, {adore, love, like, dissatisfied, dislike}, with the corresponding grades {5, 4, 3, 2, 1}. The results then form a user-item rating matrix, as shown in Table 1. In the rating matrix, the closer the ratings of two users are, the more similar their preferences are. When the ratings of two users are the same, the users share the same preference; if there is a big gap between the ratings, the two users have opposite preferences. Therefore, to describe the nonlinear correlation of the similarity between users' ratings, this paper follows [22], which puts forward the sigmoid function as the expression of similarity, and constructs a sigmoid function to express the similarity of user ratings. Equation (1) represents the similarity $\mathrm{sim}(u, v, i)_{rate}$ between the ratings for item $i$ from user $u$ and user $v$, where $r_{ui}$ is the rating for item $i$ from user $u$ and $r_{vi}$ is the rating for item $i$ from user $v$.
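The idea of a sigmoid-shaped rating similarity can be illustrated with a minimal sketch. The exact form of Equation (1) is not assumed here: the function below uses one plausible sigmoid of the absolute rating gap, rescaled so that identical ratings give 1.0. Both the functional form and the name `rating_similarity` are assumptions made for illustration.

```python
import math


def rating_similarity(r_ui: float, r_vi: float) -> float:
    """Sigmoid-shaped similarity of two ratings on the same item.

    Assumed form (not taken from the paper):
        2 * (1 - 1 / (1 + exp(-|r_ui - r_vi|)))
    It equals 1.0 for identical ratings and decays toward 0 as the
    gap between the two ratings grows.
    """
    gap = abs(r_ui - r_vi)
    return 2.0 * (1.0 - 1.0 / (1.0 + math.exp(-gap)))


if __name__ == "__main__":
    # Ratings on the 1-5 scale described in the text.
    for pair in [(4, 4), (5, 4), (5, 1)]:
        print(pair, round(rating_similarity(*pair), 4))
```

Under this assumed form, two users who give the same grade are maximally similar on that item, and a gap of four grades (5 versus 1) yields a similarity close to zero, which matches the qualitative behavior described for the rating matrix.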

Interest Similarity.
Every user has their own rating habits. For instance, some easygoing users who do not stick at trifles tend to give high scores, while some rigorous users who pay close attention to details are likely to give low scores; because they are stricter when scoring, they do not give high scores easily. Hence, describing user rating habits helps to improve the prediction accuracy. Koren [17] used an equation to define the user's rating habits and the inherent attributes of the item, as shown in equation (2), in which the user's own rating habits are regarded as a factor that has an impact on the rating. Equation (2) thus contains one term for the user's own rating habits and one for the user's rating of item $i$.
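For background, the baseline estimate widely used in Koren's factorization models separates a rating into a global mean, a user bias, and an item bias. The form below is that standard baseline; whether Equation (2) of this paper uses exactly these symbols is an assumption.

```latex
b_{ui} = \mu + b_u + b_i
```

Here $\mu$ is the average of all ratings, $b_u$ is the observed deviation of user $u$ from that average (the user's rating habit), and $b_i$ is the deviation of item $i$ (an inherent property of the item).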
Therefore, within the rating range, when a user who tends to score highly likes an object, he/she usually gives it a high score; even if the user does not like the object, he/she will not give a very low score, and vice versa. The average score given by the user thus reflects his/her interest and rating habits. Similarly, following [22], which proposed the sigmoid function as the expression of similarity, this paper also constructs a sigmoid function to express the similarity of user interests. Equation (3) represents the interest similarity $\mathrm{sim}(u, v, i)_{interest}$ of user $u$ and user $v$ on item $i$. Combining the rating similarity and the interest similarity between users then gives
$$\mathrm{sim}(u, v)'_{score} = \mathrm{sim}(u, v, i)_{rate} + \mathrm{sim}(u, v, i)_{interest}. \tag{4}$$

3.1.3. Time Factor. Generally speaking, treating user behaviors that occurred at different times equally leaves no room for effective quantitative analysis. The time factor reflects the tendency of user interest drift: the closer a rating is to the present time, the more it contributes to recommendation quality, and vice versa. Based on this, some studies have used linear and nonlinear functions to quantify rating behaviors over time.
In [23], the Ebbinghaus forgetting curve is put forward for fitting user interest, in order to solve the difficult problem of tracking changes in user interest. The Ebbinghaus forgetting curve is shown in Figure 1. Based on [23] and the trend of the Ebbinghaus forgetting curve, this paper uses the function in (5) to describe the trend of user interest drift, that is, the direction in which the time factor acts. In (5), $\Delta t$ represents the time difference between the users' ratings on item $i$, and the parameter of the function is set to 0.005 in this paper. After taking the time effect into account, the new computational equation for the similarity of user rating-interest is given by (6), in which $|I_{uv}|$ represents the number of items corated by user $u$ and user $v$.
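The time factor can be sketched as follows. This is an illustration under stated assumptions, not the paper's Equations (5) and (6): the decay is assumed to be a simple exponential in the rating-time difference, the time unit is left to the caller, and the per-item similarities are averaged over the corated items. Only the parameter value 0.005 comes from the text; the names `time_weight` and `time_weighted_similarity` are hypothetical.

```python
import math

DECAY = 0.005  # parameter value reported in the text


def time_weight(delta_t: float, decay: float = DECAY) -> float:
    """Assumed exponential forgetting weight for a rating-time gap."""
    return math.exp(-decay * abs(delta_t))


def time_weighted_similarity(per_item):
    """Average per-item similarities, each damped by its time weight.

    per_item: list of (similarity, delta_t) pairs, one per corated item.
    Returns 0.0 when the two users have no corated items.
    """
    if not per_item:
        return 0.0
    total = sum(sim * time_weight(dt) for sim, dt in per_item)
    return total / len(per_item)


if __name__ == "__main__":
    # Two corated items: one rated at the same time, one 100 time units apart.
    print(time_weighted_similarity([(1.0, 0.0), (1.0, 100.0)]))
```

With this assumed form, ratings given at the same time keep their full weight, while a gap of 100 time units reduces the contribution to about 61% (exp(-0.5)).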

Confidence between Users.
When the user data is extremely sparse and the number of corated items is very small, chance plays a large part in the similarity calculation. Li et al. [4] eliminate this effect by using the fluctuation factor. Based on this, this paper uses the number of corated items to adjust the weight of the similarity through a natural exponential, as shown in (7). Equation (7) gives the confidence coefficient $\mathrm{confident}(u, v)$ between user $u$ and user $v$; its other symbols denote the items rated by user $u$, the items rated by user $v$, the items corated by user $u$ and user $v$, the items corated by user $u$ and a nearest neighbor, and the nearest neighbor set.
After taking the confidence coefficient into account, the adjusted equation for the similarity of user rating-interest is
$$\mathrm{sim}(u, v)_{score} = \mathrm{sim}(u, v)'_{score} \cdot \mathrm{confident}(u, v). \tag{8}$$

The Similarity of User Attributes.
Considering the similarity of user attributes can, on the one hand, improve the accuracy of prediction and, on the other hand, solve the new-user cold start problem; that is, when no rating data is available, user attribute features can be used to build models and give recommendations. To describe the similarity of user attributes, [20] divided user attributes into numerical attributes and nominal attributes and defined and expressed them separately. For ease of understanding and implementation, this paper defines the similarity of user attributes as in (9).
In (9), the indicator takes the value 1 when user $u$ and user $v$ share the same value of a given attribute and 0 otherwise.
Each user attribute in (9) also has a feature weight value. To obtain the weight values of all attribute features, this paper uses the Random Forest feature selection algorithm to calculate the importance of each user attribute feature and rank the features. We then conduct experiments according to the rank to further determine the relative importance weight value of each attribute.
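A minimal sketch of a weighted attribute-match similarity is given below. It assumes that the similarity is the sum of the weights of the attributes on which two users agree, which is one straightforward reading of the indicator and weight described above; the exact form of Equation (9) may differ. The weights are the ML-100k values reported later in the experiments, and the dictionary keys and the function name are hypothetical.

```python
# Attribute weights reported for ML-100k in the experiments of this paper.
ML100K_WEIGHTS = {"age": 0.3, "gender": 0.3, "occupation": 0.25, "zip_code": 0.15}


def attribute_similarity(user_a: dict, user_b: dict, weights: dict = ML100K_WEIGHTS) -> float:
    """Weighted count of attributes on which two users agree.

    An attribute contributes its weight when both users have the same
    value (indicator = 1) and contributes nothing otherwise (indicator = 0).
    Attributes missing from user_a are treated as a mismatch.
    """
    return sum(
        weight
        for name, weight in weights.items()
        if name in user_a and user_a[name] == user_b.get(name)
    )


if __name__ == "__main__":
    alice = {"age": 25, "gender": "F", "occupation": "student", "zip_code": "10001"}
    bob = {"age": 31, "gender": "F", "occupation": "student", "zip_code": "94110"}
    # Gender and occupation match, so the weights 0.3 and 0.25 are added.
    print(attribute_similarity(alice, bob))
```

Exact equality is a simplification for numerical attributes such as age; grouping ages into bands before comparison would be a natural variation, but that is not stated in the text.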

Similarity Calculation Based on RIT-UA.
Sections 3.1 and 3.2 consider the similarity of rating-interest and the similarity of user attributes, respectively; we therefore combine the two by weighting and obtain the new similarity equation (10), in which $\beta = 1 - \alpha$. Once the similarity is obtained, the prediction of a user's rating for an item follows from (11), in which $\bar{r}_u$ and $\bar{r}_v$ are the average scores of user $u$ and user $v$, respectively, and the neighbor set is that of user $u$.
The RIT-UA similarity algorithm is described in Algorithm 1.
From the description in Algorithm 1, the time complexity of the RIT-UA algorithm is $O(m \times n)$, where $m$ is the number of users and $n$ is the number of items.
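The combination and prediction steps can be sketched as follows. Two assumptions are made: the weight alpha is applied to the rating-interest part and its complement 1 - alpha to the attribute part (the text does not state which weight goes with which part), and the prediction uses the standard mean-centered neighbor formula of user-based CF. The function names are hypothetical; the default alpha = 0.75 is the ML-100k value reported in Experiment 2.

```python
def hybrid_similarity(sim_score: float, sim_attr: float, alpha: float = 0.75) -> float:
    """Linear blend of the two similarity parts with beta = 1 - alpha."""
    beta = 1.0 - alpha
    return alpha * sim_score + beta * sim_attr


def predict_rating(user_mean: float, neighbors) -> float:
    """Mean-centered, similarity-weighted prediction for one item.

    neighbors: list of (similarity, neighbor_rating, neighbor_mean) tuples
    for the neighbors of the target user who rated the item.
    Falls back to the user's own mean when no usable neighbor exists.
    """
    denominator = sum(abs(sim) for sim, _, _ in neighbors)
    if denominator == 0:
        return user_mean
    numerator = sum(sim * (rating - mean) for sim, rating, mean in neighbors)
    return user_mean + numerator / denominator


if __name__ == "__main__":
    sim = hybrid_similarity(sim_score=0.8, sim_attr=0.55)
    # One neighbor who rated the item one grade above his or her own mean.
    print(predict_rating(3.0, [(sim, 5.0, 4.0)]))
```

Iterating this over every user and every item is consistent with the stated cost growing with the product of the number of users and the number of items.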

Experimental Data Sets.
Considering the openness and authority of the data sets, and because our simulation experiment is based on the rating matrix, we chose two data sets, Movielens-100k and Netflix, for experimental analysis and comparison. The data sets are described as follows.

Movielens-100k Data Set.
The data set is a film rating data set provided by GroupLens Research. It contains 100,000 ratings from 943 users for 1682 movies, where each user has rated at least 20 movies, and the ratings lie in the interval {1-5}, as shown in Table 2. The sparseness of the data set is 1 − 100000/(943 * 1682) = 93.7%. Figure 2(a) shows the number of items rated by each user on the ML-100k data set in descending order. From the figure, we can see that many users have rated fewer than 100 items. To test the performance of the algorithm, the data set is divided into two parts: 80% as the training set and 20% as the test set.
In the ML-100k data set, there are only 4 user attribute features: gender, age, occupation, and zip code.

Netflix Data Set.
The Netflix data set is a subset of the original Netflix Prize data. After proper data cleaning, the data set contains 387,939 ratings from 4861 users for 5080 objects, where each user has rated at least 20 objects, and the ratings lie in the interval {1-5}, as shown in Table 2.
The sparseness of the data set is 1 − 387939/(4861 * 5080) = 98.4%, and Figure 2(b) shows the number of items rated by each user on the Netflix data set in descending order. From the figure, we can see that a large number of users have rated fewer than 100 items. Similarly, to test the performance of the algorithm, the data set is divided into two parts: 80% as the training set and 20% as the test set.
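The sparseness figures quoted for the two data sets can be checked with a few lines of code. The function name `sparsity` is chosen here for illustration; the counts are those stated in the text.

```python
def sparsity(num_ratings: int, num_users: int, num_items: int) -> float:
    """Fraction of empty cells in the user-item rating matrix."""
    return 1.0 - num_ratings / (num_users * num_items)


if __name__ == "__main__":
    print(f"ML-100k: {sparsity(100_000, 943, 1682):.1%}")   # 93.7%
    print(f"Netflix: {sparsity(387_939, 4861, 5080):.1%}")  # 98.4%
```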
Because the Netflix data set contains no user attribute data, this paper randomly generates three user attributes for it during data cleaning, following the features of the ML-100k user attribute data: gender, age, and occupation. The age attribute ranges over {10-65}, the occupation attribute has 20 occupations with the range {0-19}, and the gender value is drawn from the range {0-2}. Because of the high sparsity of our data sets, we use resource scheduling and processing methods for sparse data [24,25].
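Generating the simulated attributes might look like the sketch below. It assumes uniform sampling with inclusive integer ranges and a fixed seed for reproducibility; the text gives only the ranges, so the distribution, the seed, and the function name are assumptions.

```python
import random


def generate_user_attributes(user_ids, seed: int = 0) -> dict:
    """Draw simulated gender, age, and occupation codes for each user.

    Ranges follow the text: gender in {0-2}, age in {10-65},
    occupation in {0-19}. random.Random.randint includes both endpoints.
    """
    rng = random.Random(seed)
    return {
        uid: {
            "gender": rng.randint(0, 2),
            "age": rng.randint(10, 65),
            "occupation": rng.randint(0, 19),
        }
        for uid in user_ids
    }


if __name__ == "__main__":
    attributes = generate_user_attributes(range(1, 4))
    for uid, attrs in attributes.items():
        print(uid, attrs)
```

Because such attributes are drawn independently of the ratings, any attribute weights learned from them reflect the simulation rather than real user behavior, which is worth keeping in mind when reading the Netflix results.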

Experiment Evaluation Quantity.
Generally speaking, evaluation quantities such as MAE (mean absolute error) and RMSE (root mean squared error) are used to evaluate prediction precision in recommender systems. After comparison, RMSE is used as the evaluation quantity in this paper:
$$\mathrm{RMSE} = \sqrt{\frac{1}{|\mathrm{Test}|} \sum_{(u,i) \in \mathrm{Test}} \left(r_{ui} - \hat{r}_{ui}\right)^2},$$
where $|\mathrm{Test}|$ is the size of the test data set, $r_{ui}$ is the real rating value, and $\hat{r}_{ui}$ is the predicted rating value. The smaller the RMSE, the higher the prediction precision; that is, the smaller the value, the closer the prediction is to the real rating.
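RMSE is straightforward to compute from paired lists of real and predicted ratings. The sketch below uses only the standard library; the function name and the list-based interface are choices made for illustration.

```python
import math


def rmse(actual, predicted) -> float:
    """Root mean squared error over paired real and predicted ratings."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("actual and predicted must be non-empty and equal in length")
    squared_error = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared_error / len(actual))


if __name__ == "__main__":
    real = [4, 3, 5, 2]
    estimated = [3.5, 3.0, 4.0, 2.5]
    print(round(rmse(real, estimated), 4))
```

Because the errors are squared before averaging, a few large misses raise RMSE more than they would raise MAE.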

Experiment 1: Experimental Analysis of the Weight Value of User Attribute Feature.
As (9) shows, the weight value of each user attribute feature is needed; the Random Forest algorithm is chosen to obtain it in this paper.
Random forests are an ensemble learning method that can analyze data with complicated feature interactions. They are robust even under a certain amount of data noise and are efficient in feature learning and analysis. Their variable importance measure can serve as a feature selection tool for high-dimensional data. In recent years, they have been widely used in various kinds of prediction, feature selection, and outlier detection [26].
Therefore, we obtain the weight value of each user attribute feature with the Random Forest algorithm on the ML-100k and Netflix data sets. The experimental results are shown in Figures 3 and 4.
On the ML-100k data set, Figure 3 shows that among the 4 attributes (age, gender, occupation, and zip code) gender is the most important, indicating that the gender attribute plays a significant role in recommendation and that user ratings are more similar among users who share this attribute. Compared with gender, the zip code attribute plays a relatively minor role in recommendation, so its weight value is correspondingly low. The other two attributes, age and occupation, show a medium influence, with weight values of about 0.284 and 0.186, respectively.
The illustrations in Figures 3 and 4 show the range of possible weight values for each feature. For the Netflix data set, the gender and age attributes have very obvious effects in recommendation, and the overall importance ranking of the weight values is similar to that of ML-100k.
To find relatively optimal weight values for the attributes (age, gender, occupation, and zip code) on ML-100k and (age, gender, and occupation) on Netflix, we carry out several sets of comparative experiments; the results are shown in Figures 5 and 6. Figures 5 and 6 show that, on the ML-100k data set, assigning the weights 0.3, 0.3, 0.25, and 0.15 to age, gender, occupation, and zip code, respectively, gives better experimental results. On the Netflix data set, assigning the weights 0.5, 0.4, and 0.1 to age, gender, and occupation, respectively, gives better experimental results. These values are therefore used in the following experiments.

Experiment 2: Experimental Analysis of the Weight Value of Alpha and Beta.
According to (10), to find the values of $\alpha$ and $\beta$ that give relatively good experimental results, we used the RIT-UA algorithm to carry out the following groups of experiments on the ML-100k and Netflix data sets. The experimental results are shown in Figures 7 and 8.
Figures 7 and 8 show that, for the ML-100k data set, $\alpha = 0.75$ and $\beta = 0.25$ give relatively better experimental results, and, for the Netflix data set, $\alpha = 0.7$ and $\beta = 0.3$ give relatively better results.
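Selecting the weight by trying candidate values can be sketched as a simple grid search. The grid step of 0.05 and the callable `evaluate_rmse` are assumptions; the text does not state which candidate values were tried. In practice `evaluate_rmse` would train and test the recommender for a given alpha and return its RMSE; a toy function stands in for it here.

```python
def best_alpha(evaluate_rmse, steps: int = 20) -> float:
    """Return the alpha on an evenly spaced grid over [0, 1] with the lowest RMSE.

    evaluate_rmse: hypothetical callable mapping alpha to an RMSE value.
    beta is implied as 1 - alpha and does not need a separate search.
    """
    candidates = [k / steps for k in range(steps + 1)]
    return min(candidates, key=evaluate_rmse)


if __name__ == "__main__":
    # Toy stand-in whose minimum lies at alpha = 0.75.
    def toy_rmse(alpha: float) -> float:
        return 0.95 + (alpha - 0.75) ** 2

    alpha = best_alpha(toy_rmse)
    print(alpha, 1 - alpha)
```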

Experiment 3: Experimental Analysis of the Comparison with Other Similarity Measures.
To verify the validity of the algorithm proposed in this paper, we compare it with other similarity measures, including the Pearson similarity, the adjusted cosine similarity (Acosine), the PIP [27] similarity, and the NHSM [28] similarity, on the ML-100k and Netflix data sets. The experimental results are shown in Figures 9 and 10, respectively.
Figure 9 shows that, on the ML-100k data set, the proposed algorithm gradually outperforms the others as the number of neighbors increases. In the initial stage, when the number of neighbors is within [10,30], the results of the proposed algorithm are close to those of PIP, and they become slightly better than those of PIP later. The results of NHSM are good when the number of neighbors is within [10,30] but worse later. The results of PCC and Acosine are worse than those of the other algorithms. On the Netflix data set, Figure 10 shows that the proposed algorithm gradually outperforms the other algorithms as the number of neighbors increases. NHSM outperforms the others when the number of neighbors is within [10,40] but later performs less well than the proposed algorithm.

Experiment 4: Comparison of Precision on Data Sets of Different Sizes.
Based on the ML-100k data set, this paper selects 20%, 40%, 60%, and 80% of the data, respectively. With the number of neighbors fixed at 20, we compare the precision of different algorithms on data sets of various sizes. Fivefold cross validation is used to obtain the average of the experimental results, which is shown in Figure 11. Figure 11 shows that the proposed algorithm produces better and more stable results on the different sizes of the ML-100k data set, indicating that the proposed algorithm discriminates better in the case of sparse data. As for the other three algorithms, the performance of the PIP algorithm is relatively stable and its RMSE is relatively low. For the NHSM algorithm, however, the RMSE is higher when the data set is relatively sparse; as the size of the data set increases, NHSM performs better and becomes more stable.
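Fivefold cross validation can be sketched with index-based folds. The version below splits the rating records into contiguous folds without shuffling, which is a simplification; how the folds were actually drawn is not stated in the text, and the function names are hypothetical.

```python
def k_fold_splits(n_samples: int, k: int = 5):
    """Yield (train_indices, test_indices) for each of k contiguous folds.

    Fold sizes differ by at most one when n_samples is not divisible by k.
    """
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


def mean_score(scores) -> float:
    """Average of the per-fold scores, for example per-fold RMSE values."""
    return sum(scores) / len(scores)


if __name__ == "__main__":
    for train, test in k_fold_splits(10, k=5):
        print(len(train), test)
```

Each record is used for testing exactly once, and the reported value is the mean of the five per-fold scores.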

Conclusion
Aiming at several problems in traditional similarity calculation, this paper proposes a new similarity calculation model. The model describes and expresses aspects such as user rating preference, user rating habits, and the time factor. Furthermore, user attribute features are taken into account for their influence on user ratings, and the role each attribute feature plays in recommendation is studied; the Random Forest algorithm is used to calculate the weight value of each attribute. The final experimental results show that, compared with other similarity measures, the proposed approach improves the recommendation precision significantly, and it still gives better results in the case of sparse data. A limitation of the experiments is that the data sets contain relatively few user attributes, so there is no obvious difference among the calculated feature weight values; part of the user attribute data is private and not easy to obtain, which inevitably limits the experiments.

Figure 2: Changing tendency of the number of user-rated items (descending order).

Figure 3: Ranking of the weight value of user attributes feature (ML-100k).

Figure 6: Experiment comparison of weight values of different user attributes (Netflix).

Figure 11: Comparison of results produced by different algorithms on data sets of different sizes (ML-100k).

Table 1: User-item rating matrix.

Table 2: User-item rating matrix.