A Deep Ranking Weighted Multihashing Recommender System for Item Recommendation

Collaborative filtering (CF) techniques are used in recommender systems to provide users with specialised recommendations on social websites and in e-commerce. However, they suffer from sparsity and cold start problems (CSP) and fail to explain why they recommend a new item. A novel deep ranking weighted multihash recommender (DRWMR) system is designed to suppress sparsity and CSP. The proposed DRWMR system contains two stages: the neighbourhood formation phase and the recommendation phase. Initially, the data is fed to a deep convolutional neural network (CNN), which extracts the significant features. The CNN contains an additional layer in which the hash code is generated by minimising pairwise ranking loss and classification loss. A weight is then assigned to the different hash tables and hash bits for recommendation. Next, the similarity between users is obtained based on the weighted Hamming distance; this similarity is used to form the neighbourhood for the active user. Finally, the rating for an unknown item is obtained by taking the weighted average rating of the neighbourhood, and a list of the top n items is produced. The effectiveness and accuracy of the proposed DRWMR system are tested on the MovieLens 100K dataset and compared with existing methods. Based on the evaluation results, the proposed DRWMR system gives a precision of 0.16, a recall of 0.08, a root mean squared error (RMSE) of 0.73, a mean absolute error (MAE) of 0.57, and an F1 measure of 0.101.


Introduction
Recommendation systems (RS) have recently become common on numerous websites, recommending movies, e-commerce items, music, and television programs [1]. Based on the information provided by the user, the RS recommends items for purchase [2]. Several RS have been introduced to predict the behaviour of users and provide better recommendations [3, 4]. In general, an RS recommends items with the highest preference based on the individual interests of the user and their previous usage history [5]. RS are effective machine learning systems for increasing product sales [6, 7]. Recommendation helps users speed up the search process, makes it simple for them to obtain content that interests them, and presents them with offers they would not have searched for [8, 9]. Furthermore, companies may attract customers by showing movies and TV shows relevant to their profiles [10].
The approaches to recommendation may be categorised as collaborative filtering (CF), content-based (CB), and hybrid, based on the type of data gathered and how it is used in the RS [11]. CB filtering is frequently used in RS design; it uses items' content to select general characteristics and qualities that suit the user profiles [12]. Then, items are compared with previously liked items, and the best-matching items are recommended. The CF system works based on the user-item relationship. Hybrid systems utilise both forms of information to prevent the issues that arise when only one type is used.
Although the recommendation system has demonstrated its usefulness in various fields, it faces some problems: sparsity and cold start problems (CSP). As the name implies, a sparsity problem develops when customers do not rate items while purchasing online, resulting in a sparsity of available ratings. The CF method suffers from this problem because it uses a rating matrix. A CSP arises when a new product is added to the recommendation system and no past ratings for that product are available. As a result, an RS must provide options for new users, which lowers the accuracy of the recommendations. Several research groups are working to develop an efficient and well-organised RS algorithm. Methods such as locality-sensitive hashing (LSH) [13], Bayesian personalised ranking (BPR) [14], and CofiRank maximum margin matrix factorisation (CRMF) [15] are used for removing sparsity and CSP. However, the existing methods fail to generate deep hash codes by minimising the classification and pairwise ranking losses, and they incur high storage costs, motivating us to develop an efficient method for recommending items.
1.1. Contribution. The following are the research work's contributions: (i) The hash code is generated by reducing the classification and pairwise ranking losses, which can handle users with cold start and sparsity problems. (ii) A weight is introduced for hash tables and hash bits according to their performance. (iii) The weighted Hamming distance is used to determine user similarity. (iv) Bit diversity and similarity preservation are utilised to calculate each hash bit's weight, and the MAP score is used to evaluate the table-wise weight. (v) The proposed DRWMR system performs better for item recommendation.
The following is a summary of the paper's structure: Section 2 provides an overview of existing works. Our proposed DRWMR system is explained in Section 3. In Section 4, the proposed method is subjected to experimental analysis. Finally, the paper concludes in Section 5.

Literature Review
Da'u et al. [16] proposed a deep learning method that uses aspect-based opinion mining (ABOM) in a recommendation system. This method contains two portions: rating prediction and ABOM. In the first portion, multichannel deep convolutional neural networks are utilised to extract aspects. In the second portion, the aspect-based ratings are added into a tensor factorisation machine to predict overall ratings. This method increases the accuracy of the recommendation system, but its time consumption is significant.
Ye and Liu [17] introduced collaborative topic regression (CTR) and three novel granulation methods for the recommendation strategy. The three granulation methods are LDA-based, PMF-based, and CTR-based, built on constructing granular structures and conducting granulation. The LDA-based and PMF-based methods are developed to extract granular features from content and feedback information. The CTR-based method is designed to be joined with the LDA- and PMF-based methods to produce multilevel recommendation information and interpretable granular features. These methods were introduced to overcome the problem of time cost and decision cost. However, this method does not consider users' dynamic preferences.
Chen et al. [18] offered a dynamic decay CF (DDCF) recommendation method based on the user's interest. This method has four stages: clustering of items, identification of the interest level, specification of the decay function, and preference prediction. In item clustering, similar items are grouped without any predefined parameters. Then, in the second stage, the interest level of users is identified based on the number of ratings and the time records in the cluster. In the third stage, decay functions describe the preference evolution at every level. Finally, the similarities between users are calculated based on the decay rates, and future preferences are predicted. This method accurately predicts the user's interests, but its time consumption is high.
Beg et al. [19] introduced a chaotic-based reversible data transform method (CRDTM) for preserving privacy in data mining for recommendation systems. This method dynamically produces the values of the RDT parameters during processing time but does not need the parameters during recovery. It can substitute for the standard RDT algorithm where memory and bandwidth are significant factors. The RDT algorithm is used in the mobile app recommendation area. This method decreases the execution time, but its accuracy is low.
Abbasi-Moud et al. [20] offered a context-aware recommendation system for tourism. It consists of three steps. In the first step, users' preferences are extracted from their reviews and comments. In the second step, the characteristics of tourist attractions are extracted from tourist reviews. In the third step, recommendations are given based on the user's preferences, their similarity with the tourist attractions' characteristics, and contextual information. The contextual information utilised in this method includes location, weather, user preferences, and time. This method gives high accuracy in tourism recommendation. However, the weather may change with the season, so it is challenging to make a recommendation.
Zhang et al. [21] introduced a novel hybrid probabilistic matrix factorisation method for distinguishing between items' attractiveness and users' preferences for recommendation. It consists of two subdivisions. One division attempts to predict the rating scores of users by extracting each user's personal preferences from auxiliary data. The other division attempts to model the textual interest of items for different users. These two subdivisions are combined into a unified framework using a global objective function. This method gives high accuracy.

Choe et al. [22] offered a hierarchical model based on a recurrent neural network (HMRNN) for considering users' item usage histories as time series and sequences in a recommendation. This method contains two layers: a long-term (LT) layer and a short-term (ST) layer. The ST layer handles the short-term sequences, which are typically found in the older histories. The LT layer recollects the data from the preceding ST sequences and retains it over the long term. This method increases the amount of stored user history. However, it does not take temporal properties into consideration.

DRWMR System
We propose a novel deep ranking weighted multihashing recommender (DRWMR) system to suppress sparsity and CSP. This method has two stages: the neighbourhood formation phase and the recommendation phase. Initially, the user-item rating data is fed into the neighbourhood formation stage, and a deep convolutional neural network is used to extract the features. The deep neural network consists of an additional layer (the hash table). The hash codes are produced by reducing the classification loss and the pairwise ranking loss: similarity preservation is enforced by the pairwise ranking loss, and the prediction error is minimised by the classification loss. Because each hash bit performs differently in the RS, it is not reasonable to treat them all equally. Therefore, a weight is assigned to the different hash tables and hash bits for recommendation. Then, the similarity between users is obtained based on the weighted Hamming distance; this similarity is used to form the neighbourhood for the active user. Finally, the rating for an unknown item is obtained by taking the weighted average rating of the neighbourhood, and a list of the top n items is produced. Figure 1 illustrates the architecture of the proposed DRWMR system.

Neighborhood Formation Phase
The user-item rating data is given to the deep CNN, which is used to extract the features. Then, a hash code is generated by minimising the classification loss and the pairwise ranking loss. Therefore, a weight is assigned to the different hash tables and hash bits. According to the weighted Hamming distance, the users most similar to the active user are identified.

Deep CNN.
The convolution and max pooling layers are used to extract the features. The demographic data of the user is fed to the convolution pooling layers. The convolution pooling layers are formed with the rectified linear activation function in the first three layers. The max pooling operation is utilised in the initial convolution pooling layers, and the average pooling operation is used in the final convolution pooling layer. They mostly use the original raw data to extract high-level representations. The first three convolutional layers use 32, 32, and 64 filters, respectively, with a stride of 1 for every convolution layer and a kernel size of 5 × 5. The pooling operations use a size of 3 × 3 with a stride of 2 for each pooling layer. The fully connected layer uses a rectified linear activation function and a dropout layer with a 0.5 dropout ratio.
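For concreteness, the shape arithmetic of the stated stack can be sketched in Python. The input size of 64 and the "same" padding for the 5 × 5 convolutions are illustrative assumptions, since the paper does not state either:

```python
# Sketch of the feature extractor's shape arithmetic: three stages of
# 5x5/stride-1 convolution followed by 3x3/stride-2 pooling, with 32, 32,
# and 64 filters. Padding choices are assumptions, not from the paper.

def conv_out(size, kernel=5, stride=1, pad=2):
    """Output spatial size of a convolution ("same" padding when pad=2, k=5)."""
    return (size + 2 * pad - kernel) // stride + 1

def pool_out(size, kernel=3, stride=2):
    """Output spatial size of a 3x3, stride-2 pooling layer (no padding)."""
    return (size - kernel) // stride + 1

def feature_map_sizes(input_size, filters=(32, 32, 64)):
    """Trace (n_filters, spatial_size) through the three conv + pool stages."""
    sizes = []
    size = input_size
    for n_filters in filters:
        size = conv_out(size)   # 5x5 convolution, stride 1
        size = pool_out(size)   # 3x3 pooling, stride 2
        sizes.append((n_filters, size))
    return sizes

print(feature_map_sizes(64))  # → [(32, 31), (32, 15), (64, 7)]
```

The trace shows how the spatial resolution shrinks while the channel count grows toward the fully connected layers.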
A tangent-like activation is used in the hash code layer to output hash codes. The softmax function is utilised as the activation function in the classification layer to preserve semantic similarity. The classification and hash code layers are learned sequentially in every training epoch. The number of units in the first fully connected layer is 500, the hash code's length is equal to the second fully connected layer's output, and the number of classes is equal to the third fully connected layer's output. A deep CNN architecture is constructed with this framework to learn a nonlinear transformation α(∗) using the data as input. The hash bit is determined by

a = sign(α(K)), (1)

where K denotes the raw data and sign(y) signifies the sign function. The deep neural network consists of an additional layer (the hash table). The hash codes are produced by reducing the classification loss and the pairwise ranking loss: similarity preservation is enforced by the pairwise ranking loss, and the prediction error is minimised by the classification loss. For a pair of users, the pairwise loss function can be defined as follows:

L_p(a_i, a_j) = (1/2) I_{i,j} H_d(a_i, a_j) + (1/2)(1 − I_{i,j}) max(m − H_d(a_i, a_j), 0), (2)

such that a_i, a_j ∈ {+1, −1}^h, where a_i denotes user e_i's binary hash code, H_d(a_i, a_j) signifies the Hamming distance between a_i and a_j, and m is a margin. The similarity between a_i and a_j is denoted by I_{i,j}, defined as

I_{i,j} = 1 if users e_i and e_j are similar, and I_{i,j} = 0 otherwise. (3)

The Hamming distance between the input user e_i and each of its pairwise users is calculated to produce a ranked Hamming list. The pairwise ranking loss for an input user e_i is defined in the following equation:

L_r(e_i) = Σ_{j=1}^{M_i} L_p(a_i, a_j), (4)
where M_i denotes the number of pairwise users for user e_i. Users similar to e_i should have short Hamming distances, so they appear at the top of the ranked Hamming list, whereas dissimilar users appear at the bottom. On the other hand, users at improper positions in the Hamming list of e_i contribute substantial loss values. We aim to decrease the overall loss function for a training database with M users, which is given in the following equation:

L = Σ_{i=1}^{M} L_r(e_i) = Σ_{i=1}^{M} Σ_{j=1}^{M_i} L_p(a_i, a_j), (5)

such that a_i, a_j ∈ {+1, −1}^h, i = 1, . . . , M. Since the hash codes are binary, the fitness function is nondifferentiable. As a result, using the gradient descent approach to optimise the fitness function is challenging.
Therefore, to overcome this problem, a tanh approximation function is used instead of the sign function. The tanh-like function can be used to estimate a_i's hash code, which is given as follows:

a_i ≈ tanh(α(K_i)). (6)

The Euclidean distance E_d(p_i, p_j) between two users can be further estimated as the Hamming distance H_d(a_i, a_j) using the calculated hash codes, determined in the following equation:

H_d(a_i, a_j) = (1/4) ‖a_i − a_j‖². (7)

A regularisation term is included to reduce the quantisation loss. The final loss function is determined in the following equation:

L_f = Σ_{i=1}^{M} Σ_{j=1}^{M_i} L_p(a_i, a_j) + α Σ_{i=1}^{M} ‖a_i − sign(a_i)‖², (8)

such that a_i, a_j ∈ {+1, −1}^h, i = 1, . . . , M, where α denotes the regularisation term's parameter coefficient. The classification layer outputs the recommendation for the category. The softmax function is used as the recommendation function. The classification loss is determined in the following equation:

L_c = − Σ_{i=1}^{M} Σ_{j=1}^{o} x_{ij} log(o^x_{ij}), (9)

where o and o^x_{ij} signify the number of classes and the recommended output of class j for user e_i, respectively, and x_{ij} is the ground-truth indicator of class j for user e_i.

Weighted Multihashing.
A weighted multihash code is used based on the loss in the Hamming distance; the bitwise weight is multiplied into the Hamming distance. Similarity preservation (P) measures a hash bit's semantic similarity. The hash bit a_b's similarity preservation (P) is calculated in the following equation:

P(a_b) = (1/U²) a_b′ I a_b, (10)

where U signifies the number of training samples, a_b ∈ {+1, −1}^U collects the b-th bit over the training samples, a_b′ signifies a_b's transpose, and I is the pairwise similarity matrix from equation (3). Bit diversity is utilised to calculate the hash bit's performance. The difference between the hash bits of each user is crucial for maintaining recommendation efficiency; therefore, every hash bit must be independent. The correlation between two hash bits is used to determine bit diversity, which can be determined by the formula below.
Initially, a bitwise correlation metric between two hash bits a_b and a_c is computed:

φ(a_b, a_c) = (w/U − ∅_1 ∅_2) / sqrt(∅_1 (1 − ∅_1) ∅_2 (1 − ∅_2)), (11)

where ∅_1 and ∅_2 denote the hash bits' probabilities. The probabilities ∅_1 and ∅_2 are determined by the following equations:

∅_1 = (w + x)/U, (12)

∅_2 = (w + y)/U, (13)
where w, x, y, and z denote the nonnegative variables indicating the number of samples that fulfil the relevant row and column requirements of the two bits' 2 × 2 contingency table. The bit-correlated matrix M = [m_1, . . . , m_h] ∈ Z^{1×h} is formed by assigning the correlated coefficient to each bit. Finally, the bit a_b's bit diversity weight is calculated by the following equation:

D(a_b) = 1 − |f_b|, (14)

where f_b signifies the b-th bit's correlated coefficient. To generate the final bitwise weight WT_bit(b), the terms mentioned above are first normalised and then multiplied, as determined in the following equation:

WT_bit(b) = P(a_b) × D(a_b). (15)

To combine multiple hash tables, a table-wise weight WT_table for each hash table is determined using the mean average precision. It is specified in the following equation:

WT_table = (1/P) Σ_{b=1}^{n} (P_b / b) rel_b, (16)

where n signifies the size of the dataset, P denotes the number of similar users in the dataset that relate to the checking user, P_b denotes the number of similar users within the top b positions of the ranked list, and rel_b = 1 if the b-th ranked user is similar and 0 otherwise.
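The bitwise weighting described above can be illustrated as follows. Reading the bit-correlation metric as the standard phi coefficient of two binary variables is an assumption, and `codes` (one ±1 hash bit per column for U users) and the similarity matrix `I` are toy stand-ins:

```python
# Illustrative bitwise weight: similarity preservation per bit, bit diversity
# from pairwise bit correlations (phi coefficient over a 2x2 contingency
# table with counts w, x, y, z), then normalise and multiply.
import numpy as np

def similarity_preservation(codes, I):
    """P(a_b) = a_b' I a_b / U^2 for every bit b (columns of `codes`)."""
    U = codes.shape[0]
    return np.einsum("ub,uv,vb->b", codes, I, codes) / U**2

def phi_coefficient(bit_b, bit_c):
    """Correlation of two {+1,-1} bits via their 2x2 contingency counts."""
    w = np.sum((bit_b == 1) & (bit_c == 1))
    x = np.sum((bit_b == 1) & (bit_c == -1))
    y = np.sum((bit_b == -1) & (bit_c == 1))
    z = np.sum((bit_b == -1) & (bit_c == -1))
    denom = np.sqrt((w + x) * (y + z) * (w + y) * (x + z))
    return (w * z - x * y) / denom if denom > 0 else 0.0

def bitwise_weights(codes, I):
    """Combine similarity preservation with bit diversity D = 1 - |f_b|."""
    h = codes.shape[1]
    P = similarity_preservation(codes, I)
    f = np.array([np.mean([abs(phi_coefficient(codes[:, b], codes[:, c]))
                           for c in range(h) if c != b]) for b in range(h)])
    D = 1.0 - f                               # diversity: low correlation wins
    return (P / P.sum()) * (D / D.sum())      # normalise, then multiply

codes = np.array([[1, 1, -1], [1, -1, -1], [-1, 1, 1], [-1, -1, 1]])
I = np.eye(4)                                 # trivial similarity matrix
print(bitwise_weights(codes, I))
```

With this toy input, the middle bit is uncorrelated with the other two, so it receives the largest weight; bits 0 and 2 are perfectly anticorrelated and are penalised.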

Recommendation Phase
The neighbours obtained from the previous phase are used to generate a recommendation. The similarity between the active user and its neighbours is utilised to forecast the final rating for an unknown item i. In this regard, the ultimate anticipated rating provided by user u for any item i is calculated as follows:

r̂_{u,i} = r̄_u + ( Σ_{u′∈N(u)} sim(u, u′)(r_{u′,i} − r̄_{u′}) ) / ( Σ_{u′∈N(u)} |sim(u, u′)| ), (17)

where N(u) is the set of user u's neighbours and sim(u, u′) denotes the similarity between users u and u′. The prediction value of each item can be sorted in descending order, and the top n items from this sorted list can be recommended to the user.
The similarity between users can be obtained by the following equation:

sim(u, u′) = Σ_{i∈I} (r_{u,i} − r̄_u)(r_{u′,i} − r̄_{u′}) / ( sqrt(Σ_{i∈I} (r_{u,i} − r̄_u)²) · sqrt(Σ_{i∈I} (r_{u′,i} − r̄_{u′})²) ), (18)

where I is the set of items, r_{u,i} is the rating given to item i by user u, and r̄_u is the average rating of user u.
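The recommendation phase can be sketched directly in plain Python; the dict-of-dicts rating layout and helper names are illustrative:

```python
# Sketch of the recommendation phase: Pearson similarity (equation (18))
# between users, then a mean-centred weighted average over the neighbourhood
# to predict an unknown rating.

def mean_rating(ratings, u):
    vals = list(ratings[u].values())
    return sum(vals) / len(vals)

def pearson_sim(ratings, u, v):
    """Pearson correlation over the items both users rated."""
    common = set(ratings[u]) & set(ratings[v])
    if not common:
        return 0.0
    ru, rv = mean_rating(ratings, u), mean_rating(ratings, v)
    num = sum((ratings[u][i] - ru) * (ratings[v][i] - rv) for i in common)
    du = sum((ratings[u][i] - ru) ** 2 for i in common) ** 0.5
    dv = sum((ratings[v][i] - rv) ** 2 for i in common) ** 0.5
    return num / (du * dv) if du and dv else 0.0

def predict(ratings, u, item, neighbours):
    """Mean-centred weighted average rating over the neighbourhood."""
    ru = mean_rating(ratings, u)
    num = den = 0.0
    for v in neighbours:
        if item in ratings[v]:
            s = pearson_sim(ratings, u, v)
            num += s * (ratings[v][item] - mean_rating(ratings, v))
            den += abs(s)
    return ru + num / den if den else ru

ratings = {"u1": {"a": 5, "b": 3, "c": 4},
           "u2": {"a": 4, "b": 2, "c": 5},
           "u3": {"a": 2, "b": 5, "d": 1}}
print(round(predict(ratings, "u1", "d", ["u2", "u3"]), 2))  # → 5.67
```

Sorting the predicted values of the unrated items in descending order then yields the top-n recommendation list.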

Experimental Results and Analysis
This section contains the experimental setup and a description of the dataset used for comparative analysis.

Dataset Description.
The MovieLens 100K dataset contains numerous users' demographic data. There are 1682 movies in the dataset, with a total of 100,000 ratings from 943 users. For this evaluation, the data were gathered from https://www.kaggle.com/prajitdatta/movielens-100k-dataset. It consists of the user's age, ID, occupation, and the items rated. 80% of the dataset is used for training, while the remaining 20% is used for testing.

Simulation Setup
The proposed DRWMR system is implemented in Python; the initial learning rate is 0.001, and after 1000 iterations, it decays exponentially by 0.04. The batch size is 200, and the parameter coefficient is 0.01.
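The stated training schedule can be sketched as follows, assuming "decays exponentially by 0.04" means the rate is multiplied by exp(−0.04) for each completed block of 1000 iterations; the paper does not spell the decay formula out, so this reading is an assumption:

```python
# Sketch of an exponential step-decay schedule: constant 0.001 for the first
# 1000 iterations, then multiplied by exp(-0.04) per completed 1000-iteration
# block. The exact decay form is an assumed interpretation of the text.
import math

def learning_rate(iteration, base_lr=0.001, decay=0.04, step=1000):
    if iteration < step:
        return base_lr
    return base_lr * math.exp(-decay * (iteration // step))

print(learning_rate(500))   # 0.001 during the initial phase
print(learning_rate(2500))  # smaller after two decay steps
```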
Evaluation Metrics. The following metrics are used: (i) Recall: It is the ratio of the total number of relevant recommendations to the actual or true number of relevant recommendations for a new user.
RL = F_r / T_r, (19)

where F_r denotes the recommendations related to the new user and T_r signifies the real number of related recommendations. (ii) Precision: This metric indicates the precision of the process, i.e., whether the generated recommendations are appropriate for new users.
PN = F_r / tot_r, (20)

where F_r denotes the recommendations related to the new user and tot_r signifies the total number of recommended items.

(iii) F1-score: This score shows the experiment's accuracy based on the recall and precision measures and is calculated by the following equation:

F1 = 2 × PN × RL / (PN + RL). (21)

(iv) RMSE: It is measured by determining the difference between the predicted and observed values, as given in the following equation:

RMSE = sqrt( (1/Q) Σ_{j=1}^{Q} (K_j − K̂_j)² ). (22)

(v) MAE: It is determined by computing the magnitude of the difference between the expected and observed values, as given in the following equation:

MAE = (1/Q) Σ_{j=1}^{Q} |K_j − K̂_j|, (23)

where Q signifies the total number of users, K̂_j denotes the predicted value, and K_j signifies the observed value.
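The metrics above translate directly into code; the symbol names (F_r, T_r, tot_r, Q) follow the text:

```python
# Pure-Python versions of the five evaluation metrics used in the experiments.

def recall(f_r, t_r):
    """Relevant recommendations over true relevant count (equation (19))."""
    return f_r / t_r

def precision(f_r, tot_r):
    """Relevant recommendations over total recommendations (equation (20))."""
    return f_r / tot_r

def f1_score(pn, rl):
    """Harmonic mean of precision and recall."""
    return 2 * pn * rl / (pn + rl) if pn + rl else 0.0

def rmse(observed, predicted):
    """Root mean squared error over Q predictions."""
    q = len(observed)
    return (sum((k - kh) ** 2 for k, kh in zip(observed, predicted)) / q) ** 0.5

def mae(observed, predicted):
    """Mean absolute error over Q predictions."""
    q = len(observed)
    return sum(abs(k - kh) for k, kh in zip(observed, predicted)) / q

print(round(f1_score(precision(4, 25), recall(4, 50)), 3))  # → 0.107
```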

Comparative Analysis with Existing Methods
The proposed DRWMR system is tested on the MovieLens 100K dataset, and metrics such as recall, precision, RMSE, MAE, and F1-score are compared with existing approaches such as CTR [17], DDCF [18], CRDTM [19], and HMRNN [22]. Figure 2 illustrates the RMSE analysis of our proposed DRWMR system against existing methods such as CTR, DDCF, CRDTM, and HMRNN. Our research introduces a novel loss function that minimises the classification loss and the pairwise ranking loss. It produces hash codes with high recommendation accuracy and more similarity information. Therefore, our proposed DRWMR system reduces the error compared with the existing methods for the top 5, 10, and 15 recommendations. The proposed DRWMR system (0.73) has a low RMSE when compared with existing techniques such as CTR (0.87), DDCF (0.80), CRDTM (0.77), and HMRNN (0.75) for the top 10 recommendations. Figure 3 shows the MAE analysis. The smallest Hamming distance between users results in the recommendation to the new user. The weighted Hamming distance is used to determine how similar the users are, so the errors in the recommendation system can be reduced. The MAE is low for the proposed DRWMR system (0.57) when compared with existing methods such as CTR (0.75), DDCF (0.72), CRDTM (0.705), and HMRNN (0.653) for the top 10 recommendations. Figures 4 and 5 show the recall and precision analyses. In the proposed DRWMR system, a hash table is built as an additional layer. Table-wise and bitwise weighting methods are applied based on performance to attain better recommendation quality. The proposed DRWMR system assigns weights to the various hash bits and hash tables. Also, a hash code is generated by minimising loss, which can handle users with cold start and sparsity problems. This mechanism attains higher precision and recall.
The precision is high for the proposed DRWMR system (0.16) when compared with the existing methods. Figure 6 shows the F1 measure analysis. The F1 measure is high for the proposed DRWMR system (0.101) when compared with existing methods such as CTR (0.06), DDCF (0.064), CRDTM (0.077), and HMRNN (0.08) for the top 10 recommendations.

Conclusion
The RS has become vital to social networking and business apps such as Flipkart, Amazon, YouTube, and others. Therefore, we introduced the DRWMR system for accurate recommendation. Initially, the user-item rating data is fed into the CNN to extract important features and reduce data sparsity. Then, the hash code is generated by minimising the pairwise ranking loss and the classification loss. A weight is assigned to the different hash tables and hash bits. According to the weighted Hamming distance, the most similar users are obtained. Finally, the rating of an unknown item is obtained using the weighted average rating of the users similar to the active user. As a result, our proposed DRWMR system performs better in recommending items. The proposed DRWMR system is quantitatively measured by precision (0.16), recall (0.08), RMSE (0.73), MAE (0.57), and F-measure (0.101), showing better performance in the recommendation system.

Data Availability
The datasets generated and analyzed during the current study are available from the corresponding author upon reasonable request.

Conflicts of Interest
The authors declare that they have no conflicts of interest.