1. Introduction

IJDMB

International Journal of Digital Multimedia Broadcasting

1687-7586 1687-7578

Hindawi

10.1155/2019/1326831

1326831

Research Article

QoS Prediction for Neighbor Selection via Deep Transfer Collaborative Filtering in Video Streaming P2P Networks

http://orcid.org/0000-0002-6659-5727

Wenming

¹ Zhang

Qian

¹ Mu

Chunxiao

¹ Zhang

Meng

² Cao

Yuanlong

School of Computer and Control Engineering

Yantai University

Yantai 264005

China

ytu.edu.cn

China National Nuclear Corporation

Beijing 100045

China

cnnc.com.cn

2019

112019

2019 21 09 2018 25 10 2018 112019

2019

This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

To expand the server capacity and reduce the bandwidth, P2P technologies are widely used in video streaming systems in recent years. Each client in the P2P streaming network should select a group of neighbors by evaluating the QoS of the other nodes. Unfortunately, the size of video streaming P2P network is usually very large, and evaluating the QoS of all the other nodes is resource-consuming. An attractive way is that we can predict the QoS of a node by taking advantage of the past usage experiences of a small number of the other clients who have evaluated this node. Therefore, collaborative filtering (CF) methods could be used for QoS evaluation to select neighbors. However, we might use different QoS properties for different video streaming policies. If a new video steaming policy needs to evaluate a new QoS property, but the historical experiences include very few evaluation data for this QoS property, CF methods would incur severe overfitting issues, and the clients then might get unsatisfied recommendation results. In this paper, we proposed a novel neural collaborative filtering method based on transfer learning, which can evaluate the QoS with few historical data by evaluating the other different QoS properties with rich historical data. We conduct our experiments on a large real-world dataset, the QoS values of which are obtained from 339 clients evaluating on the other 5825 clients. The comprehensive experimental studies show that our approach offers higher prediction accuracy than the traditional collaborative filtering approaches.

National Natural Science Foundation of China

61602399

61502410

1. Introduction

In recent years, video content accounts for a large proportion of global Internet consumption. Video steaming is gradually becoming the most attractive service [1–3]. However, Internet does not provide any quality of service guarantees to video content delivery. To expand the server capacity and reduce the video streaming bandwidth, P2P technologies are widely adopted by many content delivery systems [4–7]. In a P2P network, a peer not only downloads the media data from the network but also uploads the download data to other clients in the same network. To get a better user experience of watching videos, each client (or node) in the P2P network should select some other nodes as its neighbors in terms of the quality of service (QoS) for this client [8–10]. For example, a client might prefer to select nodes with high bandwidth. Due to the different locations and network conditions, different clients might have different QoS experience for the same node. To get the neighbors with the best QoS, one might want to evaluate the QoS of all the other nodes for each client. Unfortunately, the video streaming P2P network usually includes an extremely large number of users, and evaluating the QoS of all the nodes is time-consuming and resource-consuming.

An attractive way is that we can predict the QoS value of a node by taking advantage of the past usage experiences of a small number of the other clients who have evaluated this node. This refers to a famous technology, collaborative filtering (CF), which has been extremely studied in recommender systems [11–13]. With the help of CF method, each client only needs to know a small number of the real QoS values of the other nodes to select neighbors. The core idea is that if two clients have similar evaluation values of a specific QoS for some known nodes, they might also have similar QoS evaluation values for the other unknown nodes.

However, the neighbor selection policy might need to be changed to improve the quality of video content delivery. If the new policy uses the new QoS property to select neighbors, but the historical user experiences include very few data of this new QoS property, CF methods would incur severe overfitting issues, and then each client might get worse neighbor recommendation list. Transfer learning aims to adapt a model trained in a source domain with rich labeled data for use in a target domain with less labeled data, where the source and target domain are usually related but under different distributions [14–16]. Recently, deep neural networks have yielded remarkable success on many applications, especially on the computer vision, speech recognition, and natural language processing. Deep neural networks are powerful for learning general and transferable features. There are two major transfer learning scenarios, fine-tuning the pretrained network or treating the pretrained network as a fixed feature exactor. Instead of random initialization, we can initialize the network with a pretrained network, or we can freeze the weights of some layers of the network [17–19].

Unlike many supervised transfer learning tasks, we cannot simply fine-tune or freeze the weights of the network. The only information about the nodes in the video streaming P2P network is their identifiers (IDs) and the QoS evaluation historical experience. There is no raw feature for each node, and we need to lean abstract features for the nodes using embedding. Freezing the embedding features seems unreasonable. Furthermore, different QoS properties have different value ranges, and fine-tuning will make the final weights differ greatly from the initial weights pretrained in the source domain. Due to the sparsity of target domain labeled data, fine-tuning too much would incur severe overfitting problem.

In this paper we proposed a novel neural style collaborative filtering method, DTCF (Deep Transfer Collaborative Filtering). We can first train the model using the QoS evaluation data in the source domain and then adapt the model in the target domain with different QoS property. The core idea is that we only use the weights of first several layers to initialize the same layers of the model in the target domain, and randomly initialize the remaining layers. To control the degree of fine-tuning, we integrate the maximum mean discrepancy (MMD) measurement into the loss function [20–22]. The main contributions of our work are as follows:(i)

We propose a novel neural collaborative filtering model for QoS prediction using transfer learning technology.

(ii)

We provide a novel interaction layer to represent the relationship between latent embedding factors of the nodes.

(iii)

We adopt partial fine-tuning and MMD measurement to train the target domain model to implement domain adapting.

The remainder of this paper is organized as follows: We introduce the related work in Section 2. Section 3 presents the design details of our method. Section 4 describes our experiments and Section 5 concludes this paper.

2. Related Work

Distributed user-generated videos delivery poses a new challenge to large-scale streaming systems. To stream live videos generated by users, many existing video streaming systems rely on a centralized network architecture [23–25]. Even these streaming systems use Content Delivery Network (CDN) for video delivery, such a kind of solution is not cost-effective [26–28]. The unit price of content delivery over the Internet has dramatically decreased in recent years. However, there are higher requirements in terms of resolution, frame rate, or bitrate than before. Therefore, the amount of bandwidth consumed per user has grown at a faster rate. To reduce the bandwidth or the costs and improve the user experience, the P2P architectures can be adopted instead.

Collaborative filtering is a rational QoS prediction technology to select neighbors for each client in the P2P video streaming network [29–31]. To select the best neighbors with high delivery quality for the clients, CF should predict the QoS values between the clients and then select the top k best neighbors in terms of the QoS values. Each client only knows partial information about the QoS values for all the nodes in the network. Memory-based CF methods are some kinds of generalized k-nearest-neighbors (KNN) algorithms [32, 33], which have two types of models: user-based and item-based. Model-based CF methods are more popular, which act like generalized regression or classification algorithms, but they deal with abstract features not concrete or raw features. Among many model-based CF methods, matrix factorization has become the most popular technology to handle such kind of issues [34–40]. Probabilistic Matrix Factorization (PMF) model considers that the QoS values obey a normal Gaussian distribution, and the latent factors should be learned from zero-mean Gaussian distribution [41]. Nonnegative matrix factorization (NMF) can learn the nonnegative latent factors for the users or items, but it usually deals with the implicit feedback [42–44].

However, even if matrix factorization CF algorithms have obtained remarkable success, they have difficulty in dealing with cross-domain learning tasks if the output values of the source and target domain have different ranges. Deep neural networks can easily learn general and transferable features. More and more cross-domain applications adopt deep learning technologies and have yielded remarkable performance [45–47]. However, the exploration of deep neural networks on recommender systems or QoS prediction has received relatively less attraction. Recently, some studies have proposed some deep learning-based collaborative filtering models. Two impressive technologies are Google’s Wide & Deep [48] and Microsoft’s Deep Crossing [49]. The input of these models is side information, not the interaction between the users and items. Neural Collaborative Filtering (NCF) models are designed purely for user and item interactions [50]. However, none of them are designed for cross-domain QoS prediction.

3. DTCF Model

For the cross-domain QoS prediction in the video streaming P2P network, we are given a source domain Ds={xis,xjs,rijs}i≠j with ns examples, which is characterized by the probability distribution p and a target domain Dt={xkt,xlt,rklt}k≠l with nt examples which is characterized by the probability q. Usually the size of examples in the target domain is extremely small, ns≻nt. Our work aims to build a deep neural network to learn transferable features that bridge these two domains’ discrepancy.

3.1. Model Architecture Overview

We propose a novel neural architecture, outlined in Figure 1. The source domain and the target domain share the same network architecture. The input of the model is the identifier number of the nodes. For example, if size of nodes in the P2P network is n, the ID of each node is an integer number from 1 to n. The output of the mode is the QoS value that the node xi evaluates on the node xj.

Figure 1

DTCF architecture.

Since we do not use any concrete feature for each node, we need to learn abstract features for them. Here, we use embedding layer to learn a continuous latent vector/factor u for each node. The details of designing embedding layers are described in Section 3.1.

If we get two latent vectors for xi and xj, ui and uj, one might expect that we should concatenate these two vectors and then use affine function WuiujT+b to transform the latent vectors into the input of the other hidden layer above. However, there is no interactive action between the latent factors, but only weighted summation of elements of the vectors. Some studies use the dot product of the vectors to represent the interaction, which is described as follows. (1)ui⊙uj=ui,1×uj,1,ui,2×uj,2,⋯,ui,d×uj,d

Unfortunately, it is too simple to completely represent the complex interaction between nodes. In this paper, we propose a novel interaction layer to tackle this problem, which has powerful representation capacity. We will give the design details in Section 3.2.

Above the interaction layer, we use ReLU as the hidden layer. We might need multiple ReLU layers. The ReLU activation function is as follows. (2)zo=Wai+bao=0,zo

Finally, we use a fully connected layer to generate the output. When training the model in the source domain, we use the regression loss. We then use the all the layers of the pretrained model but the last FC layer to construct the model for target domain. The weights of these layers are kept as the initialized weights of the target domain model, but the final FC layer is initialized randomly. To avoid the overadaptation problem, we use both the domain loss and regression loss to train the target domain model. We will describe how to design the domain loss in Section 3.3.

3.2. Embedding Layer

Since we can assign a unique integer number as the identifier for each node in the network, we can use a one-hot vector to represent the identifier. If we have at most n nodes in the network, the ith nodes can be expressed as follows. (3)vi=vi,1,vi,2,…,vi,nvi,r=1,if r=i, 1≤r≤n0,if r≠i, r≤l≤nOur embedding layer is defined as follows: (4)ui=Wviwhere W is a d×n matrix. Expanding the formulas Wvi, we can see the following. (5)Wvi=w11w12⋯w1mw21w22⋯w2m⋮⋮⋯⋮wk1wk2⋯wkm00⋮1⋮⋮0w1iw2i⋮wii⋮wki

Therefore, ui is the ith column of matrix W. Since the node identifier number is transformed to a one-hot vector, the result of matrix multiplication is exactly a specific latent vector for each node. This weight matrix is jointly trained with the other parameters of the whole network.

3.3. Interaction Layer

There are two inputs of the interaction layer, ui and uj. Suppose any single vector is a column vector, and concatenating the two inputs will get a longer vector. This concatenation vector will be transformed to another vector, encoding interactive information between these two inputs. The transformation process is outlined in Figure 2.

Figure 2

Interaction layer.

Suppose the output of interaction layer is a vector h, the length of which is k. The sth element of the vector h is defined as follows. (6)hc=uiTujTWsuiuj+bs

If the length of ui is d, Ws is a 2d×2d square matrix. hc is a scalar, the value of which is determined by the matrix Ws and the bias bs. If the length of h is k, we need k weight matrices and biases.

h c include all the possible interaction relationships between ui and uj. Denote UT=uiTujT=[U1,U2,…,U2k], and we can see that hc=∑wsijUiUj+bs, where wsij is the element at the ith row and the jth column in the matrix Ws

3.4. Domain Loss

The output of the last ReLU layer of the model in the source domain is denoted as hs, and the output of the last ReLU layer of the model in the target domain is denoted as ht. If we want to avoid the overadaptation problem, one possible way is to minimize the differences between the distributions p=Pr(xis,xjs,hs(xis,xjs)) and q=Pr(xit,xjt,ht(xit,xjt)), where xis=xit and xjs=xjt.

Let (X,D) be a metric space, and xis,xjs,hs(xis,xjs)∈X, xit,xjt,ht(xit,xjt)∈X. Let F be a class of functions f: X→R, and the Maximum Mean Discrepancy (MMD) is as [22] (7)MMDF,p,q=supf∈FEs~pfs-Et~qftwhere s=xis,xjs,hs(xis,xjs)∈S and t=xit,xjt,ht(xit,xjt)∈T.

Denote xis,xjs∈Sx and xit,xjt∈Tx. The biased empirical estimated of the MMD is defined as follows. (8)MMDF,S,T=supf∈F1Sx∩Tx∑si,ti∈Sx∩Txfsi-fti

If the function class F is too large, it is not practical to work with this rich function class in the finite sample setting. A rational choice of the function class is a universal reproducing kernel Hilbert space H, named universal RKHS. Therefore, we have that f(x)=f,ϕxH, where f∈H. The kernel function k(x,y) is equal to ϕ(x),ϕ(y).

Denote f=ϕ(a)=k(a,·), and we can get the mean embedding of the distribution p; that is, μp(a)=μp,k(a,·)=Exk(a,x). From [22], we can obtain the following. (9)MMDF,p,q2=supfH≤1Es~pfs-Et~qft2=supfH≤1μp-μq,fH2=μp-μqH2=Es,s′ks,s′-2Es,tks,t+Et,t′kt,t′

Similarly, the empirical estimate can be defined now as follows. (10)MMDF,S,T2=1L2∑i=1L∑j≠iLksi,sj+1L2∑i=1L∑j≠iLkti,tj-2L2∑i=1L∑j=1Lksi,tj

In this paper, we use the empirical estimate of MMD2 as the domain loss. What we need to do is to select suitable universal kernel function. Here, we adopt Gaussian kernel function, which is defined as follows. (11)kx,y=exp⁡-x-y22δ2

3.5. Algorithm

The total loss of target domain includes regression loss and MMD loss. We use the minibatch to train the model. Only a small group of examples are used to compute the loss per training iteration. Denote the set of the minibatch examples in the source domain Ms and the set of the minibatch examples in the target domain Mt. The loss function of the model in the source domain is defined as follows. (12)LMs=1Ms∑r^ij-rij2However, the loss function of the model in the target domain is defined as (13)LMt=1Mt∑r^ij-rij2+MMDF,S,T2where Sx∩Tx=Mt.

To optimize our model, we need to compute the gradient of each weight. For any weigh related to both of the regression and domain loss, its gradient is computed as follows. (14)∂LMt∂w=1Mt∑∂r^ij-rij2∂w+∂MMDF,S,T2∂w=2Mt∑r^ij-rij∂r^ij∂w+2MMDF,S,T∂MMDF,S,T∂w(15)∂MMDF,S,T∂w=-2Mt∑i≠jexp⁡-hit-hjt22δ2hit-hjs∂hit∂w-2Mt∑i≠jexp⁡-hit-hjt22δ2hit-hjs∂hjt∂w+4Mt∑exp⁡-his-hjt22δ2his-hjt∂hjt∂w

Note that his is not used for computing gradients, because we only train the target domain model after pretraining in the source domain. The training process is described as follows:(i)

We first train the model of the source domain using the loss function L(Ms). The gradient of each weigh is computed according to formula (13).

(ii)

After training, we use the weights of this model to initialize the model in the target domain except the weights of the last FC layer. The last FC layer of the model of the target domain is initialized randomly.

(iii)

While training the model of the target domain, we use the loss function L(Mt).

(iv)

For each training iteration, we randomly select examples in the dataset, and compute the gradient according to formulas (14) and (15).

(v)

We use ADAM (Adaptive Moment Estimation) as the optimizer.

4. Experimental Results 4.1. Dataset and Evaluation Metrics

We conduct our experiments on a publicly large accessible dataset, WS-DREAM dataset#1, obtained from 339 hosts doing QoS evaluation on the other 5825 hosts. There are two types of QoS properties in this dataset: response time and throughput. Here, we use the response time as the source domain, and the throughput as the target domain.

For the source domain, we randomly extract 30% (density) of the data as the source training set. For the target domain, we construct 5 different training sets with different density of 0.5%, 1%, 1.5%, 2%, 2.5%, and 3%. Consequently, the remaining data is the test set.

We adopt a common evaluation metric: Mean Absolute Error (MAE), which is widely employed to measure the QoS prediction quality. (16)MAE=∑i,j,ri,j∈QEri,j-r^i,jQE

4.2. Performance Comparison

We compare our methods with some traditional collaborative filtering methods: UPCC, IPCC, UIPCC [34], and matrix factorization (MF). UPCC is a user-based CF method, which uses PCC (Pearson Correlation Coefficient) to calculate the similarity between users. IPCC is similar to UPCC, except that it calculates the similarity between items. UIPCC combines the advantages of these two methods by balancing the proportions of them in the final results. For UPCC, IPCC, and UIPCC, different tradeoff parameters k=5,10,15,20,25,30 (the parameters of top k similar users or services) are tried, and finally we choose k=10. For MF and DTCF, the sizes of latent factors are also set to 10. For DTCF, different hidden ReLU layers and different hidden unit sizes are tried. Here, the maximum number of hidden layers is limited to 5. We tested the batch size of [128, 256, 512, 1024], the learning rate of [0.0001,0.0005, 0.001, 0.005], and the training epoch of [10, 20, 30, 40, 50, 60, 70, 80]. The bandwidth δ is set to the median pairwise distance on the source training data.

We conduct 10 experiments for each model and each sparsity level and then average the prediction accuracy values.

The results are reported in Figures 3 and 4. We can make the following observations:(i)

As the sparsity level increases, the MAEs of all the models decrease.

(ii)

Our DTCF methods outperform the other traditional collaborative filtering methods, especially when the training set is extremely sparse.

(iii)

DTCF model has more weights that need to be trained than the other models, but it gets the best performance, which indicates that the relationship between nodes is very complex, and shallow models cannot capture these structures.

Figure 3

MAE with respect to density.

Figure 4

MAE comparison for each density.

(a)

Density=0.5%

(b)

Density=1%

(c)

Density=1.5%

(d)

Density=2%

(e)

Density=2.5%

(f)

Density=3%

Although shallow models are not easily overfitting when the target domain training dataset is extremely sparse, they cannot transfer rich information from the source domain. The deep models might easily incur overfitting problem, but they can learn common latent features from the source domain. To balance this dilemma, we need to control the degree of fine-tuning the deep model. This experiment shows that MMD domain loss is an efficient way of controlling the adapting degree.

4.3. Impact of the Network Depth

The network depth usually has important impact on the prediction performance. Here, the number of neurons of each ReLU is set to 128, and we add the number of ReLU layers from 1 to 6 to see how the MAE values change. The experimental result is outlined in Figure 5, from which we can see the following:(i)

Adding more ReLU layers can get better prediction performance, but when the depth exceeds a limited value, the performance starts to become worse.

(ii)

Although adding more ReLU layers can improve the performance, it seems that enlarging the size of the training data would be more helpful.

(iii)

Sometimes, adding more layers would not improve the performance anymore, but it also does not get worse prediction performance. This indicates that deep neural network has some kind of regularization property.

Figure 5

MAE with respect to number of ReLU layers.

Actually, if the training dataset is very large, adding more layers usually does not incur overfitting problems, but for the cross-domain learning, the target domain has very little data, so the network depth needs control.

4.4. Impact of the Gaussian Kernel Bandwidth

Another hyperparameter that we need to determine is the Gaussian kernel bandwidth. By default, it is set to the median pairwise distance on the source training data. We scale the default value from 0.25 to 2.0, and the experimental result is outlined in Figure 6.(i)

Obviously, the default value is a rational choice, and scaling too small or too large would get worse prediction performance.

(ii)

If the bandwidth is too large, the kernel will be approximately equal to 1, and the nodes would look the same. We cannot propose personal recommendation for them.

(iii)

If the bandwidth is too small, the kernel will be approximately equal to 0, and the nodes cannot find similar neighbors to follow their past experiences.

Figure 6

MAE with respect to Gaussian kernel bandwidth scale.

5. Conclusion

Selecting neighbors in terms of the QoS is an effective way of providing high quality contents in video streaming P2P networks. Due to the heterogeneous network conditions, the QoS between any pairs of nodes is different. However, evaluating the QoS of all the nodes for each user is resource-consuming. An attractive way is to adopt collaborative filtering technologies, which use only a small amount of past usage experience.

Unfortunately, the video content providers might often choose different QoS properties to select neighbors. Traditional CF methods cannot solve the cross-domain QoS prediction problem. This paper proposed a novel neural style CF method based on transfer learning. We first outlined our model architecture and then introduced the details of important parts of this model. To avoid the overadaptation problem, we combined domain loss and prediction loss together to train the model of the target domain. We adopted MMD distance as our domain loss, and we also provide its principle and how to compute the gradient. Finally, we conducted our experiments on a real-world public dataset. The experimental results show that our DTCF model can outperform the other models for cross-domain QoS prediction.

Data Availability

The WS-Dream data used to support the finding of this study is owned by a third party, which is an open dataset and is deposited in “https://github.com/wsdream/wsdream-dataset".

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

This work are supported by the National Nature Science Foundation of China (No. 61602399 and No. 61502410).

Mada

B. E.

Bagaa

Taleb

Efficient Transcoding and Streaming Mechanism in Multiple Cloud Domains

Proceedings of the 2017 IEEE Global Communications Conference (GLOBECOM 2017)

December 2017

Singapore

1 6

10.1109/GLOCOM.2017.8254123

Zheng

Zhang

Gao

A Bandwidth Variation Pattern-Differentiated Rate Adaptation for HTTP Adaptive Streaming over an LTE Cellular Network

IEEE Access 2017 6 9554 9569

2-s2.0-85040070775

10.1109/ACCESS.2017.2788057

Chen

Xiao

Shu

Liu

Efficient Video Pricing and Caching in Heterogeneous Networks

IEEE Transactions on Vehicular Technology 2016 65 10 8744 8751

2-s2.0-85018910804

10.1109/TVT.2015.2511806

Cheng

Liu

Yang

Sun

A Blockchain based Truthful Incentive Mechanism for Distributed P2P Applications

IEEE Access 2018 6 27324 27335

2-s2.0-85044746411

Lin

S.-H.

Pal

Wang

B.-C.

Golubchik

On market-driven hybrid-P2P video streaming

IEEE Transactions on Multimedia 2017 19 5 984 998

2-s2.0-85018177138

10.1109/TMM.2016.2644868

Huang

Gao

Kong

An incentive scheme based on bitrate adaptation for cloud-assisted P2P video-on-demand streaming systems

Proceedings of the 2018 IEEE 3rd International Conference on Cloud Computing and Big Data Analysis (ICCCBDA)

April 2018

Chengdu

404 408

10.1109/ICCCBDA.2018.8386550

Junchen

Shijie

Sekar

Zhang

Pytheas: Enabling data-driven quality of experience optimization using group based exploration-exploitation

Proceedings of the 14th USENIX Symposium on Networked Systems Design and Implementation

2017

393 406

Wang

Zhang

Yang

Qos-Awareness variable neighbor selection for mesh-based P2P live streaming system

Proceedings of the 2013 IEEE 3rd International Conference on Information Science and Technology, ICIST 2013

March 2013

China

1197 1201

2-s2.0-84899003679

Liem

A. T.

Hwang

I.-S.

Nikoukar

Yang

C.-Z.

Ab-Rahman

M. S.

C.-H.

P2P Live-Streaming Application-Aware Architecture for QoS Enhancement in the EPON

IEEE Systems Journal 2018 12 1 648 658

2-s2.0-84961847694

10.1109/JSYST.2016.2535539

Bai

Zaman

Leung

V. C. M.

A Decentralized Trustworthy Context and QoS-Aware Service Discovery Framework for the Internet of Things

IEEE Access 2017 5 19154 19166

2-s2.0-85030695321

10.1109/ACCESS.2017.2756446

Wang

Yin

Quan

Min

Enabling collaborative edge computing for software defined vehicular networks

IEEE Network 2018 99 1 6

10.1109/MNET.2018.1700364

Yang

Guo

Liu

Steck

A survey of collaborative filtering based social recommender systems

Computer Communications 2014 41 1 10

2-s2.0-84894532448

10.1016/j.comcom.2013.06.009

Wei

Chen

Zhou

Tang

Collaborative filtering and deep learning based recommendation system for cold start items

Expert Systems with Applications 2017 69 29 39

2-s2.0-84993997491

Peng

Tian

Xiang

Wang

Pontil

Huang

Joint Semantic and Latent Attribute Modelling for Cross-Class Transfer Learning

IEEE Transactions on Pattern Analysis and Machine Intelligence 2018 40 7 1625 1638

2-s2.0-85023159886

Chang

Han

Zhong

Snijders

A. M.

Mao

J.-H.

Unsupervised Transfer Learning via Multi-Scale Convolutional Sparse Coding for Biomedical Applications

IEEE Transactions on Pattern Analysis and Machine Intelligence 2018 40 5 1182 1194

2-s2.0-85044858639

10.1109/TPAMI.2017.2656884

Guo

Ding

Han

Gao

Zero-shot learning with transferred samples

IEEE Transactions on Image Processing 2017 26 7 3277 3290

10.1109/TIP.2017.2696747

MR3653311

2-s2.0-85021715804

Wang

Zuluaga

M. A.

Interactive Medical Image Segmentation Using Deep Learning With Image-Specific Fine Tuning

IEEE Transactions on Medical Imaging 2018 37 7 1562 1573

Zhou

Lü

Lin

Yang

Sample Selected Extreme Learning Machine Based Intrusion Detection in Fog Computing and MEC

Wireless Communications and Mobile Computing 2018 2018 10

7472095

10.1155/2018/7472095

Yanai

Kawano

Food image recognition using deep convolutional network with pre-training and fine-tuning

Proceedings of the 2015 IEEE International Conference on Multimedia and Expo Workshops, ICMEW 2015

July 2015

Turin, Italy

1 6

2-s2.0-84945586249

Gangeh

M. J.

Tadayyon

Sannachi

Sadeghi-Naini

Tran

W. T.

Czarnota

G. J.

Computer Aided Theragnosis Using Quantitative Ultrasound Spectroscopy and Maximum Mean Discrepancy in Locally Advanced Breast Cancer

IEEE Transactions on Medical Imaging 2016 35 3 778 790

2-s2.0-84963718165

10.1109/TMI.2015.2495246

Yan

Ding

Wang

Zuo

Mind the class weight bias: Weighted maximum mean discrepancy for unsupervised domain adaptation

Proceedings of the 30th IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2017

July 2017

USA

945 954

2-s2.0-85043992179

Gretton

Borgwardt

K. M.

Rasch

M. J.

A kernel two-sample test

Journal of Machine Learning Research (JMLR) 2012 13 723 773

MR2913716

Online. 2018. http://www.twitch.tv/

Online. 2018. http://www.ustream.tv

Online. 2018. http://www.livestream.com/

Al-Abbasi

A. O.

Aggarwal

EdgeCache: An optimized algorithm for CDN-based over-the-top video streaming services

Proceedings of the IEEE INFOCOM 2018 - IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS)

April 2018

Honolulu, HI, USA

202 207

10.1109/INFCOMW.2018.8407016

Denis

Matias

Juergen

Service migration from cloud to multi-tier fog nodes for multimedia dissemination with QoE support

Sensors 2018 18 2

2-s2.0-85041043446

Dernbach

Taft

Kurose

Weinsberg

Diot

Ashkan

Cache content-selection policies for streaming video services

Proceedings of the 35th Annual IEEE International Conference on Computer Communications, IEEE INFOCOM 2016

April 2016

USA

2-s2.0-84983354505

Cheng

Chen

Collaborative Filtering Service Recommendation Based on a Novel Similarity Computation Method

IEEE Transactions on Services Computing 2017 10 3 352 365

2-s2.0-85027528059

10.1109/TSC.2015.2479228

Liu

Tang

Zheng

Liu

Lyu

Location-aware and personalized collaborative filtering for web service recommendation

IEEE Transactions on Services Computing 2016 10 3 686 699

10.1109/TSC.2015.2433251

Margaris

Vassilakis

Georgiadis

An integrated framework for adapting WS-BPEL scenario execution using QoS and collaborative filtering techniques

Science of Computer Programming 2015 98 707 734

2-s2.0-84937841913

10.1016/j.scico.2014.10.007

Bellogín

Castells

Cantador

Neighbor Selection and Weighting in User-Based Collaborative Filtering: A Performance Prediction Approach

ACM Transactions on the Web (TWEB) 2014 8 2 1 30

10.1145/2579993

Jia

Yang

Gao

Chen

User-based collaborative filtering for tourist attraction recommendations

Proceedings of the IEEE International Conference on Computational Intelligence and Communication Technology (CICT '15)

February 2015

22 25

10.1109/cict.2015.20

2-s2.0-84928162814

Zheng

Lyu

M. R.

King

Collaborative web service Qos prediction via neighborhood integrated matrix factorization

IEEE Transactions on Services Computing 2013 6 3 289 299

10.1109/tsc.2011.59

2-s2.0-84883778327

Zhu

Zheng

Lyu

M. R.

Online QoS Prediction for Runtime Service Adaptation via Adaptive Matrix Factorization

IEEE Transactions on Parallel and Distributed Systems 2017 28 10 2911 2924

2-s2.0-85030027575

10.1109/TPDS.2017.2700796

Zhu

Niu

Robust web service recommendation via quantile matrix factorization

Proceedings of the 2017 IEEE Conference on Computer Communications, INFOCOM 2017

May 2017

USA

2-s2.0-85034103467

Zhang

Zheng

Lyu

M. R.

Exploring latent features for memory-based QoS prediction in cloud computing

Proceedings of the 30th IEEE International Symposium on Reliable Distributed Systems (SRDS '11)

October 2011

Madrid, Spain

IEEE

1 10

10.1109/srds.2011.10

2-s2.0-83155173547

Liu

Yin

Personalized QoS prediction for web services using latent factor models

Proceedings of the 11th IEEE International Conference on Services Computing, (SCC '14)

July 2014

107 114

10.1109/SCC.2014.23

2-s2.0-84919647137

Koren

Factorization meets the neighborhood: a multifaceted collaborative filtering model

Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '08)

August 2008

New York, NY, USA

426 434

10.1145/1401890.1401944

2-s2.0-65449121157

Yin

Deng

An Extended Matrix Factorization Approach for QoS Prediction in Service Selection

Proceedings of the 2012 9th IEEE International Conference on Services Computing (SCC)

June 2012

Honolulu, HI, USA

162 169

10.1109/SCC.2012.36

Salakhutdinov

Mnih

Probabilistic Matrix Factorization

Proceedings of the Advances in Neural Information Processing Systems (NIPS’07)

2007

1257 1264

Daniel

Lee

Sebastian

Algorithms for Non-negative Matrix Factorization

Proceedings of the Advances in Neural Information Processing Systems (NIPS’00)

2000

556 562

Luo

Zhou

Xia

Zhu

An efficient non-negative matrix-factorization-based approach to collaborative filtering for recommender systems

IEEE Transactions on Industrial Informatics 2014 10 2 1273 1284

10.1109/tii.2014.2308433

2-s2.0-84900836031

Hernando

Bobadilla

Ortega

A non negative matrix factorization for collaborative filtering recommender systems based on a Bayesian probabilistic model

Knowledge-Based Systems 2016 97 188 202

2-s2.0-84955572606

10.1016/j.knosys.2015.12.018

Shin

Roth

H. R.

Gao

Nogues

Yao

Mollura

Summers

R. M.

Deep convolutional neural networks for computer-aided detection: CNN architectures, dataset characteristics and transfer learning

IEEE Transactions on Medical Imaging 2016 35 5 1285 1298

10.1109/tmi.2016.2528162

Oquab

Bottou

Laptev

Sivic

Learning and transferring mid-level image representations using convolutional neural networks

Proceedings of the 27th IEEE Conference on Computer Vision and Pattern Recognition (CVPR '14)

2015

Columbus, OH, USA

IEEE

1717 1724

10.1109/cvpr.2014.222

2-s2.0-84911449395

Chen

Dou

Cheng

Qin

Heng

Automatic fetal ultrasound standard plane detection using knowledge transferred recurrent neural networks

Proceedings of the International Conference On Medical Image Computing & Computer Assisted Intervention

2015

507 514

Cheng

H.-T.

Levent

Jeremiah

Wide & Deep Learning for Recommender Systems

Proceedings of the1st Workshop on Deep Learning for Recommender Systems

2016

7 20

Shan

Hoens

T. R.

Jiao

Wang

Mao

J. C.

Deep crossing: Web-scale modeling without manually crafted combinatorial features

Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2016

August 2016

USA

255 262

2-s2.0-84984982159

Liao

Zhang

Nie

Chua

Neural Collaborative Filtering

Proceedings of the 26th International Conference

April 2017

Perth, Australia

173 182

10.1145/3038912.3052569