A Novel Mobile Video Community Discovery Scheme Using Ontology-Based Semantical Interest Capture

Leveraging network virtualization technologies, community-based video systems rely on the measurement of common interests to define and stabilize the relationships between community members, which improves video sharing performance and the scalability of the community structure. In this paper, we propose a novel mobile Video Community discovery scheme using ontology-based semantical interest capture (VCOSI). An ontology-based semantical extension approach is proposed, which describes video content and measures video similarity according to video key word selection methods. To reduce the calculation load of video similarity, VCOSI designs a prefix-filtering-based estimation algorithm that decreases the energy consumption of mobile nodes. VCOSI further proposes a member relationship estimation method to construct scalable and resilient node communities, which improves the video sharing capacity of video systems with flexible and economical community maintenance. Extensive tests show that VCOSI obtains better performance results than other state-of-the-art solutions.


Introduction
Video streaming services are very popular multimedia applications in the Internet (e.g., video traffic now accounts for more than two-thirds of network traffic) [1]. Convenient access to video content via mobile smart devices further promotes the development of video applications with the help of mobile networking technologies and increasing wireless bandwidth [2]. The huge traffic demand places a heavy load on video systems, which require a scalable architecture and high-efficiency resource sharing to support large-scale deployment. Peer-to-Peer (P2P)/Mobile Peer-to-Peer (MP2P) technologies construct virtual networks in terms of the resource demand of (mobile) users to manage and distribute resources [3][4][5][6][7]. The continuous increase in the number of nodes in video systems not only triggers huge traffic demand for video content but also increases the maintenance cost of overlay networks due to the dynamic playback state of nodes. However, traditional P2P-based video systems make use of the request-response relationship between nodes to construct overlay networks. This fragile relationship between nodes is one of the main reasons for the frequent disconnection of logical links between them. Because the predefined relationship neglects resource supply and demand, it is difficult to achieve high-efficiency resource sharing and low-cost maintenance for overlay networks. This leads to low quality of service (QoS), such as long startup delay and low system scalability.
Inspired by the investigation of relationships between nodes in social networks, virtual community-based video systems rely on the measured common interests in video content to define the relationships between community members [8][9][10][11]. As Figure 1 shows, the whole overlay network is divided into multiple communities composed of nodes with similar interests. Similar interests not only stabilize the logical links between community members, which reduces the maintenance cost of overlay networks, but also enable efficient resource sharing, which improves the QoS of video systems. A key issue is how to accurately estimate the similarity of interests between nodes. The videos watched by users reflect user interests; namely, two users who watch the same or similar videos can be considered as having common interests. Video similarity estimation in the traditional methods only investigates the file names of videos in the historical playback traces of users. A file name can hardly reflect the whole video content, so it cannot ensure the estimation accuracy of interest similarity. The low accuracy of the estimation results leads to fragile logical links between community members, which still triggers frequent reconstruction of the community structure.
The description of video content is a key factor for the measurement accuracy of video similarity and usually includes video name, title, type, actors, director, and abstract. The expanded video information is described by short text. However, the noise words and the sparseness of valid data in short text, such as advertisement words in a video abstract, have a negative influence on the estimation accuracy. A key issue is how to build a rich semantic-based description space for video content that efficiently improves the estimation accuracy of video similarity. On the other hand, the expanded description brings more video information, which constructs a high-dimensional similarity measurement space composed of massive feature words. Video similarity calculation with high time complexity consumes a large amount of energy of mobile nodes. Another key issue is how to design a lightweight estimation method to reduce the calculation load of mobile nodes.
In this paper, we propose a novel mobile Video Community discovery scheme using ontology-based semantical interest capture (VCOSI). VCOSI employs an ontology-based semantical extension approach to describe video content and adopts three classification methods to select key words in the extended description of video content. VCOSI designs an estimation method to calculate the similarity of video content by making use of the selected key words. To reduce the calculation complexity of similarity, VCOSI further proposes a prefix-filtering-based estimation algorithm that reduces the calculation load of mobile nodes. Based on the estimation results of video similarity, VCOSI measures the closeness levels of common interests between nodes and builds a scalable and resilient community structure, which improves the video sharing capacity of video systems. Simulation results show that VCOSI achieves much better performance results than other state-of-the-art solutions.

Related Work
The traditional P2P/MP2P-based video systems mainly employ structured and unstructured overlay networks to distribute video resources. The systems based on structured overlay networks, such as tree and Chord, can achieve fast resource location but need to consume a large amount of bandwidth and computation resources to maintain overlay networks due to dynamic node state, which results in low scalability. The systems based on unstructured overlay networks do not handle real-time variation of node state, which yields high scalability. However, the flooding-based resource lookup method wastes massive network bandwidth, which leads to network congestion and high startup delay.
The community-based video systems group the nodes into multiple communities in terms of the closeness levels of the relationships between them. Each community undertakes the state maintenance of its intracommunity members; namely, the maintenance cost for the whole overlay network is distributed among multiple communities, which improves system scalability. Moreover, the tight relationship enables logical links between community members to become more stable, which further reduces the maintenance cost of the community structure. On the other hand, the community members store similar video resources and request similar video content with high probability in the future, so the video requests of members are quickly answered by other intracommunity members, which reduces lookup delay.
SocialTube measures the relationship between nodes by investigating the number of watched videos [8]. The more videos two nodes have both watched, the more similar their interests are. The nodes with similar interests form a community. The source nodes push the videos of interest to the community members. However, it is difficult to obtain accurate similarity measurements by using only the number of watched videos to estimate interest similarity. This results in a low video push success rate and fragile logical links between nodes. SPOON also relies on investigating the file name vector in historical playback traces to capture common interests between nodes [9]. The mobile nodes with similar interests are grouped into communities. The designed role assignment method distributes the maintenance cost for the community structure among multiple community members, which balances the load of mobile nodes and improves community scalability. However, interest similarity measurement based on file names can hardly ensure the estimation accuracy of common interests between nodes.
To address the low estimation accuracy of traditional methods, some common interest capture methods expand the description of video content (e.g., video name, title, type, actors, director, and abstract). The expanded description enriches video information and is denoted by short text, which increases the measurement accuracy of similarity between videos. However, the short text includes a large number of noise words, which severely influences the measurement accuracy of video similarity. The existing studies make use of semantic dictionaries, feature extension, and topic models to address the problem of noise words.
(1) Semantic Dictionaries. The methods based on semantic dictionaries, such as WordNet and HowNet, can accurately estimate the similarity of word pairs by converting short text into word pairs. Mihalcea et al. [12] considered the weighted average of the similarities among word pairs as the similarity of short text. Li et al. [13] combined semantic and sequence similarity to calculate the similarity of short text. Based on the work in [13], Islam and Inkpen [14] made use of the weighted sum of string similarity, semantic similarity, and sequence similarity to calculate the similarity of short text. However, a semantic dictionary is always incomplete due to the continuous occurrence of new words, so it cannot ensure high-accuracy similarity measurement of short text.
(2) Feature Extension. Sahami and Heilman employed a similarity kernel function to estimate short text similarity by making use of a search engine to extend the features of short text [15]. Wang et al. [16] and Yuan [17] proposed a mining algorithm based on association rules to extract the association relationships between features included in the training and testing sets, which further obtains the extended features corresponding to the words. Genc et al. built a mapping between short text and Wikipedia pages to calculate the similarity of short text [18]. Wang et al. [19] proposed a unified framework to expand short texts by making use of a convolutional neural network, which can address the sparsity and semantic sensitivity of short texts. However, the above methods still do not avoid the negative influence caused by noisy words.
(3) Topic Model. Quan et al. made use of a topic model based on Latent Dirichlet Allocation (LDA) to calculate the similarity of short text, which improves the measurement accuracy of the vector space model (VSM) [20]. Phan et al. employed a similarity estimation method for short text based on the probability distribution of document topics [21]. Zhang and Zhong collected large-scale external data to build a topic model according to LDA, which enables word topics to enrich the feature representations of short text [22]. Vo and Ock employed an LDA-based method to discover hidden topics from universal datasets (e.g., Computer Science Bibliography, the Lecture Notes in Computer Science (LNCS) book series, and Wikipedia) [23].
In addition to the above methods, there are many studies related to the similarity measurement of short text, such as deep learning based methods [24][25][26][27], the Earth Mover's Distance (EMD) based method [28], and multilevel sentence similarity calculation [29].

Video Similarity Measure Based on Semantic Feature Extension.
The noisy words in the short text corresponding to video information do not reflect real video content, which has a severely negative influence on the estimation accuracy of video similarity. Therefore, it is essential to eliminate noisy words and select key words in the short texts corresponding to video content (the key words can embody real video content). The video information mainly includes title, actor, director, and tag and is denoted by short text. For instance, if two videos have many co-occurring words in their title, actor, and director fields, they may have similar content (e.g., TV series). The tags are manually assigned labels that represent valuable information such as the category and plot themes of videos. The above video information is used to recognize the key words; namely, four features corresponding to title, actor, director, and tag are defined as features of key words. Moreover, we also consider the position and frequency of a word as features of key words. For instance, if a word has a high frequency of occurrence in the title and tag of a video, it may carry important information about the video content and is considered a key word. Therefore, the position-based and frequency-based features are also defined as features of key words. Table 1 lists the name and description of the features of key words.
The key words are dispersed in the short text, so the selection of key words is treated as a word classification problem. To accurately identify the key words, we choose three kinds of classification methods: decision tree, SVM, and KNN. Because KNN has the best accuracy rate among decision tree, SVM, and KNN in the range from 1 to 6, we use KNN based on the range from 1 to 6 to select the key words in the following experiments. Because KNN is a classical classification method, the classification process is not described in detail here.
On the other hand, if the short text corresponding to video information includes only a small number of valuable key words, the selected key words cannot reflect the whole video content. We employ an ontology-based feature extension method to enrich the key words because the vector space model does not deal with synonyms, hypernyms, and hyponyms among key words. The ontology is constructed by domain experts and includes rich semantic information based on the specific background. The text label, comments, instances, properties, and relations with other concepts are used to denote a concept in the ontology. We make use of the ontology information to extend the key words of video content. Let $K_i$ be the set of key words of a video $v_i$; namely, $K_i = \{k_1, k_2, \ldots, k_j, \ldots, k_n\}$. Let $E(k_j)$ denote the set of extended words of a key word $k_j$ based on the built ontology information, which is defined as

$$E(k_j) = F_{\mathrm{label}}(k_j) \cup F_{\mathrm{structure}}(k_j) \cup F_{\mathrm{property}}(k_j) \cup F_{\mathrm{instance}}(k_j), \tag{1}$$

where $F_{\mathrm{label}}(k_j)$ is the set of literal features and includes the key words of the text label and comments of $k_j$; $F_{\mathrm{structure}}(k_j)$ is the set of structure features and includes the key words in the ancestors of concept $k_j$; $F_{\mathrm{property}}(k_j)$ is the set of property features and includes the key words in the properties of concept $k_j$; $F_{\mathrm{instance}}(k_j)$ is the set of instance features and includes the key words in the instances of concept $k_j$. For example, a video called "angry birds" has the following four feature sets corresponding to the key word "birds": $F_{\mathrm{label}}(\text{birds})$ = {bird, warm-blooded, egg-laying, vertebrates, characterized, by, feathers, and, forelimbs, modified, as, wings}, $F_{\mathrm{structure}}(\text{birds})$ = {entity, physical, entity, object, whole, living, thing, organism, animal, chordate, vertebrate}, $F_{\mathrm{property}}(\text{birds})$ = {warm-blooded, egg-laying}, and $F_{\mathrm{instance}}(\text{birds})$ = {Red, Chuck, Bomb, Matilda, Mighty Eagle}. The ontology-based semantic description of video content thus enriches the key words and reduces the probability of erroneous judgement.
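The extension step can be illustrated with a minimal sketch. It assumes a hypothetical in-memory ontology (`TOY_ONTOLOGY`, abridged from the "birds" example) and hypothetical helper names; a real deployment would query a domain ontology instead, and keeping the original key word in the extended set is an assumption of this sketch rather than something the text specifies.

```python
# Hypothetical toy ontology: each concept maps to its four feature lists.
# The entries are abridged from the "birds" example in the text.
TOY_ONTOLOGY = {
    "birds": {
        "label": ["bird", "warm-blooded", "egg-laying", "vertebrates",
                  "feathers", "wings"],
        "structure": ["entity", "organism", "animal", "chordate", "vertebrate"],
        "property": ["warm-blooded", "egg-laying"],
        "instance": ["Red", "Chuck", "Bomb", "Matilda", "Mighty Eagle"],
    },
}

FEATURE_KINDS = ("label", "structure", "property", "instance")


def extend_keyword(keyword, ontology):
    """Union of the label, structure, property and instance feature sets."""
    words = {keyword}  # assumption: the key word itself is kept
    concept = ontology.get(keyword, {})
    for kind in FEATURE_KINDS:
        words.update(concept.get(kind, []))
    return words


def extend_video(keywords, ontology):
    """Union of the extended sets of all key words of one video."""
    extended = set()
    for keyword in keywords:
        extended |= extend_keyword(keyword, ontology)
    return extended
```

Calling `extend_video(["angry", "birds"], TOY_ONTOLOGY)` returns a set that contains the instance names such as "Chuck" in addition to the two original key words; "angry" is not in the toy ontology, so it contributes only itself.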
The extended words of all items in $K_i$ form a new set $S(v_i) = E(k_1) \cup E(k_2) \cup \cdots \cup E(k_{n-1}) \cup E(k_n)$. $S(v_i)$ can be used to measure the similarity between videos. The traditional Tversky similarity model relies on common and distinctive features between two objects to estimate similarity. However, users may focus more on common features in the process of similarity estimation. Therefore, we improve the Tversky similarity model and make use of the set $S(v_i)$ to estimate the content similarity between any two videos $v_i$ and $v_j$ according to the following equation:

$$\mathrm{Sim}(v_i, v_j) = \frac{\alpha\,|S(v_i) \cap S(v_j)|}{\alpha\,|S(v_i) \cap S(v_j)| + \beta\,|S(v_i) - S(v_j)| + (1-\beta)\,|S(v_j) - S(v_i)|}, \tag{2}$$

where $\alpha$ and $\beta$ are the weight factors added by us, which are used to adjust the influence of common and different features. For instance, the similarity value of $S(v_i)$ and $S(v_j)$ based on the Jaccard similarity measure may be very small even though $S(v_i)$ and $S(v_j)$ should be similar to some extent. Therefore, we use (2) to increase the similarity between $S(v_i)$ and $S(v_j)$ by the adjustment of $\alpha$ and $\beta$.
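A small sketch may help to see the effect of the two weight factors. It assumes that the improved Tversky measure has the form $\alpha a / (\alpha a + \beta b + (1-\beta) c)$, where $a$ is the number of common words and $b$ and $c$ are the numbers of words found in only one of the two sets (the form restated in the proof of Lemma 1); the function names and default values are illustrative.

```python
def weighted_tversky(s_i, s_j, alpha=1.5, beta=0.5):
    """Assumed form: alpha*a / (alpha*a + beta*b + (1 - beta)*c)."""
    common = len(s_i & s_j)   # a: words shared by both videos
    only_i = len(s_i - s_j)   # b: words only in the first video
    only_j = len(s_j - s_i)   # c: words only in the second video
    denom = alpha * common + beta * only_i + (1.0 - beta) * only_j
    if denom == 0:
        return 0.0            # both sets empty: define similarity as 0
    return alpha * common / denom


def jaccard(s_i, s_j):
    """Plain Jaccard similarity, for comparison."""
    union = len(s_i | s_j)
    return len(s_i & s_j) / union if union else 0.0
```

For the sets {a, b, c, d} and {c, d, e}, the Jaccard similarity is 2/5 = 0.4, whereas the weighted measure with alpha = 1.5 and beta = 0.5 gives 3/4.5, about 0.67, because the common words count more than the differing ones. With alpha = 1 and beta = 0.5 the assumed form reduces to the Dice coefficient.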

Prefix-Filtering-Based Similarity Estimation Algorithm.
The extension of key words increases the available information about video content and thereby improves the measurement accuracy of video similarity. However, the increase in the extension scale of key words results in high computation complexity due to the large number of duplicated comparisons in the process of similarity estimation. The high computation load consumes a large amount of energy of mobile nodes. The prefix filtering technique based on the Jaccard similarity in [30] can efficiently reduce computation complexity. Because the video similarity $\mathrm{Sim}(v_i, v_j)$ in (2) is different from the Jaccard similarity, we derive the relationship between the Jaccard similarity and $\mathrm{Sim}(v_i, v_j)$ and convert $\mathrm{Sim}(v_i, v_j)$ to the Jaccard similarity. Further, we design a bloom-filter-based method that is free of duplicated comparisons, which reduces the computation load by preventing duplicated comparisons.
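The general idea of prefix filtering on a Jaccard threshold can be sketched as follows. This is a generic illustration of the standard technique rather than the exact algorithm of VCOSI: the function names are hypothetical, the threshold is assumed to be a Jaccard threshold greater than 0, and an exact Python set stands in for the Bloom filter that records already compared pairs.

```python
import math
from collections import Counter, defaultdict


def jaccard(a, b):
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def prefix_filter_pairs(word_sets, threshold):
    """Return {(i, j): jaccard} for all pairs with Jaccard >= threshold.

    Assumes 0 < threshold <= 1.
    """
    # Global token order: rare tokens first, ties broken alphabetically.
    freq = Counter(tok for s in word_sets for tok in s)
    ordered = [sorted(s, key=lambda tok: (freq[tok], tok)) for s in word_sets]

    index = defaultdict(list)  # prefix token -> ids of sets indexed so far
    compared = set()           # exact stand-in for the Bloom filter
    results = {}
    for i, tokens in enumerate(ordered):
        if not tokens:
            continue
        # A set of size n needs an overlap of at least ceil(threshold * n),
        # so only its first n - ceil(threshold * n) + 1 tokens are indexed.
        required = math.ceil(threshold * len(tokens) - 1e-9)
        prefix_len = len(tokens) - required + 1
        for tok in tokens[:prefix_len]:
            for j in index[tok]:
                if (j, i) in compared:
                    continue   # skip a duplicated comparison
                compared.add((j, i))
                sim = jaccard(word_sets[j], word_sets[i])
                if sim >= threshold:
                    results[(j, i)] = sim
            index[tok].append(i)
    return results
```

Only pairs that share at least one prefix token are verified, so sets that cannot reach the threshold are never compared. A real Bloom filter would trade the exactness of `compared` for lower memory use, at the cost of occasionally skipping a pair that was never compared.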

Prefix Filtering Based on Derived Jaccard Similarity Threshold
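Lemma 1. Let $J(v_i, v_j)$ be the Jaccard similarity between $v_i$ and $v_j$, and let $t$ be the similarity threshold for $\mathrm{Sim}(v_i, v_j)$. If the similarity $J(v_i, v_j)$ between $v_i$ and $v_j$ is smaller than the Jaccard similarity threshold derived from $t$, $v_i$ and $v_j$ are not similar.
Proof. For simplicity, we use $a$, $b$, and $c$ to represent $|S(v_i) \cap S(v_j)|$, $|S(v_i) - S(v_j)|$, and $|S(v_j) - S(v_i)|$, respectively. The similarity between $v_i$ and $v_j$ can be redescribed as $\mathrm{Sim}(v_i, v_j) = \alpha a/(\alpha a + \beta b + (1 - \beta) c)$.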
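One way to obtain such a derived threshold can be worked out as follows. The derivation assumes $\mathrm{Sim}(v_i, v_j) = \alpha a/(\alpha a + \beta b + (1-\beta)c)$ with $\alpha > 0$, $0 < \beta < 1$, and $0 < t \le 1$, where $a$, $b$, and $c$ are the sizes of the common part and the two one-sided differences; it is an illustrative bound under these assumptions and may differ from the exact expression used in Lemma 1.

```latex
\begin{align*}
\mathrm{Sim}(v_i, v_j) \ge t
  &\iff \alpha a\,(1 - t) \ge t\,\bigl(\beta b + (1-\beta) c\bigr) \\
  &\implies \alpha a\,(1 - t) \ge t\,m\,(b + c),
      \qquad m = \min(\beta,\, 1-\beta), \\
  &\implies b + c \le \frac{\alpha\,(1 - t)}{t\,m}\; a, \\
  &\implies J(v_i, v_j) = \frac{a}{a + b + c}
      \ge \frac{t\,m}{t\,m + \alpha\,(1 - t)} .
\end{align*}
```

Under these assumptions, any pair whose Jaccard similarity is below $t\,m/(t\,m + \alpha(1-t))$ cannot reach $\mathrm{Sim}(v_i, v_j) \ge t$ and can be discarded by a Jaccard-based prefix filter. For example, with $\alpha = 1.5$, $\beta = 0.5$, and $t = 0.8$, the bound is $0.4/0.7 \approx 0.571$.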

Construction and Maintenance of Video Communities.
The mobile nodes that join the system make use of the similarity of watched videos to calculate the similarity levels of interests. Let $VT(n_i) = \{(v_a, t_a), \ldots, (v_h, t_h)\}$ and $VT(n_j) = \{(v_b, t_b), \ldots, (v_k, t_k)\}$ denote the historical playback traces of nodes $n_i$ and $n_j$, respectively, where $v_a$ and $t_a$ are the video ID and the time of playing video $v_a$ for $n_i$. $n_i$ and $n_j$ exchange $VT(n_i)$ and $VT(n_j)$ with each other and calculate the similarity of $VT(n_i)$ and $VT(n_j)$. For instance, $n_i$ calculates the similarity value $\mathrm{Sim}(v_a, v_b)$ between $v_a$ and $v_b$ and the mean value $\bar{p}_{ab}$ of the playback time ratios $p_a = t_a/l_a$ and $p_b = t_b/l_b$, namely, the two-tuple $st_{ab} = (\mathrm{Sim}(v_a, v_b), \bar{p}_{ab})$. $t_a$ and $l_a$ are the watched time and length of $v_a$, respectively; $t_b$ and $l_b$ are the watched time and length of $v_b$, respectively. $n_i$ can obtain a two-tuple set $ST$ of all videos in $VT(n_i)$ and $VT(n_j)$. We use the Least Square Method (LSM) [31] to estimate the correlation coefficient $r_{ij}$ of the items in $ST$ according to the following equation:

$$r_{ij} = \frac{\sum_{k=1}^{n} (x_k - \bar{x})(y_k - \bar{y})}{\sqrt{\sum_{k=1}^{n} (x_k - \bar{x})^2}\,\sqrt{\sum_{k=1}^{n} (y_k - \bar{y})^2}},$$

where $n$ is the number of items in $ST$; $x_k$ and $y_k$ are the video similarity value and average playback time ratio of the $k$th item in $ST$, respectively; $\bar{x}$ and $\bar{y}$ are the mean values of similarity and playback time ratio among all videos in $VT(n_i)$ and $VT(n_j)$, respectively. $r_{ij}$ is considered as the interest similarity value between $n_i$ and $n_j$. If $r_{ij}$ is larger than the interest similarity threshold, $n_i$ and $n_j$ have similar interests. We introduce the two discovery methods of community members, as follows.
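The interest similarity computation can be sketched as follows, assuming that the correlation coefficient is the standard sample correlation between video similarity values and mean playback time ratios. The trace layout (a dictionary from video ID to watched time), the `lengths` table, and the `video_sim` callback are hypothetical choices made for this sketch.

```python
import math


def playback_pairs(trace_i, trace_j, lengths, video_sim):
    """Build (similarity, mean playback ratio) tuples for all video pairs.

    trace_i, trace_j: dict video_id -> watched time in seconds
    lengths:          dict video_id -> video length in seconds
    video_sim:        callable (video_id, video_id) -> similarity in [0, 1]
    """
    pairs = []
    for va, ta in trace_i.items():
        for vb, tb in trace_j.items():
            mean_ratio = (ta / lengths[va] + tb / lengths[vb]) / 2.0
            pairs.append((video_sim(va, vb), mean_ratio))
    return pairs


def interest_similarity(pairs):
    """Sample correlation coefficient of the (x, y) tuples in `pairs`."""
    n = len(pairs)
    if n < 2:
        return 0.0  # assumption: too few items means no evidence
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    var_y = sum((y - mean_y) ** 2 for _, y in pairs)
    if var_x == 0 or var_y == 0:
        return 0.0  # assumption: constant values carry no correlation
    return cov / math.sqrt(var_x * var_y)
```

A value close to 1 means that the more similar two videos are, the longer both nodes watched them, which is the situation interpreted in the text as similar interests.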
(1) The nodes make use of message exchange to find nodes with similar interests among their one-hop neighbor nodes.
For instance, $n_i$ broadcasts messages including its own playback trace to all one-hop neighbor nodes. If the neighbor nodes of $n_i$ have not joined any community, they estimate the interest similarity value with $n_i$. If they have similar interests, they return acknowledgment messages to $n_i$. $n_i$ and the neighbor nodes with similar interests form a community, or the neighbor nodes join the community of $n_i$.
(2) If $n_i$ sends a request message to a supplier $n_j$ of a video $v_a$ or receives a request message from $n_j$, where the message includes the playback trace of $n_j$, $n_i$ calculates the interest similarity value with $n_j$. If $n_i$ and $n_j$ have similar interests and $n_j$ has not joined any community, $n_i$ invites $n_j$ to join the community of $n_i$ by message exchange.
If the target node $n_j$ invited by $n_i$ has already joined another community, $n_j$ decides whether to join the community of $n_i$. Let $r_{ij}$ denote the interest similarity value between $n_i$ and $n_j$, and let $r_{jk}$ denote the interest similarity value between $n_j$ and a corresponding node $n_k$ in its current community. If $r_{jk} > r_{ij}$, $n_j$ stays in the current community; otherwise, if $r_{jk} < r_{ij}$, $n_j$ quits the current community and joins the community of $n_i$. The movement of nodes between communities leads to a dynamic community structure and increases the load of community maintenance, but it can also be considered a continuous optimization process of the community structure. In other words, the nodes always seek the community with the most similar interests.
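The decision made by an invited node can be summarized in a short sketch. The function name, the argument layout, and the tie-breaking rule (stay when both similarity values are equal) are assumptions of this sketch; the text only specifies the two strict inequalities.

```python
def decide_community(current, r_current, target, r_inviter, threshold):
    """Return the community an invited node belongs to after an invitation.

    current:   the node's current community, or None if it has none
    r_current: interest similarity with a member of the current community
    target:    the inviter's community
    r_inviter: interest similarity with the inviter
    threshold: minimum value for two nodes to count as similar
    """
    if r_inviter <= threshold:
        return current   # interests not similar: invitation is ignored
    if current is None:
        return target    # no community yet: accept the invitation
    if r_inviter > r_current:
        return target    # the inviter's community fits better
    return current       # otherwise stay (ties stay, by assumption)
```

For example, `decide_community("c1", 0.6, "c2", 0.8, 0.5)` returns "c2", whereas swapping the two similarity values keeps the node in "c1".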
Apart from the arrival and departure of nodes, the community member role assignment and the community structure maintenance are very important for video sharing performance. Because the community structure varies with the arrival and departure of nodes, the community members need to maintain the community structure, and the maintenance load is assigned to specific members. The real-time and random variation of the community structure brings a huge maintenance load. To ensure community scalability, the maintenance load should be distributed among multiple members. Because the members join communities by inviting each other, there are logical links among members. Therefore, the community members form an undirected and connected graph $G = (V, E)$. By collecting the video lookup-related communication paths between nodes in $G$, we can find the member $n_h$ that occurs most frequently in all communication paths of a community. The high occurrence frequency in communication paths means that $n_h$ participates in most of the message forwarding of video lookup. The neighbor nodes of $n_h$ in $G$ become the broker nodes of the community and $n_h$ becomes the head node of all broker nodes.
Because all paths in $G$ include $n_h$, the paths are divided into multiple subpaths that do not contain $n_h$. The neighbor nodes of $n_h$ are responsible for maintaining the state of the other members in the paths that include them. $n_h$ acts as the community interface to contact other communities. If a neighbor node $n_b$ of $n_h$ quits the system or the current community, $n_h$ reselects a neighbor node of $n_b$ as the broker member. If $n_h$ leaves the system or the current community, the member that has the highest occurrence frequency among all neighbor nodes of $n_h$ becomes the new head node. All broker nodes exchange the information of community members periodically.
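The role assignment can be sketched with a simple frequency count over the collected lookup paths. The path and adjacency representations, the function names, and the alphabetical tie-breaking are illustrative assumptions.

```python
from collections import Counter


def path_frequency(paths):
    """Count in how many lookup paths each member occurs."""
    return Counter(node for path in paths for node in set(path))


def select_head_and_brokers(paths, adjacency):
    """Head: the most frequent member; brokers: the head's neighbors."""
    freq = path_frequency(paths)
    # sorted() makes ties deterministic (alphabetically first node wins).
    head = max(sorted(freq), key=lambda node: freq[node])
    return head, sorted(adjacency[head])


def replace_head(old_head, paths, adjacency):
    """New head: the most frequent node among the old head's neighbors."""
    freq = path_frequency(paths)
    candidates = sorted(adjacency[old_head])
    if not candidates:
        return None
    return max(candidates, key=lambda node: freq[node])
```

With the paths `[["a", "h", "b"], ["c", "h", "d"], ["a", "h"]]` and "h" adjacent to "a", "b", "c", and "d", node "h" occurs in all three paths and becomes the head, its four neighbors become brokers, and "a" would take over if "h" left.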
When a node $n_i$ in a community requests a video $v_a$, it sends a request message to its broker node $n_b$. If $n_b$ is aware of a supplier that caches $v_a$, $n_b$ directly forwards the request message to the supplier. Otherwise, if $n_b$ does not store information about suppliers that cache $v_a$, $n_b$ forwards the request message to the head node $n_h$. $n_h$ makes use of the maintained interface information between communities to help $n_i$ search for $v_a$. If $n_i$ quits the system or joins another community, $n_i$ needs to send a message to inform $n_b$. $n_b$ then removes the information of $n_i$ from its local member list. If a mobile node or a member of another community joins the current community, the inviter sends a message containing the information of the new member to the corresponding broker node. The latter updates its local member list.
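The two-level lookup described above can be sketched as a pair of table lookups. The tables (`broker_table`, `head_table`) and the return convention are hypothetical; in the actual scheme the head node contacts other communities instead of reading a local dictionary.

```python
def route_request(video_id, broker_table, head_table):
    """Return (handler, supplier) for a video request reaching a broker.

    broker_table: dict video_id -> supplier known to the broker
    head_table:   dict video_id -> supplier known via the head node
    """
    supplier = broker_table.get(video_id)
    if supplier is not None:
        return ("broker", supplier)   # broker forwards directly
    # Broker has no supplier information: escalate to the head node.
    return ("head", head_table.get(video_id))


def remove_member(member_list, node):
    """Broker-side update when a member leaves or changes community."""
    return [m for m in member_list if m != node]
```

A request for a video known to the broker is answered with `("broker", supplier)`; any other request is handed to the head node, whose answer may be `None` if no supplier is known.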

Testing Scenarios of Video Similarity.
The common interest of users is the key factor for the closeness between community members, which determines the stability of the community structure. We first compare the measurement performance of video similarity between the proposed ontology-based semantical estimation methods, Semantic Feature Extension based Similarity Measure (SFEbSM) and Semantic Feature Extension based Similarity Measure using Key Word Selection (SFEbSMKS), and the two classical methods, Jaccard Similarity Measure using FileName only (JVSMfn) and Jaccard Similarity Measure using all features (JVSMaf).
Because there is no existing benchmark for video similarity estimation, we construct an annotated dataset composed of 4000 videos selected from the Internet. The attributes of the videos include video title, video description, video type, tags, directors, and actors. We invite 100 student volunteers to estimate the similarity value for 5000 pairs of videos according to the video attributes. The similarity value can be any real number with two decimal places between 0 and 1. Table 3 shows a part of the volunteers' evaluation results for the 5000 video pairs.

Testing Scenarios of Video Sharing.
We make use of the estimation results of video similarity to build the communities and further compare the video sharing performance of VCOSI with a state-of-the-art solution, SPOON [9]. VCOSI and SPOON were modeled and implemented in NS-2. A total of 500 mobile nodes are deployed in a wireless mobile network whose area is set to 1000 × 1000 m². The mobile speed of nodes is in the range [1, 20] m/s. The simulation time is set to 500 s. The signal range of mobile nodes is set to 200 m. The transmission protocol of video data is TCP and the wireless routing protocol is DSR. The bandwidth of mobile nodes and server is set to 10 Mb/s and 20 Mb/s, respectively. The server acts as the initial provider of video resources. The sufficient bandwidth of the server not only avoids the overload caused by large-scale access due to the low resource supply capacity of the overlay network during the initial simulation but also promotes resource distribution. The mobile nodes request video and act as the core network in the built wireless mobile environment. The bandwidth settings of mobile nodes avoid severe congestion caused by large-scale requests, so that no very long congestion recovery time occurs. The transmission rate of video data is set to 128 kb/s. The two solutions employ the same network environment. Moreover, the number of video files requested by all nodes is set to 20 and the length of each file is 180 s.
We created 300 historical playback traces for 300 mobile nodes, where the historical playback traces have the video structure and different attribute values. Further, we also generated 300 playback logs for the 300 mobile nodes, where the playback logs include the ID and time of watched videos. After the nodes join the system, they play videos according to the playback logs. Once the nodes finish the playback of the videos according to the defined playback time, they continue to request new videos according to the playback logs. The remaining 200 mobile nodes do not join the system and are responsible for forwarding request messages and video data. We further describe the behaviours of the 300 mobile nodes that join the system. Before the simulation starts, we assign 100 historical playback traces to 100 mobile nodes. We make use of the proposed community construction method based on the measurement results of interest similarity to group the 100 mobile nodes into multiple communities. The members in the built communities request videos following a Poisson distribution according to the playback logs and provide video resources for the request nodes. After the beginning of the simulation, another 100 mobile nodes join the system and request videos following a Poisson distribution from simulation time t = 200 s to t = 500 s. After these 100 nodes join the system, they join the corresponding communities according to the interest similarity values. In VCOSI, the remaining 100 mobile nodes do not actively request videos or join the system. Once they receive an invitation from the community members, they join the corresponding communities and request and play videos according to the assigned playback logs. In SPOON, the remaining 100 mobile nodes are randomly assigned a time of joining the system; namely, they randomly join the system from t = 0 s to t = 500 s according to the assigned time and play videos according to the playback logs.
We define a random mobility model for all mobile nodes.The mobile nodes move from current location to target location in terms of the specific speed.After the mobile nodes arrive at the target location, the mobile nodes move to new target location in terms of new speed.The speed and initial and target location of mobile nodes are randomly assigned.
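The mobility model can be sketched as a simple random waypoint walk. The area of 1000 m and the speed range of [1, 20] m/s follow the simulation settings; the time step `dt`, the `seed` parameter, and the function name are assumptions of this sketch, and the NS-2 implementation used in the experiments may differ in detail.

```python
import math
import random


def random_waypoint(steps, area=1000.0, speed_range=(1.0, 20.0), dt=1.0, seed=0):
    """Simulate one node; return its position after each time step."""
    rng = random.Random(seed)
    x, y = rng.uniform(0, area), rng.uniform(0, area)    # initial location
    tx, ty = rng.uniform(0, area), rng.uniform(0, area)  # target location
    speed = rng.uniform(*speed_range)
    positions = [(x, y)]
    for _ in range(steps):
        dist = math.hypot(tx - x, ty - y)
        step = speed * dt
        if dist <= step:
            # Target reached: draw a new target and a new speed.
            x, y = tx, ty
            tx, ty = rng.uniform(0, area), rng.uniform(0, area)
            speed = rng.uniform(*speed_range)
        else:
            # Move towards the target along a straight line.
            x += (tx - x) / dist * step
            y += (ty - y) / dist * step
        positions.append((x, y))
    return positions
```

Each call with a different `seed` yields an independent trajectory, so 500 calls would give one trajectory per simulated node.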

Estimate Accuracy of Video Similarity.
We compare the accuracy of similarity estimation of four methods: SFEbSM, SFEbSMKS, JVSMfn, and JVSMaf. Based on the similarity ratings of video pairs given by the student volunteers, we first calculate the Pearson correlation between the volunteer ratings and the similarity values estimated by each method. We then use the calculated Pearson correlation to evaluate the accuracy of the proposed similarity measures and to compare them with the other approaches.
Figure 2 shows the correlation coefficient for different values of the weight factors $\alpha$ and $\beta$ in (2). The correlation coefficient is higher with $\alpha \in [1.0, 1.8]$ and $\beta \in [0.5, 0.8]$, which means that the common features are more important than the different features for the similarity measure methods. Moreover, when $\alpha = 1.5$ and $\beta = 0.5$, the correlation coefficient reaches the maximum value 0.919. Therefore, $\alpha$ and $\beta$ are set to 1.5 and 0.5, respectively, in the following experiments. Figure 3 shows the correlation coefficients of the four methods obtained by using WordNet. Although JVSMfn only uses the video name to estimate similarity between videos and JVSMaf investigates multiple features, JVSMaf cannot efficiently eliminate noisy words in the features. Therefore, the results of JVSMaf are lower than those of JVSMfn. SFEbSMKS and SFEbSM employ the semantic feature extension method to estimate the similarity of video content; they obtain higher similarity accuracy than JVSMaf and JVSMfn by efficiently eliminating noisy words and extracting key words. Moreover, SFEbSMKS relies on the selection of key words based on the KNN classification to eliminate noisy words more effectively than SFEbSM, so the estimation accuracy of SFEbSMKS is higher than that of SFEbSM.

Run Time.
The execution time of similarity estimation is defined as the run time, in order to compare efficiency of the proposed prefix-filtering-based algorithm (Prefix) with the brute force method (Bruteforce).
As Figure 4 shows, the two curves corresponding to Prefix and Bruteforce rise with the increase in the data size. The green curve of Prefix is lower than the blue curve of Bruteforce. The increment and peak value of Prefix are less than those of Bruteforce. Prefix makes use of the prefix filtering method to avoid a large number of repeated comparisons. In other words, Prefix eliminates more repeated video pairs than Bruteforce. The computation complexity of Prefix is lower than that of Bruteforce, so the run time of Prefix is less than that of Bruteforce.

Precision, Recall, and F-Measure.
Figure 5 shows the precision, recall, and F-measure of SFEbSMKS and JVSMaf with $\alpha = 1.5$ and $\beta = 0.5$. The precision results of SFEbSMKS and JVSMaf are 0.997 and 0.996, respectively, while the recall and F-measure of SFEbSMKS are higher than those of JVSMaf. Overall, the precision, recall, and F-measure of SFEbSMKS are better than those of JVSMaf. SFEbSMKS depends on the semantic feature extension and the key word selection to obtain better similarity measurement performance than JVSMaf.
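For reference, the three metrics can be computed as in the following sketch. It assumes that each method outputs a set of video pairs judged similar and that the volunteer annotations provide the set of truly similar pairs; how the paper thresholds the similarity values to obtain these sets is not stated, so both inputs are hypothetical.

```python
def precision_recall_f(predicted, actual):
    """Precision, recall and F-measure of a predicted set of similar pairs."""
    true_positives = len(predicted & actual)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure
```

For instance, if a method reports four pairs of which three are among five truly similar pairs, the precision is 0.75, the recall is 0.6, and the F-measure is about 0.667.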

Average Startup Delay.
We use the time span between sending request message and receiving first video data to denote the startup delay.
Figure 6 shows the mean values of the startup delay collected for VCOSI and SPOON over each time interval of 50 s. As Figure 6 shows, SPOON's green curve rises and falls twice over the whole simulation time. The green curve first increases from t = 0 s to t = 100 s and slightly falls from t = 100 s to t = 150 s. It then rises quickly from t = 150 s to t = 350 s and decreases quickly from t = 350 s to t = 500 s. The blue curve of VCOSI rises with fluctuation from t = 0 s to t = 350 s and falls quickly from t = 350 s to t = 500 s. VCOSI's blue curve is higher than that of SPOON from t = 0 s to t = 200 s, but VCOSI's results are less than those of SPOON from t = 250 s to t = 500 s.
Figure 7 includes the mean values of collected startup delay for the results of VCOSI and SPOON in the process of every 30 request nodes requesting videos.As Figure 7 shows, SPOON's green curve has fast increase with the fluctuation in the process of simulation.Although VCOSI's results also quickly rise, the increment of VCOSI's results is less than that of SPOON.
SPOON estimates the relationship between community members according to the similarity of the file names of watched videos (SPOON does not describe how the communication frequency between nodes is measured). Although a file name reflects the main content of a video, it loses information when the video content is rich. The interest similarity measurement in SPOON therefore cannot accurately capture the interest similarity between members, which leads to fragile relationships between members and low sharing performance. For instance, once members have different interests, request messages for video content are unlikely to be answered quickly by other intracommunity members. The low video lookup success rate increases the number of lookups, which increases the lookup delay. Because SPOON does not consider the mobility of members during community construction, the transmission performance of video data is easily affected by node mobility. The fast increase in the number of request nodes thus leads to a quick rise in startup delay.
VCOSI estimates the interest similarity between members by using ontology-based semantical interest capture to measure the extended video information, which yields accurate measurements of interest similarity. VCOSI thus obtains a high video lookup success rate; namely, more video requests can be answered by intracommunity members, which reduces the lookup delay. In addition, the community members use message exchange to invite their one-hop neighbor nodes to join the communities, which reduces the geographical distance between video requesters and suppliers, so the transmission delay of video data may be reduced. Both VCOSI and SPOON experience slight network congestion (from 200 s to 350 s) as the number of nodes requesting videos increases. The community members in VCOSI need to invite mobile nodes to join the communities, so VCOSI's results are higher than those of SPOON from 0 s to 100 s. However, after the mobile nodes have joined the communities, VCOSI's results are better than those of SPOON from 250 s to 500 s.

Peak Signal to Noise Ratio (PSNR).
We use the PSNR of each video stream to indicate the watched video quality according to the following equation, which is defined in [32]:

$$\mathrm{PSNR} = 20 \log_{10}\left(\frac{\mathrm{MAX\_Bitrate}}{\sqrt{(\mathrm{EXP\_Thr} - \mathrm{CRT\_Thr})^{2}}}\right), \tag{10}$$

where MAX_Bitrate is the transmission rate of video data and is set to 128 k/s; EXP_Thr and CRT_Thr are the expected and real throughput, respectively. The value of EXP_Thr of each video stream should be equal to the value of MAX_Bitrate. The value of CRT_Thr is the real throughput of the video stream received by each node.

Figure 8 shows the mean PSNR of the video streams of each node for VCOSI and SPOON with an increasing number of request nodes. As Figure 8 shows, the curves of VCOSI and SPOON both fall. The curve of VCOSI is lower than that of SPOON when the number of request nodes increases from 30 to 60. When the number of request nodes increases from 60 to 300, VCOSI's curve is higher than that of SPOON.
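A throughput-based PSNR estimate of this kind can be computed as in the sketch below. It assumes the commonly used form PSNR = 20 log10(MAX_Bitrate / sqrt((EXP_Thr - CRT_Thr)^2)), with all three quantities in the same unit; the function name and the decision to return infinity when the real throughput matches the expected one are illustrative choices.

```python
import math


def throughput_psnr(max_bitrate, exp_thr, crt_thr):
    """Estimate PSNR (dB) from the expected and real throughput.

    max_bitrate: transmission rate of the video data.
    exp_thr:     expected throughput of the stream.
    crt_thr:     throughput actually received by the node.
    All arguments must use the same unit.
    """
    # sqrt((exp_thr - crt_thr)^2) is the absolute throughput shortfall.
    shortfall = abs(exp_thr - crt_thr)
    if shortfall == 0:
        return math.inf  # no loss: the ratio is unbounded
    return 20.0 * math.log10(max_bitrate / shortfall)
```

With the expected throughput equal to the maximum bitrate, a node that receives half of the stream gets 20 log10(2), roughly 6 dB, and the estimate increases as the received throughput approaches the expected one.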
In SPOON, the relationship between community members is measured mainly through interest similarity, based on the similarity of file names. Because mobility similarity is not considered, the influence of node mobility cannot be avoided during video data transmission. The increase in the number of request nodes brings high demand for network traffic, which results in network congestion. The high packet loss rate caused by congestion severely reduces the amount of video data received by nodes; namely, the decrease in the real average throughput of each node makes SPOON's PSNR fall quickly. In VCOSI, the community members need to exchange playback traces to find potential community members with similar interests. The increasing number of request nodes, including nodes passively receiving invitations and nodes actively joining the system, brings high traffic demand. The high traffic of VCOSI also increases the number of lost packets, so the PSNR of VCOSI is lower than that of SPOON when the number of request nodes increases from 30 to 60. As the number of request nodes continues to increase, network congestion also leads to a large number of packet losses and a fast decrease in VCOSI's PSNR. However, congestion affects VCOSI less than SPOON (the curve of VCOSI's PSNR is higher than that of SPOON when the number of request nodes increases from 60 to 300). This is because the invited community members include the one-hop neighbor nodes of the inviters. The close geographical distance between inviters and invited nodes reduces the influence of node mobility and the risk of packet loss.

Maintenance Overhead.
Maintaining member state, handling request messages, and interacting between communities all consume network bandwidth. The amount of bandwidth used for these tasks is considered the maintenance overhead.
Figure 9 shows the maintenance overhead of VCOSI and SPOON over each time interval of 50 s. As Figure 9 shows, the curves of VCOSI and SPOON both rise over the whole simulation time. VCOSI's blue curve is lower than that of SPOON, and the increment of VCOSI's results gradually decreases, whereas the increment of SPOON's results keeps increasing.
SPOON relies on the coordinators and ambassadors in communities to maintain member state, handle request messages, and interact with other communities. As the number of members increases, a large amount of bandwidth is consumed to maintain the logical links between members. Moreover, the increase in the number of members also brings more request messages for video content, so the members need to consume a large amount of bandwidth to handle them. In addition, the inaccurate interest similarity measurement leads to fragile relationships between members, which severely harms the stability of the community structure; namely, community members continuously leave their current community and join new ones. The reconstruction of the community structure consumes a large amount of bandwidth. The low lookup success rate also causes repeated searches, which increases the bandwidth consumption. Therefore, SPOON needs to consume massive bandwidth to maintain the overlay network. VCOSI has a more stable community structure than SPOON because of its accurate interest similarity measurement between members. The bandwidth consumed by maintaining the community structure grows slowly as the number of members increases. The high lookup success rate reduces the number of lookup messages, which reduces the bandwidth use. VCOSI therefore has lower maintenance overhead than SPOON. Moreover, VCOSI distributes the maintenance overhead across multiple broker nodes, which avoids overloading any broker node.

Conclusion
In this paper, we propose a novel mobile Video Community discovery scheme using ontology-based semantical interest capture (VCOSI) in order to enhance the stability and scalability of overlay networks in the process of network virtualization. VCOSI proposes an ontology-based semantical extension approach, which achieves a distinct and precise description of video content and accurately measures the similarity between videos. In order to reduce the calculation load of mobile nodes, VCOSI employs a prefix-filtering-based estimation algorithm to reduce the number of comparisons between videos. VCOSI uses the similarity of watched videos to estimate the interest similarity level between nodes, construct node communities, and economically maintain the community structure. The simulation results show that VCOSI achieves lower startup delay, higher video quality, and lower maintenance overhead than SPOON.

Figure 1: Multimedia streaming services in wireless mobile networks.

Figure 7: Average startup delay against number of request nodes.

Figure 8: PSNR against number of request nodes.

Table 1: Features of key words of video content.

Table 2: Accuracy rate of different classification methods.
KNN. Table 2 lists the accuracy rates of decision tree, SVM, and KNN. KNN and SVM have higher accuracy rates than decision tree. For instance, the maximum values of KNN and SVM are 88.26% and 74.19%, respectively.

Table 3: 5000 video pairs with volunteer ratings.