ODMBP: Behavior Forwarding for Multiple Property Destinations in Mobile Social Networks

1 School of Computer Science & Technology, Nanjing University of Posts and Telecommunications, Jiangsu, Nanjing 210003, China
2 Jiangsu High Technology Research Key Laboratory for Wireless Sensor Networks, Jiangsu, Nanjing 210003, China
3 Institute of Computer Technology, Nanjing University of Posts and Telecommunications, Jiangsu, Nanjing 210003, China
4 Department of Information Technology, Nanjing General Hospital of Nanjing Military Command, Jiangsu, Nanjing 210002, China


Introduction
In recent years, the number of smartphones has increased rapidly. According to the International Data Corporation (IDC) Worldwide Quarterly Mobile Phone Tracker, vendors shipped a total of 334.4 million smartphones worldwide in the first quarter of 2015 [1]. Wireless mobile networks are evolving and integrating with many aspects of our lives: through smartphones we can conveniently read news, watch videos, listen to music, communicate with others, send and receive emails, browse and search the web, share content on the Internet, trade online, and so forth. The widespread adoption of smartphones promotes the combination of online social networks and mobile smart terminals, accelerating the development of the mobile social network (MSN) [2]. An MSN involves interactions between participants with similar interests and objectives through their mobile devices within virtual social networks.
Due to the dynamic and volatile nature of MSN, opportunistic networks operate under a completely new networking paradigm where traditional routing protocols cannot be applied [2]. Opportunistic networks are wireless mobile self-organizing networks in which the topology is extremely dynamic and unstable. Thus, in most cases, a complete link from the source to the destination might not exist at any single moment. There have been many research efforts on opportunistic forwarding. However, most of them deliver messages based on IP or device addresses, which is not effective in many interest-aware or behavior-aware MSN applications.
The unprecedented tight coupling between mobile devices and their users provides new approaches to infer users' behavior and interests from mobile devices. Mobile devices can now act as distributed behavioral sensors that capture their users' interests and enable implicit interest profiling [3]. There are many popular location based applications in MSN. For instance, location based services can help mobile users find friends who are currently in their vicinity. Another example is the Contact Recommendation Mechanism [4], which efficiently selects contacts in order to address them as a social group, so as to ease the initialization of group interactions.
The basic idea of these applications is to extract interest profiles or relationship profiles from social behavior. However, few research efforts consider the multiple behavior properties of MSN users. In fact, many MSN applications deal with objects that have multiple properties. Moreover, human beings naturally have multiple interests. Typical scenarios include the following.
(i) Sharing or disseminating information to people with similar multiple interest profiles. As an example, Bob is a student who wants to find a roommate at the same university. He also hopes that the roommate likes swimming, just as he does. Bob therefore wants to push the message to the persons who are most likely to become his roommate.
(ii) Recommendation of commodities or services with multiple properties. For example, an editorial office wants to recommend a new magazine that covers multiple topics, such as pop music, clothing, and bodybuilding. The editorial staff need to disseminate the advertisement to potential readers who are interested in most of the topics.
(iii) Recommendation of a combination of heterogeneous commodities or services. For example, a merchant wants to publicize a discount combo of films and snacks and to send the information to people who like both of them.
In the aforementioned scenarios, the message sender has a set of target interests, which can represent commodities, services, or individuals with multiple properties. As shown in Figure 1, the message sender wants to send the message to the receivers whose interests are the same as or similar to the target interests.
In this paper, we focus on extracting multiple behavior properties from the daily traces of users and exploring an Opportunistic Dissemination Protocol based on the Multiple Behavior Profile in mobile social networks. The key contributions of our work are summarized as follows.
(i) We address data dissemination in a class of ubiquitous application scenarios where the multiple properties of objects or the multiple interests of people need to be considered.
(ii) We map the multiple properties or the multiple interests to the behavior space and profile the multiple behavior properties. Moreover, we propose a correlation computing model for multiple behavior profiles based on the principle of BM25 [5].
(iii) We design an Opportunistic Dissemination Protocol based on Multiple Behavior Profile (ODMBP) in mobile social networks.
(iv) Extensive simulations show that the proposed multiple behavior profiles and correlation computing model are correct and efficient. Compared to other classical routing protocols, ODMBP achieves a high delivery ratio and low delay in scenarios of multiple property data dissemination.
The remainder of this paper is organized as follows. Section 2 presents the challenges and the rationale of the designed protocol. Section 3 introduces the multiple behavior profile and the correlation computing model. We present our opportunistic dissemination protocol in Section 4. The performance evaluation is presented in Section 5. We review the related work in Section 6. We conclude the paper in Section 7.

Challenges and Rationale
There are several challenges in designing an opportunistic dissemination protocol for multiple property objects. First, we need to give a computable expression of interests, from which the multiple properties can be obtained. Second, for multiple properties, we need to consider the number of matched properties; that is, the designed protocol should match as many of the appointed properties as possible. Thus, straightforwardly summing the similarity value of each property might be inefficient. Moreover, due to the energy-limited devices and the intermittent links of opportunistic networks, the designed protocol should be distributed and should have low computation complexity, low overhead, and high scalability.
As can be seen in Figure 2, many interests are closely related to the individual's daily trace and can be represented by specific locations in the trace. It is shown in [6] that social relationships can explain about 10% to 30% of all human movements, based on an analysis of different kinds of location datasets. On the other hand, a large body of research has demonstrated that people show striking persistence in their mobility profiles. For example, in [7], the authors state that the similarity of the mobility profile of a given user to its future profile is high, above 0.75 for eight days, and remains above 0.6 for five weeks. These observations demonstrate that the mobility profile is indeed an intrinsic property and a valid representation of the user, even if only a short history of the mobility profile is used. Therefore, in this work, we assume that locations can represent the user's interests; moreover, the longer the time spent at a location, the stronger the corresponding interest.
The basic idea of ODMBP is to map the interest space to the behavior space and to extract the similarity between users' multiple behavior profiles and the target profile. ODMBP uses the locations and the time spent at those locations to reflect users' preferences. The multiple behavior profile of each user is extracted from the quantized behavior space. Further, a reasonable correlation computing model is applied to calculate the correlation metric. The designed opportunistic dissemination protocol then takes forwarding based on the correlation metric as its basic principle.

Multiple Behavior Profile and Correlation Computing Model
In this section, we introduce the multiple behavior profile and the correlation computing model. The behavior profile should reflect the multiple behavior properties that correspond to the multiple interests. The correlation computing model should quantize the correlation for each user and should match as many behaviors as possible in the set of appointed behaviors.
Note that $t_{ij}$ is a cumulative time based on the current trace of user $i$, and its value changes as time goes on. The time that a user spends at a specific location can be measured in different ways. A widely used method is to sense the location continuously through the GPS sensors that are universally integrated in smartphones. Alternatively, the connection logs of WiFi access points or switches at a specific location can also provide the time that the user spent there. This work does not require dedicated persistent sensing, so the energy consumption can be very low.
The user multiple behavior profiles can be expressed as UMBP = (UMBP$_1$, UMBP$_2$, ..., UMBP$_n$). It can be viewed as an $n \times m$ behavior matrix with elements $t_{ij}$. An example of UMBP is given in Figure 3. In most cases, the behavior matrix is sparse, since most users only stay at a small fraction of all $m$ locations. Thus, specific data structures such as a triple table can be used to reduce the space and time complexity.

There is a target multiple behavior profile TMBP = $(b_1, b_2, \ldots, b_m)$ for each specific data dissemination application, where $b_j = 1$, $1 \le j \le m$, if the message sender hopes that the receivers have the behavior property associated with location $j$; otherwise, $b_j = 0$. We assume that TMBP, which can be obtained by mapping the interests to the corresponding locations, is known in advance. We further denote the number of behavior properties in TMBP as $r = \sum_{j=1}^{m} b_j$.
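The two profiles can be sketched as follows. This is only an illustration: it assumes Python dictionaries as the sparse structure in place of a triple table, and the names `UMBP`, `add_stay`, `tmbp_vector`, and all sample users, locations, and times are hypothetical.

```python
from typing import Dict, List

# Sparse user multiple behavior profiles: user -> {location index: cumulative seconds}.
# Only visited locations are stored, so the full n x m matrix is never materialized.
UMBP = Dict[int, Dict[int, float]]


def add_stay(profiles: UMBP, user: int, location: int, seconds: float) -> None:
    """Accumulate the time a user spent at a location."""
    row = profiles.setdefault(user, {})
    row[location] = row.get(location, 0.0) + seconds


def tmbp_vector(target_locations: List[int], m: int) -> List[int]:
    """Binary target profile: entry j is 1 if location j is a target behavior."""
    wanted = set(target_locations)
    return [1 if j in wanted else 0 for j in range(m)]


# Hypothetical sample data (not taken from any dataset).
profiles: UMBP = {}
add_stay(profiles, user=0, location=2, seconds=600.0)
add_stay(profiles, user=0, location=2, seconds=300.0)
add_stay(profiles, user=1, location=4, seconds=120.0)

tmbp = tmbp_vector([2, 4], m=6)
r = sum(tmbp)  # number of behavior properties in the target profile
```

Storing one dictionary per user keeps lookups by location cheap while using memory proportional to the number of visited locations only.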

Correlation Computing Model
To find the potential receivers, we need a computing model to calculate the correlation between the target multiple behavior profile and user $i$'s multiple behavior profile, $1 \le i \le n$. The correlation is quantized by the metric $\mathrm{Score}_i(\mathrm{TMBP}, \mathrm{UMBP})$. We use the principle of the ranking function BM25 [8] to calculate this metric. BM25 builds on the Robertson-Sparck-Jones (RSJ) probability model [9] and is a ranking function used by search engines to rank matching documents according to their relevance to a given search query. So far, it has been among the most successful models for calculating such correlation [10][11][12].

We first define the behavior factor $\mathrm{BF}_{ij}$ of user $i$ to location $j$ in TMBP, where $K$ is an empirical parameter that represents the importance of the behavior factor in $\mathrm{Score}_i(\mathrm{TMBP}, \mathrm{UMBP})$. The behavior factor $\mathrm{BF}_{ij}$ measures the total time user $i$ spent at location $j$ in TMBP and provides a basic correlation evaluation. However, it might not meet the requirement of matching as many behaviors as possible in TMBP, since the behavior factor does not consider how the distribution of users differs across locations. In fact, there might be some locations where few people stay in general. Thus, the behavior factor, which only considers the time the user spent at the location, might lose sight of these sparsely populated locations. To balance the sparsely populated locations, we introduce $W_j$, the weight of location $j$:

$$W_j = \log \frac{n - n_j + 0.5}{n_j + 0.5},$$

where $n_j$, obtained by summing over all $n$ users, is the number of users at location $j$. Note that the greater the value of $n_j$ is, the smaller the value of $W_j$ is. The weight of location $j$ reflects how the distribution of users differs across locations and can promote the importance of sparsely populated locations in TMBP.
Now we can give the ultimate formula to calculate the correlation metric:

$$\mathrm{Score}_i(\mathrm{TMBP}, \mathrm{UMBP}) = \sum_{j \in \mathrm{TMBP}} W_j \times \mathrm{BF}_{ij}. \tag{4}$$

Note that the Cosine similarity [13,14] is another method to compute this metric. The Cosine similarity is widely used for computing the similarity of text, and it is not difficult to apply it to our user multiple behavior profiles. However, the Cosine similarity does not consider how the distribution of users differs across locations. Further analyses and evaluation are given in Section 5.
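A minimal sketch of the score computation follows. The location weight uses the log form given in the text. The closed form of the behavior factor is an assumption: the sketch borrows the standard BM25 term-frequency saturation $t(K+1)/(t+K)$ without length normalization, which may differ from the exact definition used by ODMBP. The function names and the unit of time are hypothetical.

```python
import math
from typing import Dict, List


def behavior_factor(t: float, K: float) -> float:
    """ASSUMED form: BM25-style saturation of the time t spent at a location."""
    if t <= 0:
        return 0.0
    return t * (K + 1.0) / (t + K)


def location_weight(n: int, n_j: int) -> float:
    """Weight of a location visited by n_j of the n users."""
    return math.log((n - n_j + 0.5) / (n_j + 0.5))


def score(tmbp: List[int], times: Dict[int, float], n: int,
          users_at: Dict[int, int], K: float = 1.75) -> float:
    """Sum of weight x behavior factor over the locations set in the target profile."""
    total = 0.0
    for j, wanted in enumerate(tmbp):
        if wanted:
            w = location_weight(n, users_at.get(j, 0))
            total += w * behavior_factor(times.get(j, 0.0), K)
    return total
```

Under this assumed form the behavior factor grows with the time spent but never exceeds $K + 1$, so a single heavily visited location cannot dominate the sum.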

Opportunistic Dissemination Protocol Design
In this section, we design an opportunistic dissemination protocol for services with multiple property objects. According to the small world principle [15], people exhibit a high clustering property, and users with similar behavior properties have a high probability of encountering each other. ODMBP disseminates messages based on users' multiple behavior profiles and the corresponding correlation metric.
The UMBP of each user changes as time elapses, and it should be updated in a distributed way. In ODMBP, each user $i$ stores UMBP in his mobile device. UMBP$_i$ can be updated by the device itself through the position sensor or the network connection log, while UMBP$_k$, $k \ne i$, is updated when user $i$ encounters user $k$.
As shown in Algorithm 1, ODMBP consists of three stages: user initialization, gradient ascent, and group spread. In the user initialization stage, for each encountered user $k$, the message sender $i$ matches UMBP$_k$ against the unmatched target multiple behavior profile TMBP$_i$. If there is at least one matched location $j$, that is, location $j$ is nonzero in both UMBP$_k$ and TMBP$_i$, the message sender $i$ sends the message to user $k$. Once all locations in the target multiple behavior profile are matched, that is, TMBP$_i$ = 0, the message sender $i$ deletes the message. In this way, the user initialization stage can parallelize the dissemination process and decrease the delay efficiently. Then, in the gradient ascent stage, the message holder forwards the message to the users with a higher correlation score. The gradient ascent stage is derived from the fact that the multiple behavior profiles of the users with a higher correlation score are more similar to those of the target receivers.

Algorithm 1: ODMBP
    TMBP_i ← TMBP;
    foreach k encountered do
        update UMBP for i and k;
        if i is a message sender then
            if TMBP_i ≠ 0 then    // Stage 1: User Initialization
                foreach j ∈ {1, 2, ..., m} do
                    if UMBP_k[j] ≠ 0 and TMBP_i[j] ≠ 0 then
                        send message to k;
                        TMBP_i[j] ← 0;
                        break;
            else
                delete message in i;
        ...
In the group spread stage, if the correlation score of the message holder is higher than the threshold $\delta$, where $\delta$ is a parameter of ODMBP, the message holder copies the message to the users with a higher correlation score. This means ODMBP considers every user $i$ satisfying $\mathrm{Score}_i(\mathrm{TMBP}, \mathrm{UMBP}) \ge \delta$ as a receiver.
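The per-encounter decisions of the three stages can be sketched as below. This is an interpretation of the prose description rather than the authors' implementation: the function names, the action labels (`"send"`, `"forward"`, `"copy"`, `"keep"`, `"delete"`), and the choice of `>=` for the threshold test are assumptions made for illustration.

```python
from typing import Dict, List


def sender_step(unmatched: List[int], peer_times: Dict[int, float]) -> str:
    """Stage 1 (user initialization), run by the message sender on an encounter.

    `unmatched` is the sender's working copy of the target profile and is
    modified in place: a matched location is cleared once the message is sent.
    """
    if not any(unmatched):
        return "delete"  # every target location has already been matched
    for j, wanted in enumerate(unmatched):
        if wanted and peer_times.get(j, 0.0) > 0:
            unmatched[j] = 0
            return "send"
    return "keep"


def holder_step(holder_score: float, peer_score: float, delta: float) -> str:
    """Stages 2 and 3, run by a message holder that is not the sender."""
    if peer_score <= holder_score:
        return "keep"
    # Group spread: a holder that already qualifies as a receiver copies the
    # message; otherwise gradient ascent hands it to the better-scoring peer.
    return "copy" if holder_score >= delta else "forward"
```

Clearing one target location per send means the sender hands out at most one copy per target location, which is what lets the first stage seed the dissemination in parallel.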

Methodology and Settings
In this section, we conduct thorough simulations to investigate the performance of ODMBP. We use the real trace dataset StudentLife [16], which contains sensor data, EMA data, survey responses, and educational data. For our simulations, we adopted a part of this dataset, named Wifi-Location, which contains the data of 49 volunteers who moved around 92 buildings at Dartmouth College within a month. Wifi-Location, which contains nearly 192,000 mobility records, acquires WiFi AP deployment information from Dartmouth Network Services and records participants' rough on-campus locations together with a unix time stamp. As an example, the record (1364359102, in (Kemeny)) indicates that a volunteer was in the building called Kemeny at unix time 1364359102. The buildings can be seen as the locations in the UMBPs and TMBP. We removed interference items from the real movement trace, such as duplicate data and invalid users. Figure 4 describes the number of locations of each user in the processed data; this number mostly falls in the interval [5, 100].
All the simulations were run on the ONE simulator [17], an opportunistic network environment simulator that provides a powerful tool for generating mobility traces and running DTN messaging simulations with different routing protocols. All the results are averaged over 1000 runs. The settings of the ONE simulator are listed in Table 1. We first merge consecutive records with the same location into a new record in order to compute their time difference. We also remove interference items such as duplicate data and invalid users. We take the final output as the external event connection data for the simulator. The numbers of hosts and locations are 49 and 92, respectively, which equal the numbers of volunteers and buildings in the Wifi-Location trace.
We use the time-location pairs to construct UMBP$_i$ for any user $i$. As the simulator time goes on, the time spent at a specific location can be obtained by calculating the time elapsed since the time stamp of user $i$'s current mobility record. In this way, we obtain UMBP$_i$ by accumulating all such elapsed time for each location. The UMBP is private information of each user and is calculated and updated dynamically with the simulator time. Moreover, users can connect with other users who are at the same location at the same time. Thus, we can construct the external event connection data dynamically for the ONE simulator based on the above processing method. In our simulations, the behavior locations in TMBP are selected randomly among the 92 buildings.
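The accumulation of elapsed time per location can be sketched as follows. The record layout `(unix_time, location)` mirrors the example record quoted from the trace, but the function name `dwell_times`, the assumption that each stay lasts until the next record, and the building name `building-B` are hypothetical.

```python
from typing import Dict, List, Tuple


def dwell_times(records: List[Tuple[int, str]], now: int) -> Dict[str, int]:
    """Accumulate seconds per location from time-ordered (unix_time, location) records.

    ASSUMPTION: each stay lasts until the next record; the last stay lasts
    until `now`, the current simulator time. Records after `now` are ignored.
    """
    totals: Dict[str, int] = {}
    for idx, (start, location) in enumerate(records):
        end = records[idx + 1][0] if idx + 1 < len(records) else now
        end = min(end, now)
        if end > start:
            totals[location] = totals.get(location, 0) + (end - start)
    return totals


# Hypothetical trace of one user; only the first time stamp and "Kemeny"
# come from the example record in the text.
trace = [
    (1364359102, "Kemeny"),
    (1364359402, "Kemeny"),
    (1364360002, "building-B"),
]
profile = dwell_times(trace, now=1364360302)
```

Consecutive records at the same location simply add up, which has the same effect as merging them into one record before computing the time difference.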
In our simulations, we first reveal the impacts of the key parameters on the delivery ratio and delay of ODMBP. Moreover, we evaluate ODMBP further by comparing it with other protocols: Epidemic routing [18], Spray and Wait [19], and ODMBP-Cos.
To explore the differences between Cosine and BM25, we apply the Cosine similarity to our system model, and the correlation score function based on Cosine similarity is

$$\mathrm{CosSim}_i(\mathrm{TMBP}, \mathrm{UMBP}) = \frac{\mathrm{TMBP} \cdot \mathrm{UMBP}_i}{\|\mathrm{TMBP}\| \times \|\mathrm{UMBP}_i\|},$$

where $\cdot$ is the dot product and $\|x\|$ is the Euclidean norm of $x$; that is, $\sqrt{x_1^2 + x_2^2 + \cdots + x_m^2}$. We substitute the correlation score function with CosSim(TMBP, UMBP) in stage two and stage three of Algorithm 1, respectively. We call the protocol using Cosine similarity ODMBP-Cos.
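The Cosine-based score can be sketched in a few lines. The function name `cos_sim` is hypothetical, and returning 0 for an all-zero profile is an assumption added to avoid division by zero; the text does not specify that case.

```python
import math
from typing import Sequence


def cos_sim(tmbp: Sequence[float], umbp_i: Sequence[float]) -> float:
    """Cosine of the angle between the target profile and one user's profile."""
    dot = sum(b * t for b, t in zip(tmbp, umbp_i))
    norm = math.sqrt(sum(b * b for b in tmbp)) * math.sqrt(sum(t * t for t in umbp_i))
    # ASSUMPTION: an all-zero profile is treated as having no correlation.
    return dot / norm if norm > 0 else 0.0
```

Because every location contributes with the same weight, this score depends only on the time pattern of the user and not on how many other users visit each location.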

Revealing the Impacts of the Key Parameters
There are three key parameters: the empirical parameter $K$, the number of behavior properties in the target multiple behavior profile $r$, and the threshold $\delta$. We vary each of them in turn to explore its impact.

Impact of $K$
Based on our correlation computing model, $K$ represents the importance of the behavior factor in the ultimate score metric. When the BM25 model is used in searching, $K$ usually takes the value 1.2 based on past experience. However, this setting might not be applicable in our multiple behavior dissemination scenarios. To reveal the impact of $K$ on ODMBP, we measure the delivery ratio and delay of ODMBP with different values of $K$ when setting  = 5. As shown in Figure 5, ODMBP gets the best delivery ratio and delay when $K = 1.75$. Based on this observation, we fix $K = 1.75$ in the following simulations. However, the best setting of $K$ may be closely related to the real dataset adopted.

Impact of $\delta$
The threshold $\delta$ is the criterion for judging whether a user is a receiver. It is also the trigger for ODMBP to enter the group spread stage. We measure the performance of ODMBP with different $\delta$. Figure 6 shows three groups of results corresponding to $r = 2$, $r = 6$, and $r = 10$, respectively. When the value of $\delta$ increases, the delivery ratio decreases drastically for all settings of $r$. This is because the number of receivers decreases when the threshold increases. Accordingly, it takes more time to find the receivers, and the delay increases.

Impact of $r$
The number of behavior properties in the target multiple behavior profile, $r$, which is provided in advance, indicates the comprehensiveness of commodities or services, or the versatility of people. We cannot adjust the value of $r$ to improve the performance of ODMBP; however, we can evaluate the scalability of the designed protocol by observing the impact of $r$ on ODMBP. We can see from Figure 7 that the curves of delivery ratio are not monotonic. Based on formula (4), $\mathrm{Score}_i(\mathrm{TMBP}, \mathrm{UMBP}) = \sum_{j \in \mathrm{TMBP}} W_j \times \mathrm{BF}_{ij}$; thus, the score is a sum over all behavior locations in TMBP. Note that $W_j = \log((n - n_j + 0.5)/(n_j + 0.5))$, and the value of $W_j$ is negative if $n_j > n/2$. Thus, the score might decrease for large values of $r$. As a result, the number of receivers would decrease. As can be seen from Figure 6, ODMBP achieves the best performance in terms of delivery ratio and delay when $r = 6$ among all values of $r$ measured in our simulations.
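As a worked illustration of the sign change, take the $n = 49$ users of the simulations, so that $n/2 = 24.5$. The occupancy values 24 and 25 below are chosen only for illustration and are not taken from the trace.

```latex
\[
W_j\big|_{n_j = 24} = \log\frac{49 - 24 + 0.5}{24 + 0.5} = \log\frac{25.5}{24.5} > 0,
\qquad
W_j\big|_{n_j = 25} = \log\frac{49 - 25 + 0.5}{25 + 0.5} = \log\frac{24.5}{25.5} < 0 .
\]
```

A target location visited by more than half of the users therefore contributes a negative term, which is how adding locations to TMBP can lower a user's score.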

Comparison with Other Protocols
We compare ODMBP with two classical routing protocols for opportunistic networks, Epidemic routing and Spray and Wait. In the Epidemic routing protocol, the message is delivered to each encountered node that does not yet have the same message. The Spray and Wait routing protocol consists of two phases: spray and wait. The message copies are forwarded to a fixed number of different nodes in the spray phase, and then direct transmission is performed in the wait phase. We set the number of copies to 6 and apply binary mode in the spray phase of the Spray and Wait routing protocol. We set $r = 6$, $K = 1.75$, and $\delta = 0$ for ODMBP. We also compare the performance of ODMBP and ODMBP-Cos. We set $r = 6$ and $\delta = 0.7$ in order to obtain the best performance of ODMBP-Cos. These settings are based on measurements similar to those in Section 5.2.
As shown in Figure 8, ODMBP has a higher delivery ratio than ODMBP-Cos. This is because there are some locations where few people stay in general. Thus, the correlation function based on Cosine similarity, which only considers the time spent at the location, might lose sight of these sparsely populated locations, whereas ODMBP balances them well. On the other hand, the delay performance of the two protocols is close.
The delivery ratio increases with increasing message TTL for all four protocols. This is because there is more time to deliver the message to the receivers before it is dropped from the forwarding queue. However, the delay also increases when the message TTL increases. Epidemic routing achieves the best performance among the four protocols; however, it suffers from high overhead and is not efficient in our mobile social network applications. This is because Epidemic routing does not provide any filtering scheme in the dissemination. The ONE simulator defines the overhead ratio as (number of relayed messages − number of delivered messages)/number of delivered messages. ODMBP uses the threshold to filter users with different correlations and can therefore reduce the number of relayed messages. As shown in Figure 8, ODMBP has a lower overhead ratio than Epidemic routing. In most cases, the performance of ODMBP is better than that of Spray and Wait: ODMBP improves the delivery ratio and delay by 11.6% and 12.5%, respectively, on average. This is because ODMBP can forward the message to the users who are more similar to the target, while Spray and Wait does not consider the correlation metric.
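The overhead ratio defined above amounts to a one-line computation. The function name is hypothetical, and returning NaN when no message is delivered is an assumption made here to avoid division by zero.

```python
import math


def overhead_ratio(relayed: int, delivered: int) -> float:
    """(relayed - delivered) / delivered; NaN if nothing was delivered (assumption)."""
    if delivered <= 0:
        return math.nan
    return (relayed - delivered) / delivered
```

For instance, 120 relayed messages with 40 deliveries give a ratio of 2, meaning two extra relays per delivered message.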

Related Work
At present, there are many studies on exploring the behavior attributes of users in mobile social networks. In [7] ... Besides, [27] explains the interaction relationships and mutual influence of social network users, as well as the characteristics and motivation of privacy behavior in social networks, including the prediction of user behavior.

Conclusion
We have extracted the multiple behavior profiles from the users' daily traces by mapping the multiple properties in the interest space to the behavior space. The BM25 based correlation computing model was proposed to calculate the correlation metric of multiple behavior profiles. Compared to the other classical routing protocols, ODMBP can significantly improve the performance in terms of delivery ratio and delay.
In future work, we will consider more complex scenarios. For example, the behavior locations in the target multiple behavior profile can be associated with specific weights, which indicate the importance of the behavior locations.

Figure 2: Mapping the interest space to the behavior space.
Figure 3: An example of UMBP (annotations: the total time that user i spent at each location; the total time that user i spent at location j).

Figure 4: Distribution of location number.
... based on the observation that individuals with similar interests tend to meet more often. In [25], Matsuo et al. propose an efficient boundary detection method in dense mobile wireless sensor networks. Each node preliminarily recognizes the locations of itself and all its neighboring nodes. The authors determine the node forwarding direction by comparing the similarity score with the encountered nodes. Cheng et al. present iZone [26], a mobile social networking system based on the analysis of general requirements of MSN and location based services (LBS). The ultimate goal is to develop and establish an integrated framework for providing social network based healthcare information services targeting patient safety, empowerment, and guidance.
... profiles. In this paradigm, messages are sent to a behavioral interest profile (not to an IP or device address). It combines users' interest and behavior for multicast communication. In [23], Elsherief et al. explore the notion of mobile users' similarity as a key enabler of innovative applications hinging on opportunistic mobile encounters. SANE [24] combines the advantages of both social-aware and stateless approaches.
Moreover, we have proposed an Opportunistic Dissemination Protocol based on Multiple Behavior Profile, termed ODMBP, in mobile social networks. It consists of three stages: user initialization, gradient ascent, and group spread. Through extensive simulations, we have demonstrated that the proposed multiple behavior profiles and correlation computing model are efficient.

Notation
$U$, $n$: set of users and the number of users
$L$, $m$: set of locations and the number of locations
$\mathrm{BF}_{ij}$: behavior factor of user $i$ to location $j$