Efficient Data Transmission for Community Detection Algorithm Based on Node Similarity in Opportunistic Social Networks



Introduction
With the rapid growth of information technology and the spread of wireless network equipment [1], demand for network services keeps rising. As a new type of self-organizing network [2], the opportunistic social network has attracted researchers' attention [3]. There is no complete end-to-end path between nodes in an opportunistic social network [4]; instead, it uses the encounter opportunities created by node movement to communicate hop by hop [5]. At present, opportunistic social networks are widely used in various fields, such as mobile phones [6], handheld electronic devices [7], vehicular networks with mobile intelligent devices on the road [8], wildlife tracking [9], and network transmission in remote areas [10]. Traditional social network methods for data transmission face significant challenges [11], which become an obstacle to information exchange and sharing [12]. To enhance data transmission in a 5G wireless network [13], a more convenient model is needed to forward data flexibly [14]. User terminal equipment must transmit large amounts of data and carry out computation-intensive tasks [15]. Mobile edge computing (MEC) has been proposed to enhance the computing ability of wireless devices [16][17][18]. Because the mobile edge server is located at the edge of the wireless network, close to the users, it can efficiently serve the surrounding users; integrating the concept of opportunistic social networks into mobile edge computing reduces the consumption of source nodes [19].
However, each node has many social attributes [20]. They represent the relationships among different users, and nodes in the same community are more closely connected [21]. Network nodes can therefore be divided into communities according to their attributes to improve the algorithm's performance [22]. Existing algorithms do not fully consider nodes' characteristics, so there is considerable room for improvement in the accuracy and efficiency of community detection [23]. That is why an efficient community detection algorithm is needed.
Opportunistic social networks use a "store-carry-forward" strategy to handle the energy consumption problem in the data transmission process [24]. Messages are forwarded through the encounter opportunities produced by node movement. In this paper, network topology attributes and social attributes are used to measure the similarity between nodes, and a hierarchical clustering method divides the network into communities [25]. During data transmission, if a mobile device has no suitable transmission target, the message occupies a large amount of cache, and data transmission in the community is likely to wait a long time, which delays delivery [26]. After dividing the community, we further establish the weight distribution between nodes and communities to reduce the time complexity and overhead cost, and we construct a set of candidate relay nodes based on the relationship between information forwarders and adjacent nodes. From the perspective of minimizing the bit error rate, the channel coefficients of the two channels, from the source node to the relay node and from the relay node to the destination node, are analyzed. The optimal relay node for transmission is then selected from the set of candidate relay nodes. In summary, we propose an efficient data transmission strategy for community detection in opportunistic social networks using mobile edge computing combined with network topology and social attributes. The transmission strategy is divided into two periods: the initialization period and the routing period. The contributions of this research study are as follows:
(1) Initialization period: using network topology attributes and social attributes to measure the similarity between nodes, a community detection algorithm is proposed through hierarchical clustering.
(2) Routing period: based on the relationship between the message forwarder and the adjacent nodes, a set of candidate relay nodes is constructed. By analyzing the channel coefficients from the source node to the relay node and from the relay node to the destination node, a method for selecting the optimal relay node is proposed.
(3) Simulation results show that the proposed algorithm, EDCD, performs well in terms of delivery ratio, routing overhead, and average end-to-end delay on different real datasets.

Related Works
In recent years, many researchers have studied routing and forwarding algorithms in opportunistic social networks and proposed effective approaches for different application scenarios. Routing algorithms can be roughly divided into two sorts: social-ignorant algorithms and social-aware algorithms [27]. Social-ignorant algorithms do not use social information about nodes when making forwarding decisions during data transmission. Vahdat and Becker [28] proposed the epidemic routing algorithm. The epidemic algorithm is essentially a flooding algorithm in which each node forwards information to all its neighbors. However, it leaves many message copies in the network, which consume many network resources. Sisodiya et al. [29] proposed a flooding-based routing algorithm, spray and wait, which divides the information forwarding process into two steps. The first step copies the message, and the second step carries out the transmission. It can easily lead to excessive transmission delay and data redundancy.
Sharma et al. [30] proposed a routing protocol named MLProph, which uses machine learning (ML) algorithms, namely, decision trees and neural networks, to determine the probability of successful message delivery, but this algorithm has great limitations. Tang et al. [31] proposed a scheme based on reinforcement learning (RL), which can be applied to opportunistic routing transmissions that require high reliability and low latency. However, this opportunistic routing scheme can only be used in specific scenarios and does not suit all networks. Wu et al. [32] proposed an algorithm that adjusts the cache by analyzing the importance of message propagation. This algorithm has a small routing overhead, but to avoid deleting cached data, data is shared with adjacent nodes, which causes data redundancy.
Social-aware algorithms use the social relationships between nodes to measure the transmission relevance between nodes. Yan et al. [33] established an effective data transmission strategy (ENPSR), which uses the priority of nodes and social relationships in opportunistic social networks. The data transmission priority is obtained by measuring the social attributes and historical information of the node, and a forecast plan then determines the appropriate message delivery decision. Wu and Chen [34] proposed an optimal routing scheme for cooperative nodes based on opportunistic network features. This scheme can be used in social networks. Reliability, availability, and weighting factors are used as the weights of human activities to obtain the optimal cooperative node, but the algorithm has a high routing overhead. Drǎgan et al. [35] proposed that nodes can be divided into several communities according to their intimacy and the time they spend together. This community detection method does not fully consider all of the nodes in the community.
Zeng et al. [36] proposed a social-based clustering and routing scheme, in which each node selects the nodes with close social relationships to form a local cluster, but this can cause data redundancy. Liu et al. [37] proposed a fuzzy routing and forwarding algorithm based on node similarity (FCNS). This algorithm performs well in delivery ratio and routing overhead but has a high transmission delay. Niu et al. [38] proposed a predictive and extended routing protocol, which uses a Markov chain as the node mobility model to capture the social characteristics of nodes. It does not consider communication between nodes in different places; nodes only upload and send messages in the same place.
Because the abovementioned traditional methods do not fully consider node characteristics, among other problems, this paper proposes a model that combines network topology and social attributes to detect communities and analyzes the channel coefficients from the source node to the relay node and from the relay node to the destination node to select the optimal relay node for information transmission in opportunistic social networks. This model addresses the challenge of improving data transmission and achieves low delay and low routing overhead.

Model Design
In opportunistic social networks, we define the topological structure G = (V, E, w), where V is the node set of the network and E is the edge set reflecting the relationships between the nodes: E = {(m, n) | m ∈ V, n ∈ V}, where m and n are nodes and w is the weight of the edge between node m and node n. On the basis of the community division, we let C = {C1, C2, ..., Cn}, which requires more edges between the vertices within each community subgraph. Because nodes differ from one another, we use the number of encounters between nodes to weight each edge. This paper measures the similarity between different nodes in terms of network topology attributes and social attributes. The greater the similarity between nodes, the more likely they are to belong to the same community.
Firstly, we must reasonably define the similarity between nodes. For a real social network, in addition to the network topology, we also need to consider the social attributes between nodes. The node data must be collected, and the process is shown in Figure 1. The base station collects the information of all nodes in its area over a period of time. When a node has a transmission task, it requests the probability table of the source node and the destination node from the base station that has collected the information, and edge computing is used to transmit the decision information, which reduces the node's workload. Because many communities can usually share messages through only one or two nodes, those nodes must have enough cache to keep data transmission efficient. The node needs to obtain the position, speed, and moving direction of itself and of the destination node. However, node encounters in opportunistic social networks are random, so we combine the characteristics of node movement to calculate the probability of node encounters. In this paper, PE_mn denotes the probability that nodes m and n meet in a period of time t. The interval between node meetings obeys the exponential distribution, so the probability that node m and node n meet within the sensing range is determined by the encounter strength λ_mn = 1/Δt_mn, where m is the source node, n is the destination node, and Δt_mn is the average time between encounters of node m and node n. This average is computed from the encounter times, where t^k_mn is the time of the kth encounter and we define t^0_mn = 0. Combining the two gives the encounter probability over t_r = t_init − t_0, where t_r is the remaining time to live of the message, t_init is the initial time to live of the message, and t_0 is the time the message has already been alive. Secondly, we construct the encounter probability matrix. The number of encounters between nodes reflects, to a certain extent, only how often the nodes met in a period of time.
The number of encounters between nodes is used to weight each edge. w = {w_mn | e_mn ∈ E} is the set of edge weights, where w_mn is the number of encounters between nodes m and n. MT = [PE_mn(t)] is the n × n encounter matrix of the nodes over a period of time t, and w_m,on is the number of encounters node m has had with other nodes within a certain period of time.
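The encounter model above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: it assumes the average interval is the mean gap between consecutive encounter times (with t^0 = 0) and that the encounter probability is the exponential CDF evaluated at the remaining time to live. The function names and the `history` dictionary are hypothetical.

```python
import math

def mean_encounter_interval(encounter_times):
    """Average gap between consecutive encounters, taking t^0 = 0."""
    if not encounter_times:
        raise ValueError("at least one encounter time is required")
    times = sorted(encounter_times)
    previous = [0.0] + times[:-1]
    gaps = [t - p for p, t in zip(previous, times)]
    return sum(gaps) / len(gaps)

def encounter_probability(encounter_times, t_init, t_alive):
    """Probability of at least one encounter within the remaining TTL."""
    t_r = t_init - t_alive                 # remaining time to live
    if t_r <= 0:
        return 0.0                         # message already expired
    interval = mean_encounter_interval(encounter_times)
    if interval <= 0:
        return 1.0                         # degenerate case: zero mean gap
    lam = 1.0 / interval                   # encounter strength (assumed rate)
    return 1.0 - math.exp(-lam * t_r)      # exponential CDF

def encounter_matrix(history, num_nodes, t_init, t_alive):
    """Symmetric matrix MT; pairs with no recorded encounter get 0.0."""
    mt = [[0.0] * num_nodes for _ in range(num_nodes)]
    for (m, n), times in history.items():
        p = encounter_probability(times, t_init, t_alive)
        mt[m][n] = mt[n][m] = p
    return mt
```

For example, `encounter_matrix({(0, 1): [10, 20, 30]}, 3, 20, 10)` fills the (0, 1) and (1, 0) entries and leaves every other entry at zero.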
In opportunistic social networks, network topology attributes reflect the status of the network. It requires more edges between the vertices in each community subgraph.
(1) The strength of nodes describes how close the node is to the surrounding network; the node strength is equal to the degree of the node, that is, the number of neighbor nodes. In the defining formula, ND(m,n) is the node connection strength between node m and node n, NC_m is the set of neighbor nodes connected to node m at the current time, and NC_n is the set of neighbor nodes connected to node n at the current time. Two nodes may share a set of similar neighbor nodes, and the stronger this relationship, the higher the probability of data transmission between them.
(2) The direct connection strength represents the influence of the direct connection between two nodes. When there is an edge between two nodes, the edge weight measures the strength of the connection between them. We define the sum of the weights of all edges adjacent to node m as s(m) = Σ_{n∈θ(m)} w_mn, where θ(m) is the set of neighbor nodes of m. For any n in θ(m), there is a relationship between node m and node n. In the formula for direct connection strength, DC(m,n) is the strength of the direct connection between the two nodes, expressed as the ratio of the weight between the two nodes to the weight of their adjacent edges.
(3) The indirect connection strength indicates the influence of the indirect connection between two nodes: when node m and node n have a common adjacent node p, they also have a certain chance to connect. The more adjacent nodes two nodes have in common, the closer the two nodes are. In the formula for indirect connection strength, w_mp and w_np represent the connection strength between node m and node n through node p, and the indirect connection strength between the nodes is the sum of the strengths of all common neighbor connections. That is to say, the more common adjacent nodes the two nodes have, the greater the indirect connection strength.
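To make the three topology measures concrete, the sketch below computes them on a small weighted graph. The exact formulas are assumptions made for illustration only: node strength is taken as the Jaccard overlap of the two neighbor sets, direct strength as w_mn divided by the total weight of edges adjacent to either node, and indirect strength as the summed weights through common neighbors divided by s(m) + s(n). All function names are hypothetical.

```python
def build_graph(edges):
    """Adjacency map {node: {neighbor: weight}} from (u, v, w) triples."""
    graph = {}
    for u, v, w in edges:
        graph.setdefault(u, {})[v] = w
        graph.setdefault(v, {})[u] = w
    return graph

def weighted_degree(graph, m):
    """s(m): sum of the weights of all edges adjacent to m."""
    return sum(graph[m].values())

def node_strength(graph, m, n):
    """ND(m, n), assumed here to be the Jaccard overlap of NC_m and NC_n."""
    nc_m, nc_n = set(graph[m]), set(graph[n])
    union = nc_m | nc_n
    return len(nc_m & nc_n) / len(union) if union else 0.0

def direct_strength(graph, m, n):
    """DC(m, n), assumed to be w_mn over the weight adjacent to m or n."""
    w_mn = graph[m].get(n, 0.0)
    denom = weighted_degree(graph, m) + weighted_degree(graph, n) - w_mn
    return w_mn / denom if denom else 0.0

def indirect_strength(graph, m, n):
    """IC(m, n), assumed to sum w_mp + w_np over common neighbors p."""
    common = set(graph[m]) & set(graph[n])
    denom = weighted_degree(graph, m) + weighted_degree(graph, n)
    if not denom:
        return 0.0
    return sum(graph[m][p] + graph[n][p] for p in common) / denom
```

With edges a-b (2), a-c (1), b-c (3), and c-d (1), nodes a and d have no direct edge but share neighbor c, so their direct strength is zero while their indirect strength is positive.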
In the network topology attributes, we classify the possible relationships between two nodes into the following four types, where we use SMD tp(m,n) to express topological similarity between node m and node n.
(a) No direct and no indirect connection.
(b) Indirect but no direct connection.
(c) Direct but no indirect connection.
(d) Direct and indirect connection.
In the corresponding formulas, α is the coefficient of the strength of node, φ is the coefficient of the direct connection strength, and η is the coefficient of the indirect connection strength. The higher the topological similarity between nodes, the greater the chance of communication between them, which improves data transmission efficiency.
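One way to read the four cases is as a single weighted sum in which the terms for a missing connection drop out. The sketch below assumes that linear form; it is an illustration of the case analysis, and the function name and return format are hypothetical.

```python
def topological_similarity(nd, dc, ic, alpha, phi, eta):
    """Return (case, SMD) for precomputed strengths nd, dc, and ic.

    Assumption: SMD is a weighted sum, and a strength of zero means the
    corresponding (direct or indirect) connection does not exist.
    """
    has_direct = dc > 0
    has_indirect = ic > 0
    if not has_direct and not has_indirect:
        return "a", alpha * nd
    if not has_direct:
        return "b", alpha * nd + eta * ic
    if not has_indirect:
        return "c", alpha * nd + phi * dc
    return "d", alpha * nd + phi * dc + eta * ic
```

Calling `topological_similarity(0.5, 0.0, 0.5, 0.2, 0.5, 0.3)` falls into case (b), because the pair has common neighbors but no direct edge.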
The social attributes between nodes measure the social similarity between two nodes.
(1) The geographic relevance of nodes: nodes are mobile, so the trajectory information of a mobile node is used to analyze the geographic location correlation of nodes. The trajectory information refers to the geographic location information of the sensing area, which is the area within which the node can transmit messages. Specifically, in the time period T, if the nodes' geographical locations are close, the probability of information transmission between them is high; that is to say, the probability of meeting in the same area also increases. In the expression for the geographical correlation between nodes, GD(m,n) is the geographic relevance of nodes, A(R_m^rm, R_n^rn) is the similarity function of node m and node n at position A, R_m^rm is the rm-th trajectory record of node m, and R_n^rn is the rn-th trajectory record of node n.
In the similarity function, max(E_m^rm, E_n^rn) takes the larger of E_m^rm and E_n^rn, where E_m^rm is the time when node m enters the sensing area for the rm-th time and E_n^rn is the time when node n enters the sensing area for the rn-th time. min(Q_m^rm, Q_n^rn) takes the smaller of Q_m^rm and Q_n^rn, where Q_m^rm is the time when node m quits the sensing area for the rm-th time and Q_n^rn is the time when node n quits the sensing area for the rn-th time.
(2) The interest relevance of nodes: users with common interests will visit the same business. Naturally, mobile users with the same interests spend more time and energy communicating with each other. Information transmission between nodes takes place between mobile users with the same interest in the time period T. In the expression for the interest relevance between nodes, IR(m,n) is the interest relevance between node m and node n, T^k_(m,n) is the ratio of time occupied by node m and node n during the kth transmission of information in time period T, and T^(k−1)_(m,otn) is the ratio of time occupied by node m and nodes other than node n during the (k−1)th transmission of information in time period T.
(3) The separating time relevance of nodes: two nodes can make a connection and communicate. The average interval between two nodes is defined as the time interval between their meetings. If there is no communication for a long time, the relationship between the two nodes is not close. Conversely, a shorter separation means that the two nodes are closely related. In the expression for the separating time relevance, AS(m,n) is the separating time relevance of node m and node n for conveying information, T^k_(m,n) is the time of the kth transmission of information in the time interval T, and T^1_(m,n) is the time of the first transmission of information in the time interval T.
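The max/min construction in item (1) is the usual way to measure how long two visits to the same sensing area overlap. The sketch below assumes that the geographic relevance is the total overlap across all pairs of visits, normalized by the period T and capped at 1; the normalization and the function names are illustrative assumptions.

```python
def visit_overlap(enter_m, quit_m, enter_n, quit_n):
    """Time both nodes spend in the sensing area during one pair of visits."""
    return max(0.0, min(quit_m, quit_n) - max(enter_m, enter_n))

def geographic_relevance(visits_m, visits_n, period):
    """Assumed GD(m, n): summed overlap over all visit pairs, divided by T.

    visits_m and visits_n are lists of (enter_time, quit_time) tuples.
    """
    if period <= 0:
        raise ValueError("period must be positive")
    total = sum(
        visit_overlap(em, qm, en, qn)
        for em, qm in visits_m
        for en, qn in visits_n
    )
    return min(1.0, total / period)
```

If node m visits during (0, 10) and (20, 30) and node n visits during (5, 25), each visit of m overlaps n's visit for 5 time units, giving a relevance of 10/40 over a period of 40.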
Through the above calculation of social attribute values, we can quantify the relationship between node m and node n. SR(m,n) represents the similarity of social attributes, in which β is the coefficient of the geographic relevance of nodes, δ is the coefficient of the interest relevance of nodes, and ρ is the coefficient of the separating time relevance of nodes. The higher the social attribute value, the closer the nodes and the higher the probability that they meet and communicate, which improves the efficiency of information transfer between nodes. Node similarity is affected by both the network topology and the social attributes. NS(m,n) represents the similarity between node m and node n; in this paper, we define node similarity as composed of the network topology similarity and the social attribute similarity.
Through the above description, we can characterize the relationship between nodes more accurately. The higher the node similarity, the more frequent the communication between nodes. By establishing a community, the source node can accurately find the relay node and then transmit information to the destination node [39]. Information transmission in this process is more efficient, and the time delay is reduced.
The nodes within the same community are closely connected. Community detection is essentially the clustering of nodes with a tight structure in the network. This paper uses a hierarchical clustering algorithm to divide the community. We introduce the modularity Q, which measures the quality of a community division. Considering data scale, running time, and the quality of the resulting division, the fast unfolding algorithm is well suited. The algorithm is stable and continuously merges nodes to construct new graphs, which significantly reduces the amount of calculation.
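As a rough illustration of how the pieces combine, the sketch below assumes that SR is a weighted sum of the three social measures with coefficients β, δ, and ρ, and that NS mixes topological and social similarity with a hypothetical weight `mu`. The mixing weight and the function names are not taken from the paper.

```python
def social_similarity(gd, ir, sep, beta, delta, rho):
    """Assumed SR(m, n): weighted sum of geographic, interest, and
    separating-time relevance."""
    return beta * gd + delta * ir + rho * sep

def node_similarity(smd, sr, mu=0.5):
    """Assumed NS(m, n): convex mix of topological (smd) and social (sr)
    similarity; mu is a hypothetical mixing weight in [0, 1]."""
    if not 0.0 <= mu <= 1.0:
        raise ValueError("mu must lie in [0, 1]")
    return mu * smd + (1.0 - mu) * sr

def similarity_weights(pairs, mu=0.5):
    """Map {(m, n): (smd, sr)} to edge weights {(m, n): NS(m, n)}."""
    return {edge: node_similarity(smd, sr, mu) for edge, (smd, sr) in pairs.items()}
```

The resulting dictionary of NS values can then serve as the edge weights that the clustering stage operates on.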
The algorithm steps are as follows:
Step 1: initialize and calculate the node similarity; assign each node to the community where its adjacent node is located. As shown in Figure 2, the source node S is in community one. We try to move node S to community two and to community three, calculate the corresponding modularity value, and move node S to the community with the largest change in value. We introduce the modularity Q to measure the quality of the community division. In its calculation formula, Q is the modularity, in represents the number of connections within the community, tot represents the sum of the degrees of all nodes in the community, and l is the sum of weights in the network.
Step 2: select each node one by one, and calculate the modularity gain of assigning it to the community where its adjacent node is located. ΔQ represents the modularity gain. In its calculation formula, k_m,in is the sum of the weights from node m to the community and k_m is the sum of the weights of node m. After calculating the modularity gain, we check whether it is positive; if it is, the node is assigned to the corresponding community; otherwise, the node is not moved.
Step 3: repeat Step 2 until the node's community no longer changes.
Step 4: construct a new graph; each point in the new graph is each community divided in Step 3; continue to execute until the community structure does not change.
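Steps 1 to 3 can be sketched as a local-moving loop. The code below is a simplified illustration under stated assumptions: modularity uses the standard weighted form, where each community contributes (internal edge weight)/l minus (tot/(2l))^2, and the gain of a move is obtained by recomputing Q before and after the move instead of using a closed-form ΔQ. The aggregation in Step 4 is omitted, and the function names are hypothetical.

```python
from collections import defaultdict

def modularity(weights, community):
    """Weighted modularity; weights maps (u, v) -> w with each edge once."""
    l = sum(weights.values())            # total edge weight in the network
    inside = defaultdict(float)          # edge weight inside each community
    tot = defaultdict(float)             # summed weighted degree per community
    for (u, v), w in weights.items():
        tot[community[u]] += w
        tot[community[v]] += w
        if community[u] == community[v]:
            inside[community[u]] += w
    return sum(inside[c] / l - (tot[c] / (2 * l)) ** 2 for c in tot)

def best_move(weights, community, node):
    """Neighboring community with the largest positive modularity gain."""
    base = modularity(weights, community)
    neighbors = {v if u == node else u for (u, v) in weights if node in (u, v)}
    best_label, best_gain = community[node], 0.0
    for label in {community[x] for x in neighbors}:
        if label == community[node]:
            continue
        trial = dict(community)
        trial[node] = label
        gain = modularity(weights, trial) - base
        if gain > best_gain:
            best_label, best_gain = label, gain
    return best_label, best_gain

def local_moving(weights, community):
    """Repeat Step 2 until no node changes community (Step 3)."""
    community = dict(community)
    changed = True
    while changed:
        changed = False
        for node in sorted(community):
            label, gain = best_move(weights, community, node)
            if gain > 1e-12:             # move only on a positive gain
                community[node] = label
                changed = True
    return community
```

On two triangles joined by a single bridge edge, starting from one community per node, the loop ends with each triangle as its own community. Recomputing Q for every trial move is slow on large graphs; it is used here only to keep the gain calculation transparent.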
This paper roughly divides the above algorithm steps into two stages:
Stage 1: assign each node to the community where its adjacent node is located so that the modularity value becomes larger.
Stage 2: each community obtained in the first stage is aggregated into a single point, and the network is reconstructed until its structure no longer changes.
This paper draws on the hierarchical clustering idea of the fast unfolding algorithm. We use network topology attributes and social attributes to express node similarity and use the combined node similarity to update the network weights. In the first stage of node merging, we form initial communities whose merging improves the overall modularity and then calculate the modularity gain; if ΔQ is positive, the two communities are merged; otherwise, they are not merged. The modularity gain is calculated repeatedly, and the final division result is output.
Nodes move randomly, so it is vital to establish communities. In opportunistic social networks, many communities can usually deliver messages through only one or two nodes. If these nodes do not have enough cache or overhead, data transmission in the community is likely to wait a long time. Therefore, after we divide the community, we need to further establish the weight distribution between the nodes and the community reconstruction so as to better reduce the time complexity and overhead cost. Below we prove how the community of the source node changes during its movement.
We define that, at time t, Q is the modularity of the community, E_we is the total edge weight, E_wec is the total weight of the edges of community c, D_m is the degree of node m in community c, and ΔE_we is the increment of edge weight.
Proposition 1. In opportunistic social networks, if the weight of the edge between a node and its adjacent nodes in the network increases, the community relevance also increases.
Proof. At time t, the modularity in the community is Q(t). When the time increases to t + 1, the modularity change in the community can be expressed in terms of the weight increment. We have ΔE_we > 0, so we only need to prove that the remaining factor is positive. It is known that 2Q is the total of nodes in the network, and no community in the network appears more than 2Q. In short, increasing the weight increases the community's relevance in opportunistic social networks. For this paper, the weight affects the community's relevance in opportunistic social networks, and the proposition holds. □
Proposition 2. If the weight of an edge between two communities increases and node m is in community A, then U^-_comm(B) will increase and U^+_comm(A) will decrease. The community corresponding to node m will change, and the weight of an edge between the node and the community is ΔE_we (ΔE_we > 0); if the weight of the edge changes, the result of the community division also changes.
Proof. Consider the quantities for node m before the weight changes and after the weight changes. Because E_we > 0, when 2E_we = D_i, if the weight of one side increases, then U^-_comm(B) increases for node m. Then, because ΔE_we > 0, E_we > 0, and D_A − D_m > 0, and because D_i > D_m and D_m − E_we < 0 for all edges in the network, the value of U^+_comm(A) after the change minus its value before the change is negative. If the weight of one side increases, then U^+_comm(A) decreases.
Therefore, if the weight of an edge between two communities increases and node m is in community A, then U^-_comm(B) increases and U^+_comm(A) decreases. □
Proposition 3. If node m and node n are connected, and one of the nodes has one and only one edge, then the community will not divide when the weight between node m and node n drops.
Proof. Assume that the community is divided; then three conditions must be met. As the weight changes, these conditions can be rewritten, and they cannot all hold. We therefore conclude that, for a node in opportunistic social networks, if it has only one edge connected to another node, the community will not divide when the weight between the two nodes decreases.

After community detection, we construct a set of candidate relay nodes according to the relationship between the information forwarder and adjacent nodes, and select the optimal relay node from this set to undertake the transmission task. The question is therefore how to select one or more relays among multiple relay nodes to participate in transmission. As shown in Figure 3, once the communities are established and information is transmitted between them, a reliable relay node must be found to transmit the information. To achieve higher efficiency, we construct the set of candidate relay nodes from the neighbor nodes of the source node. From the perspective of minimizing the bit error rate, this paper analyzes the channel coefficients of the two segments, from the source node to the relay node and from the relay node to the destination node, and chooses the amplify-and-forward (AF) protocol as the relay node's forwarding method, which is suitable for information transmission over channels of various qualities [40]. We calculate the sum of the channel coefficients of the channels corresponding to each relay node and find the relay node with the largest sum, which is the optimal relay node and improves the efficiency of information transmission.
Suppose there are a source node S, a destination node D, and relay nodes R_1, R_2, ..., R_n when transferring information between communities. The communication model is shown in Figure 4. In this case, the channels from the source node to the destination node and from the source node to each relay node are all Rayleigh fading channels, which obey the Rayleigh distribution. We assume that the channel coefficient from the source node to the destination node is C_s,d, the channel coefficient from the source node to the nth relay node is C_s,rn, and the channel coefficient from the nth relay node to the destination node is C_rn,d. The transmit power of the source node is P_1, and the transmit power of the relay node is P_2. When there is a direct transmission from the source node to the destination node, the power is P = P_1 + P_2. When the source node sends information i to the destination node and the relay node with power P_1, the noise from the source node to the destination node is V_s,d and the noise from the source node to the relay node is V_s,r. These quantities determine the information received by the relay node and by the destination node. In the AF protocol, when the relay node receives the signal from the source node and forwards it to the destination node, it amplifies the received signal by a scaling factor. The signal from the relay node to the destination node is χ_s,r, which determines the information sent by the relay node to the destination node.
The focus of this paper in selecting the optimal relay node is to find a relay node for which the channel coefficients from the source node to the relay node and from the relay node to the destination node are both large. The channel coefficient matrix from the source node to the relay nodes is A, and the channel coefficient matrix from the relay nodes to the destination node is B:
A_1×n = [C_s,r1, C_s,r2, C_s,r3, ..., C_s,rn], B_1×n = [C_r1,d, C_r2,d, C_r3,d, ..., C_rn,d].
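For reference, a common textbook form of the amplify-and-forward signal model is written out below. It is offered as an assumption about what the received-signal and scaling-factor expressions look like, not as the paper's own equations; in particular, the noise power N_0, the relay-to-destination noise V_{r_n,d}, and the symbols y and β_n are introduced here for illustration.

```latex
\begin{align}
y_{s,d}   &= \sqrt{P_1}\, C_{s,d}\, i + V_{s,d}, \\
y_{s,r_n} &= \sqrt{P_1}\, C_{s,r_n}\, i + V_{s,r_n}, \\
\beta_n   &= \sqrt{\frac{P_2}{P_1\, |C_{s,r_n}|^2 + N_0}}, \\
y_{r_n,d} &= \beta_n\, C_{r_n,d}\, y_{s,r_n} + V_{r_n,d}.
\end{align}
```

Under this form, the scaling factor β_n normalizes the relay's received signal so that the relay transmits with power P_2, which is why both C_{s,r_n} and C_{r_n,d} need to be large for the relayed signal to arrive with a good signal-to-noise ratio.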
We define a threshold ψ for the number of candidate relay nodes and set ψ ≤ 100. Two situations must be considered:
(1) If n ≥ ψ, compare the channel coefficients of each relay node in matrices A and B, take the smaller of the two, and store it in the matrix S. Sort the elements of S from largest to smallest, select the first m relay nodes, those with the larger C_i values, and store them in the matrix T. Here C_i = min{C_s,ri, C_ri,d} is the smaller of the channel coefficients of the two channels corresponding to relay node r_i, and r_i is one of the first m elements of S after sorting. The value of m depends largely on the number of candidate relay nodes n: the larger the ratio m/n, the lower the bit error rate. The bit error rate is an index of the accuracy of data transmission within a specified time; SER denotes the bit error rate, SER_te is the number of bit errors in transmission, and SER_T is the total number of codes transmitted. We add the two channel coefficients of each of these m relay nodes, and the relay node with the largest sum is the optimal relay node: R = {r_i | max(C_s,ri + C_ri,d), i = 1, 2, ..., m}.
(2) Otherwise, when the number of candidate relay nodes is less than the threshold, we must pay attention to the accuracy of the selection of the optimal relay node: calculate the sum of the channel coefficients of the channels corresponding to each relay node, and the relay node with the largest sum is the optimal relay node.
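The two-branch selection rule can be sketched as follows. This is an illustrative reading in which channel coefficients are non-negative magnitudes. The default for `m` (half of the candidates) is a hypothetical choice, since the text only says that m depends on n. The helper `symbol_error_rate` assumes the error rate is the ratio of erroneous to total transmitted codes.

```python
def select_relay(c_sr, c_rd, psi=100, m=None):
    """Index of the optimal relay given the coefficient lists A and B.

    c_sr[i] is the source-to-relay coefficient of relay i and c_rd[i] the
    relay-to-destination coefficient.
    """
    n = len(c_sr)
    if n == 0 or n != len(c_rd):
        raise ValueError("c_sr and c_rd must be non-empty and equally long")
    candidates = list(range(n))
    if n >= psi:
        if m is None:
            m = max(1, n // 2)           # hypothetical default for m
        # S: the weaker of the two hops for each relay (the bottleneck).
        bottleneck = [min(a, b) for a, b in zip(c_sr, c_rd)]
        # T: the m relays with the largest bottleneck coefficient.
        candidates = sorted(candidates, key=lambda i: bottleneck[i], reverse=True)[:m]
    # Among the remaining candidates, pick the largest coefficient sum.
    return max(candidates, key=lambda i: c_sr[i] + c_rd[i])

def symbol_error_rate(errors, total):
    """Assumed SER: erroneous codes divided by total codes transmitted."""
    if total <= 0:
        raise ValueError("total must be positive")
    return errors / total
```

With `c_sr = [1.2, 0.5, 0.6]` and `c_rd = [0.05, 0.5, 0.45]`, the plain sum rule picks relay 0, whose second hop is very weak. Lowering `psi` to 3 with `m=2` filters relay 0 out first and picks relay 2, which shows why the bottleneck filter matters when there are many candidates.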
Based on the above definitions, we propose an efficient data transmission algorithm, EDCD. The algorithm steps are as follows:
Step 1: calculate the encounter probability of node m and node n, construct the encounter probability matrix, and use the number of encounters between nodes to weight each edge.
Step 2: define node similarity, which is composed of network topology attributes and social attributes. Network topology attributes are composed of the strength of node, the direct connection strength, and the indirect connection strength. Social attributes are composed of the geographic relevance of nodes, the interesting relevance of nodes, and the separating time relevance of nodes.
Step 3: use a hierarchical clustering algorithm to divide the community and introduce the modularity Q, which measures the quality of the community division. Following the fast unfolding algorithm, the combined node similarity is used to update the network weights.
Step 4: from the perspective of minimizing the bit error rate, after the community is divided into a multihop wireless network, construct a set of candidate relay nodes based on the relationship between the information forwarder and adjacent nodes and select the optimal relay node from the set of candidate relay nodes to undertake the transmission task. Analyze the channel coefficients of the channels from the source node to the relay node and the relay node to the destination node, and select the AF protocol as the relay node forwarding method for routing and forwarding.
To enhance the understanding and readability of the entire algorithm, the specific calculation flowchart of the EDCD algorithm is shown in Figure 5. Algorithm 1 gives the initialization and community establishment phase of the proposed algorithm, and Algorithm 2 presents the routing and forwarding phase of the proposed algorithm.

Simulation and Analysis
To assess the performance of the EDCD, we use a simulation tool called ONE (Opportunistic Network Environment) [41] and we compare with the following four typical routing algorithms.
Spray and wait [29]: this algorithm sprays message copies into the network and waits for the nodes carrying them to reach the destination node. The number of copies affects performance: it can reduce the message delivery success rate and increase the delivery delay.
SCR (Social-based Clustering and Routing Scheme) [36]: this algorithm is a novel social-based clustering and routing scheme built on a useful measure of the social relations between nodes in mobile opportunistic networks.
SECM (status estimation and cache management) [42]: this algorithm uses state estimation and cache management to identify surrounding neighbors and evaluate the transmission probability between nodes, ensuring a high transmission probability and adjusting the cache accordingly.
EIMST (effective information transmission based on socialization nodes) [2]: this algorithm relies on social nodes to achieve effective information transmission. According to the defined stop time h, when t < h, the node forwards the message with the highest probability, and when t > h, the node stops sending the message.
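Of these baselines, spray and wait is the easiest to illustrate. The sketch below assumes the common binary variant, in which a carrier hands half of its remaining copies to each node it meets and waits for direct delivery once a single copy is left. The function name and return layout are hypothetical, and this may not be the exact configuration used in the experiments.

```python
def on_encounter(copies, peer_is_destination):
    """One encounter under binary spray and wait (assumed variant).

    copies: number of message copies the carrier currently holds.
    peer_is_destination: True if the encountered node is the destination.
    Returns (copies_kept, copies_given, delivered).
    """
    if copies <= 0:
        return (0, 0, False)          # nothing to forward
    if peer_is_destination:
        return (copies, 0, True)      # direct delivery to the destination
    if copies == 1:
        return (1, 0, False)          # wait phase: keep the last copy
    given = copies // 2               # spray phase: hand over half
    return (copies - given, given, False)


# A source starting with 8 copies meets three non-destination nodes in turn.
held = 8
history = []
for _ in range(3):
    held, given, _ = on_encounter(held, peer_is_destination=False)
    history.append((held, given))
```

The initial number of copies is the main tuning parameter: more copies raise the chance that some carrier meets the destination, at the cost of extra transmissions and buffer use.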
We downloaded real datasets from the Network Repository for the experiments. According to the data required for data transmission in opportunistic social networks, we chose four datasets for the simulation experiments: pages-government [43], wiki-elec [44], advogato [45], and slashdot [46]. The characteristic information of the four experimental datasets is shown in Table 1.
The EDCD algorithm and the other four algorithms run in the same simulation environment to compare their performance. In the simulation experiment, we set the following metrics according to the characteristics of data transmission.
(1) Delivery ratio: the probability of choosing a suitable node as the next-hop node, represented as follows:
\[ \text{Delivery ratio} = \frac{D_{\mathrm{receive}}}{D_{\mathrm{send}}}, \]
where $D_{\mathrm{receive}}$ is the number of messages received by the destination node and $D_{\mathrm{send}}$ is the total number of sent messages.
(2) Routing overhead: the overhead between nodes when transmitting information. It is expressed in terms of $R_{\mathrm{sum}}$, the total time of the transmission between nodes, and $R_{\mathrm{suc}}$, the time to transmit a successful message between nodes.
(3) Average end-to-end delay: the delay in selecting the optimal next hop, represented as follows:
\[ \text{Average end-to-end delay} = \frac{D_{\mathrm{sum}}}{D_{\mathrm{suc}}}, \]
where $D_{\mathrm{sum}}$ is the total delay over all nodes and $D_{\mathrm{suc}}$ is the total number of nodes successfully receiving messages.
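These metrics can be computed directly from simulation counters. The sketch below is illustrative only, and all function names are hypothetical. `delivery_ratio` and `average_delay` follow the definitions above. `overhead_ratio` instead follows the message-count convention commonly reported by the ONE simulator, (relayed - delivered) / delivered, which is an assumption and may differ from the time-based quantities R_sum and R_suc used here.

```python
def delivery_ratio(d_receive, d_send):
    """Messages received by destinations divided by messages sent."""
    return d_receive / d_send if d_send else 0.0


def average_delay(d_sum, d_suc):
    """Total delay divided by the number of successful receptions."""
    return d_sum / d_suc if d_suc else float("nan")


def overhead_ratio(relayed, delivered):
    """Message-count overhead (assumed ONE-style convention):
    extra relays spent per delivered message."""
    return (relayed - delivered) / delivered if delivered else float("inf")


# Hypothetical counters from one simulation run.
ratio = delivery_ratio(d_receive=84, d_send=100)
delay = average_delay(d_sum=190.0, d_suc=2)
overhead = overhead_ratio(relayed=300, delivered=100)
```

Guarding the zero-denominator cases keeps early simulation snapshots, in which nothing has been delivered yet, from raising errors.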
The correlation between time and delivery ratio in the four real datasets is shown in Figures 6-9. Figure 6 shows the delivery ratio of the spray and wait, SCR, SECM, EIMST, and EDCD algorithms in the pages-government dataset. When the simulation time is less than one day, the advantages of the EDCD algorithm are not apparent in the four real datasets. However, as the simulation time increases, the delivery ratio of the EDCD algorithm is consistently higher than that of the other algorithms. The EDCD algorithm divides communities by node similarity, and the effective nodes in each community carry out the data transmission, so its delivery ratio is better than that of the other four algorithms. The relationship between the delivery ratio and the simulation time in the wiki-elec dataset is shown in Figure 7.
The SCR algorithm delivers information to nodes and communities by flooding, which leads to heavy information loss. The delivery ratio of SECM is 0.65-0.78. The delivery ratios of the EIMST and EDCD algorithms are higher than those of the others. The EIMST algorithm controls the time interval of information delivery, which improves the transmission and reception of effective information, and its delivery ratio reaches 0.66-0.81. Because the EDCD algorithm combines network topology and social attributes, its delivery ratio is the highest among all algorithms, reaching 0.67-0.84. The correlation between the delivery ratio and simulation time in the advogato dataset is shown in Figure 8. The algorithm with the highest delivery ratio is EDCD, reaching 0.85-0.88. The spray and wait algorithm uses flooding to transmit information at community nodes, so a large amount of information is lost and its delivery ratio is the lowest, only 0.67-0.70. Figure 9 shows the relationship between time and delivery ratio in the slashdot dataset, which has the largest number of nodes of the four datasets. When the simulation time is less than one and a half days, the delivery ratio of every algorithm rises sharply; by the third day, only the delivery ratios of the EDCD and EIMST algorithms are still rising.

For the EDCD algorithm, the delivery ratio is 0.76 on average, which is higher than that of the other algorithms.
The correlation between time and routing overhead in the four real datasets is shown in Figures 10-13. The comparison of the routing overhead of the five algorithms in the pages-government dataset is shown in Figure 10. The average routing overhead of the EDCD algorithm is consistently the lowest. The algorithm uses node similarity to divide communities and uses the optimal relay node strategy to forward information. The routing overhead of the EDCD algorithm stays between 40 and 65. Figure 11 shows the association between routing overhead and time in the wiki-elec dataset. In the spray and wait algorithm, redundant copies of message groups require a lot of time and resources, which is the main reason for its large routing overhead. In the SCR algorithm, each node only forwards a copy of the message to a node that has the destination node as a cluster member, ignoring the current availability of the next-hop node, which causes overhead. In the SECM algorithm, the overhead is large because nodes inject a lot of redundant data. In the EIMST algorithm, information and buffer space can be managed effectively, but it consumes the resources of some unavailable nodes. In terms of routing overhead, EDCD always performs best among the five algorithms. Figure 12 shows the relationship between time and routing overhead in the advogato dataset. Compared with the other algorithms, the EDCD algorithm selects the optimal relay node and sets up the weight distribution between nodes and communities to reduce the overhead. In the spray and wait algorithm, a large amount of redundant information uses a lot of computing resources. For the SCR and SECM algorithms, the cooperation mechanism is conducive to a reasonable allocation of computing resources, so the overhead of these two algorithms is at a middle level.
EIMST does not fully consider the transmission preference of nodes, so its performance is worse than that of the EDCD algorithm. The relationship between routing overhead and time in the slashdot dataset is shown in Figure 13. The routing overhead increases sharply at first and is nearly stable by the third day. The routing overhead of the spray and wait algorithm increases dramatically: a large number of data copies are generated in the slashdot dataset, which has many nodes, and these copies need to be processed, so its routing overhead is higher than that of the other algorithms.
The association between time and average end-to-end delay in the four real datasets is shown in Figures 14-17. The relationship between the average end-to-end delay and time for each algorithm in the pages-government dataset is shown in Figure 14. Compared with the other four algorithms, the EDCD algorithm has the lowest average end-to-end delay.
Since the EDCD algorithm divides communities by analyzing the comprehensive characteristics of nodes, it can exclude inefficient nodes that do not help the transmission process, which reduces the average end-to-end delay. The spray and wait algorithm has more message copies, which causes corresponding delays. The SCR algorithm effectively forwards the copy of the message to the destination node, so its transmission delay is lower than that of the spray and wait algorithm. The SECM algorithm also enlarges the node cache before data transmission, so there is a corresponding delay. Figure 15 shows the association between average end-to-end delay and time in the wiki-elec dataset. The delay of the EIMST algorithm is higher in this dataset than in the other datasets. Because the EIMST algorithm relies on node-based information management and the wiki-elec dataset has more nodes, the delay increases as the simulation time increases. In short, the average end-to-end delay of the EDCD algorithm in the wiki-elec dataset is lower than that of the other four algorithms. Figure 16 shows the relationship between average end-to-end delay and time in the advogato dataset. Specifically, the maximum delay of the spray and wait algorithm reaches 95 because this method markedly increases routing and message forwarding delays. The SCR and SECM algorithms have lower delays than the spray and wait algorithm because both effectively control the number of message copies. In addition, the SCR algorithm implements community division and information management, whereas the SECM algorithm uses the cooperation mechanism between nodes to make reasonable use of the nodes' cache space and reduce the delay in message forwarding. The average end-to-end delay of the EIMST algorithm is also significantly lower than that of the other algorithms.
Figure 17 shows the correlation between the average end-to-end delay and time in the slashdot dataset. In this dataset with many nodes, the average end-to-end delay of the EIMST algorithm is significantly higher than in the other datasets. This is because, although the EIMST algorithm implements community detection, its effect is only mediocre when processing large amounts of data. The EDCD algorithm proposed in this paper has a lower latency than the other algorithms across the different real datasets.

Conclusions
In this study, we propose an effective data transmission scheme for opportunistic social networks that uses mobile edge computing combined with network topology attributes and social attributes to measure node similarity, divide communities, and select the optimal relay node. The algorithm is mainly based on the idea that the closeness between nodes inside a community is higher than that between nodes outside the community, and it provides a method for selecting the optimal relay node according to the sum of channel coefficients in the process of transmitting information. The simulation results show that the strategy performs well on different real datasets in terms of delivery ratio, routing overhead, and average end-to-end delay. The EDCD algorithm can be applied to 5G data transmission scenarios and, through efficient community division and information transmission, can cope with the stability and continuity that data require during interaction. In future work, we will further improve the performance of the algorithm and study the security of data transmission in opportunistic social networks.

Data Availability
The data used to support the findings of this study are currently under embargo while the research findings are commercialized. Requests for data, 12 months after publication of this article, will be considered by the corresponding author.

Conflicts of Interest
The authors declare that they have no conflicts of interest.