Defending against Online Social Network Rumors through Optimal Control Approach

Rumors spread widely in online social networks and have become a major concern in modern society. This paper is devoted to the design of a cost-effective rumor-containing scheme in online social networks through an optimal control approach. First, a new individual-based rumor spreading model is proposed, which, for the first time, considers the influence of the external environment on rumor spreading. Second, cost-effectiveness is introduced to balance the loss caused by rumors against the cost of a rumor-containing scheme. On this basis, we reduce the original problem to an optimal control model. Next, we prove that this model is solvable, and we present the optimality system for the model. Finally, we show that the resulting rumor-containing scheme is cost-effective through extensive computer experiments.


Introduction
Nowadays, with the rapid development of Internet technology, online social networks (OSNs), ranging from Facebook and Twitter to YouTube and LinkedIn, have become popular platforms for people to communicate [1,2]. On the negative side, harmful rumors can spread rapidly over OSNs, leading to huge losses [3,4]. In 2013, a false tweet claiming that Barack Obama had been injured in an explosion resulted in a loss of 130 billion US dollars in stock value [5]. In 2015, a rumor of shootouts and kidnappings by drug gangs near schools in Veracruz caused severe chaos in the city [6]. The outbreak of rumors has brought many problems, some of which pose threats to our society. Therefore, how to effectively restrain the propagation of rumors over OSNs has become a research hotspot in the field of cybersecurity.
To protect the security of cyberspace, effective measures to control the spread of rumors are urgently needed. In recent years, researchers have suggested many measures to mitigate the impact of rumors, such as blocking or isolating some OSN users to prevent them from further spreading rumors and releasing convincing rumor-containing messages to OSN users. These measures have been proven effective in controlling the spread of rumors [3,7]. In practice, almost all rumor-containing measures consume resources (such as money and manpower) during implementation, resulting in a certain cost. The cost is generally borne by the OSN platform and the government. Since their budget for controlling rumors is limited, it is necessary to study the rumor-containing problem from an economic perspective. However, most related research works only discuss the effectiveness (or performance) of rumor-containing measures and ignore the cost of implementing them. In contrast, this paper considers not only the losses caused by the spread of rumors but also the cost of implementing rumor-containing measures. In this study, we use the total loss caused by a rumor to characterize cost-effectiveness: the smaller the total loss, the better the cost-effectiveness. By this definition, cost-effectiveness is maximized exactly when the total loss is minimized. This paper seeks a rumor-containing scheme that maximizes cost-effectiveness.
From the above discussion, the rumor-containing agency faces the following challenging problem.
Rumor-containing problem: supposing that a rumor is spreading over an OSN, design a cost-effective rumor-containing scheme.
Designing a cost-effective rumor-containing scheme is a valuable research problem. New rumors appear in OSNs all the time. However, because no single approach can completely control the spread of rumors, the rumor-containing problem is essentially a management problem that requires continuous investment of resources. In practice, the budget of the rumor-containing agency is limited. If the economic costs incurred in the process of controlling rumors cannot be managed well, it will be difficult to continuously implement measures to control the spread of rumors.
In this paper, we propose a novel individual-based rumor spreading model, where the effect of the external environment on the spread of the rumor is accounted for. Thereby, we estimate the cost-effectiveness of a rumor-containing scheme. On this basis, we reduce the rumor-containing problem to an optimal control model, where each control stands for a rumor-containing scheme, and the objective functional stands for the cost-effectiveness of a rumor-containing scheme. We prove that this model admits an optimal control, guaranteeing the solvability of the model. We derive the optimality system for the model, which can be used to solve the model. Through extensive comparative experiments, we show that the rumor-containing scheme so obtained is cost-effective.
This paper makes a theoretical study of the rumor-containing problem, and the research results can provide some theoretical guidance for taking measures to suppress the spread of rumors. In addition, the research method proposed in this paper can be applied to analyze the cost management of the rumor containment process, and the new model can be used to study the influence of some parameters on the spread of rumors. Finally, the research ideas can also be extended to other cyberspace security problems, such as malware propagation [8].
The subsequent materials are organized as follows. Section 2 reviews the related work. Section 3 establishes an optimal control model of the rumor-containing problem. Section 4 deals with the model, and Section 5 assesses the resulting scheme through experiments. This work is summarized in Section 6.

Related Work
This section is devoted to reviewing the previous work that is related to the present paper. First, the rumor-containing problem is introduced. Second, some rumor spreading models are discussed. Third, the optimal control approach used to deal with the rumor-containing problem is introduced.
The rumor-containing problem is devoted to finding effective strategies to control or limit the propagation of a rumor in a network, so that the losses caused by the rumor can be reduced. In recent years, the rumor-containing problem has received significant attention from researchers. In this direction, there exist two main types of rumor-containing strategies [3,9], that is, (a) preventing the most influential users or community bridges from being affected by the rumor and (b) spreading convincing messages to clarify the rumor. In the past few years, there has been quite a lot of research on the first type of strategies (also called rumor blocking strategies) [10][11][12][13][14][15]. The idea of this type of strategy is to find a small set of users or community bridges in the OSN, such that isolating or protecting them will minimize the impact of rumor propagation. However, this type of strategy ultimately boils down to solving an NP-hard problem, and therefore an exact solution is infeasible for large-scale OSNs. Although many heuristic algorithms have been proposed to deal with the problem, they are still too costly for large-scale OSNs. In addition, some isolating or protecting measures may violate human rights. Recently, the second type of rumor-containing strategy has attracted a lot of attention [3,9]. The essence of the strategy is to model the rumor-containing problem as a competitive propagation problem between anti-rumor information and the rumor, and this type of strategy has been shown to be an effective means of restraining rumors in OSNs [3,9,16,17]. In practice, the above two types of rumor-containing strategies are both effective. However, researchers have only studied their effectiveness in controlling rumors and have not considered the cost issues involved in the implementation.
Modeling the spreading process of a rumor lays a theoretical basis for studying the rumor-containing problem. Existing rumor spreading models can be classified into three categories: compartmental models, network-degree models, and individual-based models. Compartmental rumor spreading models are only suited to homogeneous rumor spreading networks [18][19][20][21][22], and network-degree rumor spreading models only apply to some special types of networks such as scale-free networks [16,[23][24][25][26]. In contrast, individual-based rumor spreading models are applicable to all rumor spreading networks [17,[27][28][29][30]. The rumor spreading models proposed in [16][17][18][19][20][21][22] are all based on the assumption that a rumor can only be received through OSNs. However, in practice, OSN users can also receive rumors from the external environment, such as TV programs or tabloid reports [31]. Therefore, previous work may underestimate the propagation ability of rumors. Hence, it is necessary to introduce a rumor spreading model in which the effect of the external environment is accounted for. In the present paper, we aim to establish such an individual-based rumor spreading model.
In recent years, optimal control theory has been applied to the rumor-containing problem. Optimal control theory is devoted to finding a control scheme for a dynamical system so that a certain optimality criterion is met [32]. Optimal control has been applied to a variety of areas such as malware containment [33,34] and cybersecurity [35]. Based on network-degree rumor spreading models, the rumor-containing problem has been dealt with through the optimal control approach [36][37][38][39]. Recently, this methodology has been extended to individual-based rumor spreading models. The authors of [30] suggested an isolation-conversion mechanism for restraining rumors. Because it may violate human rights, the mechanism in [30] may be impracticable. The authors of [17] introduced a rumor-containing message-pushing mechanism, which has two defects: (1) the effect of the external environment on the spread of the rumor was neglected entirely and (2) the message-pushing rate function was regarded as a rumor-containing scheme, although in practice this function may not be under the direct control of the rumor-containing agency.
Discrete Dynamics in Nature and Society
In the present paper, we deal with the rumor-containing problem through the optimal control approach but from a more practical perspective. First, we consider the influence of the external environment on rumor spreading and propose a new rumor spreading model. Second, we regard the growth rate function of the rumor-containing cost as a rumor-containing scheme, and we define the cost-effectiveness of a rumor-containing scheme. Finally, we model the rumor-containing problem as the problem of finding the most cost-effective rumor-containing scheme, and we solve the problem by applying optimal control theory.

The Modeling of the Rumor-Containing Problem
This section is devoted to the modeling of the rumor-containing problem. Based on a novel individual-based rumor spreading model, we measure the cost-effectiveness of a rumor-containing scheme. On this basis, we reduce the rumor-containing problem to an optimal control model.

A Rumor Spreading Model.
Consider an OSN of $N$ users denoted $u_1$ through $u_N$. Let $G_{net} = (U, E)$ denote the topological structure of the OSN, i.e., the node set is $U = \{u_1, \ldots, u_N\}$, and each edge $(u_i, u_j) \in E$ stands for the fact that the users $u_i$ and $u_j$ are mutual OSN friends. Let $A = (a_{ij})_{N \times N}$ denote the adjacency matrix of $G_{net}$; that is,
\[
a_{ij} = \begin{cases} 1, & (u_i, u_j) \in E, \\ 0, & \text{otherwise.} \end{cases}
\]
Suppose a rumor is spreading over the network $G_{net}$. In order to mitigate the impact of the rumor, a rumor-containing agency must collect rumor-containing evidence through continuous investment in a prescribed time horizon $[0, T]$. For $0 \le t \le T$, let $C(t)$ denote the cumulative rumor-containing cost in the time horizon $[0, t]$. Then $dC(t)/dt$ stands for the growth rate of the rumor-containing cost at time $t$. We refer to the function $G$ defined by $G(t) = dC(t)/dt$, $t \in [0, T]$, as a rumor-containing scheme. Obviously, this scheme is under the control of the rumor-containing agency.
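The adjacency matrix defined above can be built directly from a friendship list. The following is a minimal sketch, assuming users are indexed from 0 and friendships are given as index pairs; the function name `adjacency_matrix` and the 4-user example network are hypothetical and serve only as an illustration.

```python
import numpy as np

def adjacency_matrix(n_users, friendships):
    """Build the symmetric 0/1 adjacency matrix A = (a_ij) of an OSN."""
    A = np.zeros((n_users, n_users), dtype=int)
    for i, j in friendships:
        if i == j:
            continue  # a user is not counted as their own friend
        A[i, j] = 1
        A[j, i] = 1  # friendship is mutual, so A is symmetric
    return A

# Hypothetical 4-user path network: 0-1, 1-2, 2-3
A = adjacency_matrix(4, [(0, 1), (1, 2), (2, 3)])
degrees = A.sum(axis=1)  # number of OSN friends of each user
```

Because friendship is mutual, each edge contributes two entries to the matrix, so the row sums give the number of friends of each user.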
For ease in realization, we assume that all feasible rumor-containing schemes are Lebesgue integrable [40]. Additionally, based on sociological evidence, it can be concluded that the growth rate of the rumor-containing cost is bounded, i.e., $G(t) \le \overline{G}$, $0 \le t \le T$, where $\overline{G}$ denotes the maximum allowable rate. Combining the above discussions, the set of all feasible rumor-containing schemes is
\[
\mathcal{G} = \left\{ G \in L[0, T] : 0 \le G(t) \le \overline{G},\ 0 \le t \le T \right\},
\]
where $L[0, T]$ denotes the set of all Lebesgue integrable functions defined on the interval $[0, T]$.
Based on sociological evidence and rational analysis, at any time $t \in [0, T]$ each network user is either rumor-uncertain, rumor-believing, or rumor-refusing. Rumor-uncertain means that a user's attitude toward the rumor is uncertain. Rumor-believing means that a user believes the rumor. Rumor-refusing means that a user does not believe the rumor. Let $O_i(t) = 0$, 1, and 2 stand for the fact that the user $u_i$ is rumor-uncertain, rumor-believing, and rumor-refusing at time $t$, respectively. We refer to $O_i(t)$ as the state of the user $u_i$ at time $t$, and to the vector
\[
O(t) = (O_1(t), \ldots, O_N(t))
\]
as the state of the network at time $t$. The network state evolves over time. In order to describe the evolutionary process of the network state, let us introduce the following notations:
(i) $\beta_1$ (resp., $\beta_2$): the probability with which, owing to the influence of a rumor-believing OSN friend, a rumor-uncertain (resp., rumor-refusing) user becomes rumor-believing at any time, and $\beta_1, \beta_2 > 0$.
(ii) $\alpha_1$ (resp., $\alpha_2$): the probability with which, owing to the influence of the external environment, a rumor-uncertain (resp., rumor-refusing) user becomes rumor-believing at any time, and $\alpha_1, \alpha_2 \ge 0$.
(iii) $c_1$ (resp., $c_2$): the probability with which, owing to the influence of a rumor-refusing OSN friend, a rumor-uncertain (resp., rumor-believing) user becomes rumor-refusing at any time, and $c_1, c_2 > 0$.
(iv) $\delta$: the probability with which, owing to limited memory, a rumor-believing or rumor-refusing user becomes rumor-uncertain at any time, and $\delta > 0$.
(v) $\theta_1(G)$ (resp., $\theta_2(G)$), $G \in [0, \infty)$: the probability with which, owing to the growth rate $G$ of the rumor-containing cost, a rumor-uncertain (resp., rumor-believing) user becomes rumor-refusing. Naturally, $\theta_1(0) = \theta_2(0) = 0$, and $\theta_1$ and $\theta_2$ are strictly increasing.
In practice, the first seven parameters can be estimated through online questionnaire surveys, and the last two functions can be approximated through regression based on historical data. In particular, we introduce the parameter $\alpha_1$ to characterize the effect of the external environment on the spread of rumors; $\alpha_1 = 0$ refers to the scenario that does not consider the influence of the external environment, which has been studied by many researchers [16][17][18][19][20][21][22][23][24][25][26][27][28][29][30]. In this paper, we consider the more general case $\alpha_1 > 0$.
Let $U_i(t)$, $B_i(t)$, and $R_i(t)$ denote the probability that the user $u_i$ is rumor-uncertain, rumor-believing, and rumor-refusing at time $t$, respectively:
\[
U_i(t) = \Pr\{O_i(t) = 0\}, \quad B_i(t) = \Pr\{O_i(t) = 1\}, \quad R_i(t) = \Pr\{O_i(t) = 2\}.
\]
Since $U_i(t) + B_i(t) + R_i(t) = 1$, we refer to the vector
\[
E(t) = (B_1(t), \ldots, B_N(t), R_1(t), \ldots, R_N(t))
\]
as the expected state of the network at time $t$. Let $E_0 = E(0)$ denote the initial expected network state.

Theorem 1.
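The evolutionary process of the expected network state is described by the following system of ordinary differential equations:
\[
\begin{aligned}
\frac{dB_i(t)}{dt} ={}& \Big[\alpha_1 + \beta_1 \sum_{j=1}^{N} a_{ij} B_j(t)\Big]\big[1 - B_i(t) - R_i(t)\big] + \Big[\alpha_2 + \beta_2 \sum_{j=1}^{N} a_{ij} B_j(t)\Big] R_i(t) \\
& - \Big[\theta_2(G(t)) + c_2 \sum_{j=1}^{N} a_{ij} R_j(t)\Big] B_i(t) - \delta B_i(t), \\
\frac{dR_i(t)}{dt} ={}& \Big[\theta_1(G(t)) + c_1 \sum_{j=1}^{N} a_{ij} R_j(t)\Big]\big[1 - B_i(t) - R_i(t)\big] + \Big[\theta_2(G(t)) + c_2 \sum_{j=1}^{N} a_{ij} R_j(t)\Big] B_i(t) \\
& - \Big[\alpha_2 + \beta_2 \sum_{j=1}^{N} a_{ij} B_j(t)\Big] R_i(t) - \delta R_i(t),
\end{aligned}
\tag{5}
\]
where $0 \le t \le T$, $1 \le i \le N$, and $E(0) = E_0$.
Proof. Let $\mathbb{E}(\cdot)$ denote the mathematical expectation of a random variable. Then, the rumor-uncertain user $u_i$ becomes rumor-believing at time $t$ at the expected rate $\alpha_1 + \beta_1 \sum_{j=1}^{N} a_{ij} B_j(t)$, the rumor-refusing user $u_i$ becomes rumor-believing at time $t$ at the expected rate $\alpha_2 + \beta_2 \sum_{j=1}^{N} a_{ij} B_j(t)$, the rumor-believing user $u_i$ becomes rumor-refusing at time $t$ at the expected rate $\theta_2(G(t)) + c_2 \sum_{j=1}^{N} a_{ij} R_j(t)$, and the rumor-believing user $u_i$ becomes rumor-uncertain at time $t$ at the expected rate $\delta$. The first $N$ equations in system (5) follow. The last $N$ equations in the system can be derived analogously.
System (5) is a novel individual-based rumor spreading model, in which the effect of the external environment is accounted for. For $1 \le i \le N$ and $0 \le t \le T$, let $f_i(E(t), G(t))$ and $f_{N+i}(E(t), G(t))$ denote the right-hand sides of the equations for $B_i(t)$ and $R_i(t)$ in system (5), respectively, and let $f = (f_1, \ldots, f_{2N})$. Then, model (5) is abbreviated as
\[
\frac{dE(t)}{dt} = f(E(t), G(t)), \quad 0 \le t \le T, \quad E(0) = E_0.
\]
Next, we show that model (5) is positively invariant.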
In what follows, let stand for the solution to model (5).
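To illustrate how an individual-based model of this kind can be simulated, the following sketch integrates the expected-state dynamics with the forward Euler method. The right-hand side is an assumption assembled from the parameter definitions of this section (each transition rate is an external or control term plus a per-friend term summed over neighbors); the 5-user ring network, all parameter values, and the names `rhs` and `simulate` are hypothetical.

```python
import numpy as np

def rhs(B, R, A, G, p):
    """Assumed right-hand side of the expected-state dynamics for all users."""
    U = 1.0 - B - R                                    # rumor-uncertain probability
    bel_u = p["alpha1"] + p["beta1"] * (A @ B)         # uncertain -> believing
    bel_r = p["alpha2"] + p["beta2"] * (A @ B)         # refusing  -> believing
    ref_u = p["theta1"](G) + p["c1"] * (A @ R)         # uncertain -> refusing
    ref_b = p["theta2"](G) + p["c2"] * (A @ R)         # believing -> refusing
    dB = bel_u * U + bel_r * R - ref_b * B - p["delta"] * B
    dR = ref_u * U + ref_b * B - bel_r * R - p["delta"] * R
    return dB, dR

def simulate(A, B0, R0, G_of_t, p, T=10.0, steps=2000):
    """Forward Euler integration on [0, T]; G_of_t maps time to the scheme value."""
    dt = T / steps
    B, R = B0.astype(float).copy(), R0.astype(float).copy()
    Bs, Rs = [B.copy()], [R.copy()]
    for k in range(steps):
        dB, dR = rhs(B, R, A, G_of_t(k * dt), p)
        B, R = B + dt * dB, R + dt * dR
        Bs.append(B.copy())
        Rs.append(R.copy())
    return np.array(Bs), np.array(Rs)

# Hypothetical 5-user ring network and parameter values (illustration only).
N = 5
A = np.zeros((N, N))
for i in range(N):
    A[i, (i + 1) % N] = A[(i + 1) % N, i] = 1.0
params = {
    "alpha1": 0.05, "alpha2": 0.01, "beta1": 0.2, "beta2": 0.05,
    "c1": 0.1, "c2": 0.05, "delta": 0.1,
    "theta1": lambda g: 0.8 * g / (1.0 + g),
    "theta2": lambda g: 0.6 * g / (1.0 + g),
}
Bs, Rs = simulate(A, np.full(N, 0.1), np.full(N, 0.1), lambda t: 0.5, params)
```

With a sufficiently small step size, the simulated probabilities stay in $[0, 1]$ and satisfy $B_i(t) + R_i(t) \le 1$, which is the discrete counterpart of the positive invariance of the model.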

The Optimal Control Modeling of the Rumor-Containing Problem.
For our modeling purpose, we need to estimate the total loss caused by the rumor, and then we will define cost-effectiveness. To this end, let $w$ denote the average loss per unit time of a rumor-believing user. In practice, $w$ can be estimated by assessing the potential consequence of the rumor under consideration.
Theorem 3. The expected loss caused by the rumor in the time range $[0, T]$ is
\[
w \int_0^T \sum_{i=1}^{N} B_i(t, G(t))\, dt. \tag{13}
\]
Proof. Let $dt > 0$ denote an infinitesimal. The average loss of the user $u_i$ in the time range $[t, t + dt)$ is $w\,dt$ or zero according to whether the user is rumor-believing at time $t$ or not. Hence, the expected loss of the user in the time range $[t, t + dt)$ is $w B_i(t, G(t))\,dt$. Summing over all users and integrating over $[0, T]$ yields equation (13).

Theorem 4. The rumor-containing cost in the time range $[0, T]$ is
\[
\int_0^T G(t)\, dt. \tag{14}
\]
Proof. Let $dt > 0$ denote an infinitesimal. The rumor-containing cost in the time range $[t, t + dt)$ is $G(t)\,dt$. Equation (14) follows.
Based on Theorems 3 and 4, for a rumor-containing scheme $G$, the total loss caused by a rumor can be measured by the quantity
\[
J(G) = w \int_0^T \sum_{i=1}^{N} B_i(t, G(t))\, dt + \int_0^T G(t)\, dt.
\]
In this paper, we use the quantity $J(G)$ to characterize cost-effectiveness: the smaller $J(G)$ is, the more cost-effective the rumor-containing scheme is. In practice, we hope to achieve maximum cost-effectiveness by minimizing $J(G)$.
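Based on the above discussions, we model the rumor-containing problem as the following optimal control problem:
\[
\min_{G \in \mathcal{G}} J(G) = \int_0^T F(E(t), G(t))\, dt \quad \text{subject to} \quad \frac{dE(t)}{dt} = f(E(t), G(t)),\ 0 \le t \le T,\ E(0) = E_0. \tag{16}
\]
Here, $\mathcal{G}$ is the set of feasible rumor-containing schemes, $f$ is the right-hand side of model (5), and $F(E(t), G(t)) = w \sum_{i=1}^{N} B_i(t, G(t)) + G(t)$. We refer to the optimal control problem as the rumor-containing model. Each instance of the model is given by a 14-tuple consisting of the network $G_{net}$, the seven parameters $\beta_1, \beta_2, \alpha_1, \alpha_2, c_1, c_2, \delta$, the two functions $\theta_1, \theta_2$, the loss rate $w$, the maximum allowable rate $\overline{G}$, the time horizon $T$, and the initial expected state $E_0$.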
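The objective functional can be evaluated numerically for any given scheme. The sketch below approximates $J(G)$ with a left Riemann sum while integrating the state with the forward Euler method. The right-hand side is an assumption assembled from the parameter definitions of Section 3; the ring network, parameter values, and the name `total_loss` are hypothetical.

```python
import numpy as np

def total_loss(A, B0, R0, G_of_t, p, w, T, steps=2000):
    """Approximate J(G) = integral over [0, T] of w * sum_i B_i(t) + G(t)."""
    dt = T / steps
    B, R = B0.astype(float).copy(), R0.astype(float).copy()
    J = 0.0
    for k in range(steps):
        G = G_of_t(k * dt)
        J += dt * (w * B.sum() + G)                    # running cost F(E(t), G(t))
        U = 1.0 - B - R
        bel_u = p["alpha1"] + p["beta1"] * (A @ B)     # uncertain -> believing
        bel_r = p["alpha2"] + p["beta2"] * (A @ B)     # refusing  -> believing
        ref_u = p["theta1"](G) + p["c1"] * (A @ R)     # uncertain -> refusing
        ref_b = p["theta2"](G) + p["c2"] * (A @ R)     # believing -> refusing
        dB = bel_u * U + bel_r * R - ref_b * B - p["delta"] * B
        dR = ref_u * U + ref_b * B - bel_r * R - p["delta"] * R
        B, R = B + dt * dB, R + dt * dR
    return J

# Hypothetical 5-user ring network and parameter values (illustration only).
N = 5
A = np.zeros((N, N))
for i in range(N):
    A[i, (i + 1) % N] = A[(i + 1) % N, i] = 1.0
params = {
    "alpha1": 0.05, "alpha2": 0.01, "beta1": 0.2, "beta2": 0.05,
    "c1": 0.1, "c2": 0.05, "delta": 0.1,
    "theta1": lambda g: 0.8 * g / (1.0 + g),
    "theta2": lambda g: 0.6 * g / (1.0 + g),
}
B0, R0 = np.full(N, 0.2), np.zeros(N)
J_no_control = total_loss(A, B0, R0, lambda t: 0.0, params, w=1.0, T=10.0)
J_constant = total_loss(A, B0, R0, lambda t: 0.5, params, w=1.0, T=10.0)
```

As a sanity check on the two terms of $J(G)$: if nobody believes the rumor initially and the rumor cannot enter from outside ($\alpha_1 = \alpha_2 = 0$), the loss term vanishes and $J(G)$ reduces to the pure rumor-containing cost $\int_0^T G(t)\,dt$.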

Dealing with the Rumor-Containing Model
This section aims to deal with the rumor-containing model (16). First, we show that the model is solvable. Second, we derive the optimality system for the model.

Solvability of the Rumor-Containing Model.
The following lemma is a direct corollary of a theorem in [32].

Lemma 1.
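The rumor-containing model (16) admits an optimal control if the following six conditions hold simultaneously, where $\mathcal{G}$ denotes the set of feasible rumor-containing schemes:
(1) $\mathcal{G}$ is closed.
(2) $\mathcal{G}$ is convex.
(3) There exists $G \in \mathcal{G}$ such that the system $dE(t)/dt = f(E(t), G(t))$, $0 \le t \le T$, is solvable.
(4) $f(E, G)$ is bounded by a linear function in $E$.
(5) $F(E, G)$ is convex on $G$.
(6) There exist $\rho > 1$, $d_1 > 0$, and $d_2$ such that $F(E, G) \ge d_1 |G|^{\rho} + d_2$.
We are ready to show the solvability of the rumor-containing model.

Theorem 5.
The rumor-containing model (16) admits an optimal control.
Proof. First, let $G^*$ be an accumulation point of the set $\mathcal{G}$. Then, there is a sequence of points, $G_1, G_2, \ldots$, in $\mathcal{G}$ that approaches $G^*$. On one hand, $G^* \in L[0, T]$ follows from the completeness of $L[0, T]$. On the other hand, $G^*(t) = \lim_{n \to \infty} G_n(t) \le \overline{G}$, $0 \le t \le T$, where $\overline{G}$ is the upper bound on feasible schemes. The closedness of $\mathcal{G}$ is proven.
Second, let $G_1, G_2 \in \mathcal{G}$, $0 < \sigma < 1$, and $G^* = (1 - \sigma) G_1 + \sigma G_2$. On one hand, $G^* \in L[0, T]$ follows from the fact that $L[0, T]$ is a vector space. On the other hand, it is obvious that $G^*(t) \le \overline{G}$, $0 \le t \le T$. The convexity of $\mathcal{G}$ is proven.
Thirdly, the solvability of the system $dE(t)/dt = f(E(t), G)$, $0 \le t \le T$, follows from the continuous differentiability of the function $f(E, G)$.
Next, the convexity of $F(E, G)$ on $G$ follows from its linearity in $G$. Finally, we have $F(E, G) \ge 0 \ge G^2 - \overline{G}^2$, so condition (6) holds with $\rho = 2$, $d_1 = 1$, and $d_2 = -\overline{G}^2$.
Hence, the claim follows from Lemma 1.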
The optimality system can be solved by invoking the well-known Forward-Backward Euler Method [42]. We refer to the control obtained in this way as a promising rumor-containing scheme for model (16), because the scheme may be optimal in terms of cost-effectiveness.
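A forward-backward iteration of this kind can be sketched as follows. This is an illustration under stated assumptions, not the procedure of [42]: the Hamiltonian is taken as $H = w \sum_i B_i + G + \lambda \cdot f(E, G)$ with adjoint $d\lambda/dt = -\partial H / \partial E$ and $\lambda(T) = 0$, in line with Pontryagin's principle; the right-hand side `f` is assembled from the parameter definitions of Section 3; gradients are taken by central finite differences; the minimizer of $H$ over $[0, \overline{G}]$ is found by grid search; and the update is relaxed, with the best iterate kept. The network, parameter values, and the names `make_f` and `sweep` are hypothetical.

```python
import numpy as np

def make_f(A, p):
    """Assumed right-hand side f(x, G) for x = (B_1..B_N, R_1..R_N)."""
    n = A.shape[0]
    def f(x, G):
        B, R = x[:n], x[n:]
        U = 1.0 - B - R
        bel_u = p["alpha1"] + p["beta1"] * (A @ B)
        bel_r = p["alpha2"] + p["beta2"] * (A @ B)
        ref_u = p["theta1"](G) + p["c1"] * (A @ R)
        ref_b = p["theta2"](G) + p["c2"] * (A @ R)
        dB = bel_u * U + bel_r * R - ref_b * B - p["delta"] * B
        dR = ref_u * U + ref_b * B - bel_r * R - p["delta"] * R
        return np.concatenate([dB, dR])
    return f

def sweep(A, p, x0, w, G_max, T, steps=100, iters=10, relax=0.5):
    n, dt = A.shape[0], T / steps
    f = make_f(A, p)
    grid = np.linspace(0.0, G_max, 11)            # candidate control values

    def H(x, G, lam):                              # Hamiltonian
        return w * x[:n].sum() + G + lam @ f(x, G)

    def forward(G):                                # state equation, forward in time
        X = np.empty((steps + 1, 2 * n))
        X[0] = x0
        for k in range(steps):
            X[k + 1] = X[k] + dt * f(X[k], G[k])
        return X

    def cost(G, X):                                # left Riemann sum of F
        return dt * (w * X[:steps, :n].sum() + G.sum())

    def grad_x(x, G, lam, h=1e-6):                 # dH/dx by central differences
        g = np.empty_like(x)
        for j in range(x.size):
            e = np.zeros_like(x)
            e[j] = h
            g[j] = (H(x + e, G, lam) - H(x - e, G, lam)) / (2 * h)
        return g

    G = np.zeros(steps)
    X = forward(G)
    best_G, best_J = G.copy(), cost(G, X)
    for _ in range(iters):
        lam = np.zeros((steps + 1, 2 * n))         # terminal condition lam(T) = 0
        for k in range(steps, 0, -1):              # adjoint equation, backward in time
            lam[k - 1] = lam[k] + dt * grad_x(X[k], G[k - 1], lam[k])
        G_new = np.array([grid[np.argmin([H(X[k], g, lam[k]) for g in grid])]
                          for k in range(steps)])
        G = relax * G + (1.0 - relax) * G_new      # relaxed control update
        X = forward(G)
        J = cost(G, X)
        if J < best_J:
            best_G, best_J = G.copy(), J
    return best_G, best_J

# Hypothetical 5-user ring network and parameter values (illustration only).
N = 5
A = np.zeros((N, N))
for i in range(N):
    A[i, (i + 1) % N] = A[(i + 1) % N, i] = 1.0
params = {
    "alpha1": 0.05, "alpha2": 0.01, "beta1": 0.2, "beta2": 0.05,
    "c1": 0.1, "c2": 0.05, "delta": 0.1,
    "theta1": lambda g: 0.8 * g / (1.0 + g),
    "theta2": lambda g: 0.6 * g / (1.0 + g),
}
x0 = np.concatenate([np.full(N, 0.3), np.zeros(N)])
best_G, best_J = sweep(A, params, x0, w=1.0, G_max=1.0, T=5.0)
```

Because the iteration starts from the zero scheme and keeps the best iterate, the returned scheme is never worse than doing nothing; convergence to an optimal control is not guaranteed by this sketch.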

The Cost-Effectiveness of the Promising Rumor-Containing Scheme
At the end of the previous section, we proposed the notion of promising rumor-containing scheme. In this section, we assess the cost-effectiveness of this scheme through comparative experiments.

Experiment Design.
In each of the following experiments, we conduct the following operations: (1) generate an instance of the rumor-containing model (16), (2) obtain a promising rumor-containing scheme for the instance by invoking the Forward-Backward Euler Method, and (3) compare this scheme with a set of static rumor-containing schemes in terms of cost-effectiveness. All the following experiments are carried out on a PC with an Intel® Core™ i5-7500 CPU @ 3.40 GHz and 8 GB RAM. Studies show that some social platforms such as Facebook [43], Twitter [44], and YouTube [45] have provided a way for rumors to generate and spread. To simulate the environment in which rumors spread, we choose three real-world OSNs. First, consider the Facebook network and the Twitter network provided by SNAP, the well-known network library [46]. Due to memory limitations, we randomly choose a subnet of each original network. Specifically, we choose a 100-node subnet of the original Facebook network (dataset name: ego-Facebook), denoted by $G_F$, and a 100-node subnet of the original Twitter network (dataset name: ego-Twitter), denoted by $G_T$. Second, consider the YouTube network in Network Repository [47]. We choose a 100-node subnet of the YouTube network, denoted by $G_Y$. Figure 2 displays these three networks.
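One way to draw a fixed-size subnet from a large edge list is sketched below. The paper only states that the subnets are chosen randomly; the breadth-first expansion from a random start node used here is an assumption, chosen so that the sampled subnet is connected. The function name `sample_subnet` and the synthetic ring network are hypothetical.

```python
import random
from collections import deque

def sample_subnet(edges, size, seed=0):
    """Sample a connected subnet with up to `size` nodes by BFS from a random node."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    rng = random.Random(seed)
    start = rng.choice(sorted(adj))
    chosen, queue = {start}, deque([start])
    while queue and len(chosen) < size:
        u = queue.popleft()
        for v in sorted(adj[u]):
            if v not in chosen and len(chosen) < size:
                chosen.add(v)
                queue.append(v)
    # Keep only the edges whose endpoints were both sampled (induced subnet).
    sub_edges = [(u, v) for u, v in edges if u in chosen and v in chosen]
    return chosen, sub_edges

# Synthetic 500-node ring used as a stand-in for a real edge list.
ring = [(i, (i + 1) % 500) for i in range(500)]
nodes, sub_edges = sample_subnet(ring, 100)
```

If the connected component of the start node has fewer than `size` nodes, the function returns that whole component, so the caller should check the size of the result when working with real data.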
Cost-effectiveness is the focus of this section. Common OSN platforms generally have information security agencies. When rumors break out on an OSN, the agency is responsible for collecting and disseminating truths to dispel rumors. For example, when President Trump declared on Twitter that mail voting would lead to a "rigged election," Twitter tagged his tweets with the label "Getting the facts about mailing votes" and redirected users to a fact-checking page that provided comprehensive investigation information about the misleading claim. In practice, the agency's budget is limited. To control the spread of rumors, the agency needs to find a cost-effective rumor control scheme, that is, a scheme that minimizes the total loss caused by rumors.

Experimental Results
Experiment 1. For the rumor-containing model $M = (G, \beta_1, \beta_2, \alpha_1, \ldots, \theta_2, T, E_0)$, $\alpha_1 = 0$ stands for ignoring the influence of the external environment on the propagation of a rumor. Consider three instances of the model that differ in the value of $\alpha_1$. As shown in Figure 3, compared with the situation that does not consider the effect of the external environment (i.e., $\alpha_1 = 0$), increasing $\alpha_1$ causes the value of $B(t)$ to increase, especially when $t$ is relatively small. This finding indicates that the external environment has a great effect on the expected probability of rumor-believing users in OSNs, especially in the early stages of the spread of rumors, and that ignoring the effect of the external environment leads to a serious underestimate of the propagation ability of rumors.
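The qualitative effect of $\alpha_1$ can be reproduced on a toy example. The sketch below assumes no control ($G \equiv 0$) and no rumor-refusing users at time 0; under the assumed form of the dynamics, $\theta_1(0) = \theta_2(0) = 0$ then keeps every $R_i(t)$ at zero, and the model reduces to $dB_i/dt = (\alpha_1 + \beta_1 \sum_j a_{ij} B_j)(1 - B_i) - \delta B_i$. The ring network, the parameter values, and the name `mean_believing` are hypothetical and do not correspond to the instances of Experiment 1.

```python
import numpy as np

def mean_believing(A, alpha1, beta1, delta, B0, T=20.0, steps=4000):
    """Average rumor-believing probability over time in the reduced model."""
    dt = T / steps
    B = np.asarray(B0, dtype=float).copy()
    curve = [B.mean()]
    for _ in range(steps):
        dB = (alpha1 + beta1 * (A @ B)) * (1.0 - B) - delta * B
        B = B + dt * dB                      # forward Euler step
        curve.append(B.mean())
    return np.array(curve)

# Hypothetical 5-user ring network (illustration only).
N = 5
A = np.zeros((N, N))
for i in range(N):
    A[i, (i + 1) % N] = A[(i + 1) % N, i] = 1.0

curves = {a: mean_believing(A, a, beta1=0.05, delta=0.1, B0=np.full(N, 0.05))
          for a in (0.0, 0.05, 0.1)}
```

In this reduced setting, a larger $\alpha_1$ yields a larger average rumor-believing probability at every time step, and the gap opens from the first step onward, which is consistent with the early-stage effect described above.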
In practice, the closed-form formula for the functions θ 1 and θ 2 can be approximated through regression based on historical data, and both θ 1 and θ 2 are monotonically increasing functions. Supposing that there is a rumor spreading on the network G F , we consider three different forms of θ 1 and θ 2 , and the experimental settings are as follows.

Experiment 2.
Consider three instances of the rumor-containing model, each with its own pair of functions $(\theta_1^k, \theta_2^k)$, $k = 1, 2, 3$; in the third instance, $\theta_1^3(x) = 1.5x/(4 + x)$ and $\theta_2^3(x) = x/(4 + x)$. Let $G_p$ denote the promising rumor-containing scheme. By applying the approach introduced in Section 4.2, we get a promising control $G_p$, which is shown in Figure 4. It is seen that the promising rumor-containing scheme $G_p$ of each of the three instances first stays at the maximum allowable rate and then drops to zero.
Furthermore, we compare the cost-effectiveness of the promising control strategy $G_p$ with that of a group of static control strategies $A = \{G_\alpha : \alpha = 0, 0.1, 0.2, \ldots, 1.0\}$. The comparison result can be found in Figure 5. It is seen that $J(G_p) < J(G_\alpha)$ for all $G_\alpha \in A$. The result shows that our proposed rumor-containing scheme $G_p$ obtains the highest cost-effectiveness; hence, it performs much better than all the static rumor-containing schemes.
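A comparison of this kind can be sketched on a toy instance. The code below evaluates $J$ for the static schemes $G_\alpha(t) \equiv \alpha$ (assuming that a static scheme means a constant rate) and for a family of switching schemes that stay at the maximum rate and then drop to zero, mirroring the shape observed for $G_p$. The switching family is only a stand-in for the promising scheme, which the paper obtains from the optimality system. The dynamics are assembled from the parameter definitions of Section 3; the ring network, parameter values, and the name `total_loss` are hypothetical.

```python
import numpy as np

def total_loss(A, B0, R0, G, p, w, T):
    """Left-Riemann/Euler estimate of J for a piecewise-constant control array G."""
    dt = T / len(G)
    B, R = B0.copy(), R0.copy()
    J = 0.0
    for g in G:
        J += dt * (w * B.sum() + g)
        U = 1.0 - B - R
        bel_u = p["alpha1"] + p["beta1"] * (A @ B)
        bel_r = p["alpha2"] + p["beta2"] * (A @ B)
        ref_u = p["theta1"](g) + p["c1"] * (A @ R)
        ref_b = p["theta2"](g) + p["c2"] * (A @ R)
        dB = bel_u * U + bel_r * R - ref_b * B - p["delta"] * B
        dR = ref_u * U + ref_b * B - bel_r * R - p["delta"] * R
        B, R = B + dt * dB, R + dt * dR
    return J

# Hypothetical 5-user ring network and parameter values (illustration only).
N, steps, T, G_max, w = 5, 400, 10.0, 1.0, 1.0
A = np.zeros((N, N))
for i in range(N):
    A[i, (i + 1) % N] = A[(i + 1) % N, i] = 1.0
params = {
    "alpha1": 0.05, "alpha2": 0.01, "beta1": 0.2, "beta2": 0.05,
    "c1": 0.1, "c2": 0.05, "delta": 0.1,
    "theta1": lambda g: 1.5 * g / (4.0 + g),
    "theta2": lambda g: g / (4.0 + g),
}
B0, R0 = np.full(N, 0.3), np.zeros(N)

# Static schemes G_alpha(t) = alpha for alpha = 0, 0.1, ..., 1.0.
static = {round(float(a), 1): total_loss(A, B0, R0, np.full(steps, a), params, w, T)
          for a in np.linspace(0.0, 1.0, 11)}

# Switching schemes: maximum rate for the first m steps, zero afterwards.
bang = {}
for m in range(0, steps + 1, 40):
    G = np.where(np.arange(steps) < m, G_max, 0.0)
    bang[m] = total_loss(A, B0, R0, G, params, w, T)
best_switch = min(bang, key=bang.get)
```

The switching family contains the static schemes $G_0$ and $G_{1.0}$ as its two extreme members, so its best member is at least as good as either of them; whether it beats the intermediate static schemes depends on the instance.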
Similarly, supposing that there is a rumor spreading on the network $G_T$, we consider three different forms of $\theta_1$ and $\theta_2$, and the experimental settings are as follows.

Experiment 3.
Consider three instances of the rumor-containing model, each with its own pair of functions $(\theta_1^k, \theta_2^k)$, $k = 1, 2, 3$; in the third instance, $\theta_1^3(x) = 0.8x/(1 + x)$ and $\theta_2^3(x) = 0.6x/(1 + x)$. Let $G_p$ denote the promising rumor-containing scheme. By applying the approach introduced in Section 4.2, we get a promising control $G_p$, which is shown in Figure 6. It is seen that the promising rumor-containing scheme $G_p$ of each of the three instances first stays at the maximum allowable rate and then drops to zero.
Furthermore, we compare the cost-effectiveness of the promising control strategy $G_p$ with that of a group of static control strategies $A = \{G_\alpha : \alpha = 0, 0.1, 0.2, \ldots, 1.0\}$. The comparison result can be found in Figure 7. It is seen that $J(G_p) < J(G_\alpha)$ for all $G_\alpha \in A$. The result shows that our proposed rumor-containing scheme $G_p$ obtains the highest cost-effectiveness; hence, it performs much better than all the static rumor-containing schemes.
Again, supposing that there is a rumor spreading on the network $G_Y$, we consider three different forms of $\theta_1$ and $\theta_2$, and the experimental settings are as follows.
Based on the results of Experiments 2-4, we can draw the following conclusions: (a) if no rumor-containing scheme is adopted, the spread of rumors causes great losses, and (b) the proposed rumor-containing scheme can greatly mitigate the impact of a rumor and performs much better than all the static rumor-containing schemes in terms of cost-effectiveness. Apart from the above three experiments, we conduct 1,000 similar experiments as well. In all these experiments, we obtain similar and consistent results.
Therefore, we conclude that the promising rumor-containing scheme is cost-effective.
Figure 6: The promising rumor-containing scheme $G_p$ for three pairs of $(\theta_1(x), \theta_2(x))$ functions.
Figure 5: A comparison between the three promising controls $G_P$ in Figure 2 and the set of static controls $A$ in terms of total cost.

Concluding Remarks
In this study, we have studied the problem of developing a cost-effective rumor-containing scheme. Based on a node-level rumor spreading model that takes account of the effect of the external environment, we have measured the impact of rumors. On this basis, we have modeled the rumor-containing problem as an optimal control problem. The optimization goal of the problem is to find a rumor-containing scheme that minimizes the total loss, and simulation results show that the proposed scheme is cost-effective. This work has studied the propagation of rumors from theoretical modeling and cost management perspectives. The research results can provide some theoretical guidance for taking measures to suppress the spread of rumors. The new model proposed in this paper can be used to study the influence of some parameters on the spread of rumors, and the research ideas can also be applied to study other cyberspace security problems.
The spread of rumors in the real world may be more complicated, and there are some open problems. First, since there is more than one OSN in the real world [48,49], the rumor-containing problem should be extended to multiplex OSNs. Second, since realistic OSNs vary over time [50,51], it is necessary to study the rumor-containing problem with dynamic OSNs. Third, in some application scenarios, the spread of a rumor can be captured by a system of partial differential equations [52][53][54]; it is worth adapting this work to these situations. Next, it is necessary to apply our methodology to some other areas such as disease spreading [55,56], malware propagation [57,58], and cybersecurity [59,60]. Finally, in this work, the rumormonger is assumed to be nonstrategic. In practice, however, the rumormonger may well be strategic. In this situation, the rumor-containing problem should be treated in the framework of game theory [61][62][63].

Data Availability
The data used to support the findings of this study are available, and the sources of the datasets have been given in the paper.

Conflicts of Interest
The authors declare that there are no conflicts of interest.

Authors' Contributions
Da-Wen Huang contributed to formal analysis, software, and writing of the original draft. Lu-Xing Yang contributed to formal analysis, investigation, and validation. Xiaofan Yang contributed to supervision, methodology, and writing, reviewing, and editing of the paper. Yuan Yan Tang contributed to supervision and writing, reviewing, and editing of the paper. Jichao Bi contributed to writing, reviewing, and editing of the paper.
Figure 8: The promising rumor-containing scheme $G_p$ for three pairs of $(\theta_1(x), \theta_2(x))$ functions: (a) $\theta_1(x) = \theta_1^1(x)$ and $\theta_2(x) = \theta_2^1(x)$, (b) $\theta_1(x) = \theta_1^2(x)$ and $\theta_2(x) = \theta_2^2(x)$, and (c) $\theta_1(x) = \theta_1^3(x)$ and $\theta_2(x) = \theta_2^3(x)$.
Figure 9: A comparison between the three promising controls $G_P$ in Figure 6 and the set of static controls $A$ in terms of total cost.