Defending against the Advanced Persistent Threat: An Optimal Control Approach

The new cyberattack pattern of advanced persistent threat (APT) has posed a serious threat to modern society. This paper addresses the APT defense problem, that is, the problem of how to effectively defend against an APT campaign. Based on a novel APT attack-defense model, the effectiveness of an APT defense strategy is quantified. Thereby, the APT defense problem is modeled as an optimal control problem, in which an optimal control stands for a most effective APT defense strategy. The existence of an optimal control is proved, and an optimality system is derived. Consequently, an optimal control can be computed by solving the optimality system. Some examples of the optimal control are given. Finally, the influence of some factors on the effectiveness of an optimal control is examined through computer experiments. These findings help organizations to work out policies of defending against APTs.


Introduction
Nowadays, the daily operation of most organizations, ranging from large enterprises and financial institutions to government sectors and military branches, depends largely on computers and networks. However, this dependency renders the organizations vulnerable to a wide range of cyberattacks. Traditional cyberattacks include computer viruses, worms, and spyware. Conventional cyber defense measures such as firewalls and intrusion detection have proved effective in withstanding these cyberattacks [1,2].
The cybersecurity landscape has changed drastically over the past few years. A new type of cyberattack, the advanced persistent threat (APT), has posed an unprecedentedly serious threat to modern society. According to reports, many high-profile organizations have experienced APTs [3], and the number of APTs has been increasing rapidly [4]. Compared with traditional cyberattacks, APTs exhibit two distinctive characteristics: (a) The attacker of an APT is a well-resourced and well-organized group whose goal is to steal as much sensitive data as possible from a specific organization. (b) Based on meticulous reconnaissance, the attacker launches a preliminary advanced social engineering attack on a few target users to gain footholds in the organization and then gains access to critical information stealthily and slowly [5][6][7]. Due to these characteristics, APTs can evade traditional detection, causing tremendous damage to organizations. To date, the detection of APTs is far from mature [8,9]. Consequently, the APT defense problem, that is, the problem of how to effectively defend against APTs, has become a major concern in the field of cybersecurity.
As a branch of applied mathematics, optimal control theory aims to solve a class of optimization problems in which, subject to a set of dynamic constraints, we seek to find a function (control) so that an objective functional is optimized [10,11]. In real-world applications, the set of dynamic constraints represents a dynamic environment, a control represents a time-varying strategy, and the objective functional represents an index to be maximized or minimized. Optimal control theory has been successfully applied to some aspects of cybersecurity [12][13][14][15][16][17][18][19]. To our knowledge, the APT defense problem has yet to be addressed in the framework of optimal control theory. To model the problem as an optimal control problem, we have to formulate an APT defense strategy as a control, characterize the state evolution of an organization as a set of dynamic constraints, and quantify the effectiveness of an APT defense strategy as an objective functional. The key to the modeling process is to accurately characterize the state evolution of an organization by employing the epidemic modeling technique [20].
This paper focuses on the APT defense problem. Based on a novel individual-level APT attack-defense model, the effectiveness of an APT defense strategy is quantified. On this basis, the APT defense problem is modeled as an optimal control problem, in which an optimal control represents a most effective APT defense strategy. The existence of an optimal control to the optimal control problem is proved, and an optimality system for the optimal control problem is derived. Therefore, an optimal control can be computed by solving the optimality system. Some examples of the optimal control are presented. Finally, the influence of some factors on the effectiveness of an optimal control is examined through computer simulations. To our knowledge, this is the first time the APT defense problem is dealt with in this way. These findings help organizations to work out policies of defending against APTs.
The remainder of this paper is organized as follows. Section 2 models the APT defense problem as an optimal control problem. Section 3 studies the optimal control problem. Some most effective APT defense strategies are given in Section 4. Section 5 discusses the influence of different factors on the optimal effectiveness. This work is closed by Section 6.

The Modeling of the APT Defense Problem
The goal of this paper is to solve the following problem.
The APT Defense Problem. Defend an organization against APTs in an effective way.
To achieve the goal, we have to model the problem. The modeling process consists of the following four steps.
Step 1. Introduce preliminary terminologies and notations.
Step 2. Establish an APT attack-defense model.
Step 3. Quantify the effectiveness of an APT defense strategy.
Step 4. Model the APT defense problem as an optimal control problem.
Now, let us proceed by following this four-step procedure.
2.1. Preliminary Terminologies and Notations.
Consider an organization with a set of $N$ computer systems labeled $1, 2, \ldots, N$. Let $G = (V, E)$ denote the access network of the organization, where (a) each node stands for a system, that is, $V = \{1, 2, \ldots, N\}$, and (b) $(i, j) \in E$ if and only if system $i$ has access to system $j$. Let $\mathbf{A} = [a_{ij}]_{N \times N}$ denote the adjacency matrix of the network, that is, $a_{ij} = 1$ or $0$ according to whether $(i, j) \in E$ or not.
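As a small illustration of the notation above, the following sketch builds the adjacency matrix of a directed access network from a list of access relations. The function name `adjacency_matrix` and the four-system edge list are hypothetical and chosen only for illustration; they do not come from the paper.

```python
def adjacency_matrix(n, edges):
    """Build the n x n adjacency matrix of a directed access network.

    Nodes are labeled 1..n as in the text; each pair (i, j) in `edges`
    means "system i has access to system j".
    """
    a = [[0] * n for _ in range(n)]
    for i, j in edges:
        a[i - 1][j - 1] = 1  # shift 1-based labels to 0-based indices
    return a


# Hypothetical 4-system organization (illustrative data only).
edges = [(1, 2), (1, 3), (2, 3), (4, 1)]
A = adjacency_matrix(4, edges)

# Row sums count how many systems each node has access to (its out-degree).
out_degree = [sum(row) for row in A]
```

Because access is directional, the matrix is generally not symmetric: here system 4 can reach system 1, but not the other way around.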
Suppose an APT campaign against the organization starts at time $t = 0$ and terminates at time $t = T$. Suppose at any time $t \in [0, T]$ every node in the organization is either secure, that is, under the defender's control, or compromised, that is, under the attacker's control. Let $X_i(t) = 0$ and $1$ denote the event that node $i$ is secure and compromised at time $t$, respectively. The vector
$$\mathbf{X}(t) = [X_1(t), \ldots, X_N(t)]^T$$
stands for the state of the organization at time $t$. Let $S_i(t)$ and $C_i(t)$ denote the probability of the event that node $i$ is secure and compromised at time $t$, respectively. That is,
$$S_i(t) = \Pr\{X_i(t) = 0\}, \quad C_i(t) = \Pr\{X_i(t) = 1\}.$$
As $S_i(t) + C_i(t) \equiv 1$, the vector
$$\mathbf{C}(t) = [C_1(t), \ldots, C_N(t)]^T$$
stands for the expected state of the organization at time $t$.
From the attacker's perspective, each secure node in the organization is subject to the external attack. Let $a_i$ denote the cost per unit time for attacking a secure node $i$. The vector
$$\mathbf{a} = [a_1, \ldots, a_N]^T$$
stands for an attack strategy. Additionally, each secure node is vulnerable to all the neighboring compromised nodes.
From the defender's perspective, each secure node in the organization is protected from being compromised. Let $x_i(t)$ denote the cost per unit time for protecting the secure node $i$ at time $t$. The vector-valued function
$$\mathbf{x}(t) = [x_1(t), \ldots, x_N(t)]^T, \quad 0 \le t \le T,$$
stands for a prevention strategy. Additionally, each compromised node in the organization is recovered. Let $y_i(t)$ denote the cost per unit time for recovering the compromised node $i$ at time $t$. The vector-valued function
$$\mathbf{y}(t) = [y_1(t), \ldots, y_N(t)]^T, \quad 0 \le t \le T,$$
stands for a recovery strategy. We refer to the vector-valued function $\mathbf{u} = [\mathbf{x}, \mathbf{y}]$ as an APT defense strategy.

An APT Attack-Defense Model.
For fundamental knowledge on differential dynamical systems, see [49]. For our purpose, let us impose a set of hypotheses as follows.
(H1) Due to the external attack and prevention, a secure node $i$ gets compromised at time $t$ at the average rate $a_i / x_i(t)$. The rationality of this hypothesis lies in that the average rate is proportional to the attack cost per unit time and is inversely proportional to the prevention cost per unit time.
(H2) Due to the internal infection and prevention, a secure node $i$ gets compromised at time $t$ at the average rate $\beta \sum_{j=1}^{N} a_{ji} C_j(t) / x_i(t)$, where $\beta > 0$ is a constant, which we refer to as the infection force. The rationality of this hypothesis lies in that the average rate is proportional to the probability of each neighboring node being compromised and is inversely proportional to the prevention cost per unit time.
(H3) Due to the recovery, a compromised node $i$ becomes secure at time $t$ at the average rate $y_i(t)$. The rationality of this hypothesis lies in that the average rate is proportional to the recovery cost per unit time.
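According to these hypotheses, the state transitions of a node are shown in Figure 1. Hence, the time evolution of the expected state of the organization obeys the following dynamical system:
$$\frac{dC_i(t)}{dt} = \frac{1}{x_i(t)} \left[ a_i + \beta \sum_{j=1}^{N} a_{ji} C_j(t) \right] [1 - C_i(t)] - y_i(t) C_i(t), \quad 0 \le t \le T, \ 1 \le i \le N.$$
We refer to the model as the APT attack-defense model.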
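The following is a minimal simulation sketch of the model implied by hypotheses (H1)-(H3), assuming static (time-invariant) controls and a plain forward Euler discretization. The function names `rhs` and `simulate`, the three-node chain network, and the chosen control values are hypothetical; the paper does not prescribe this code.

```python
def rhs(C, A, a, beta, x, y):
    """Right-hand side of the expected-state equations.

    Node i is compromised at rate (a_i + beta * sum_j a_ji * C_j) / x_i
    while secure (H1, H2) and recovers at rate y_i while compromised (H3).
    """
    n = len(C)
    out = []
    for i in range(n):
        infection = sum(A[j][i] * C[j] for j in range(n))  # sum_j a_ji C_j
        rate_in = (a[i] + beta * infection) / x[i]
        out.append(rate_in * (1.0 - C[i]) - y[i] * C[i])
    return out


def simulate(C0, A, a, beta, x, y, T, steps):
    """Forward Euler integration on [0, T]; returns the list of states."""
    h = T / steps
    C = list(C0)
    traj = [C]
    for _ in range(steps):
        dC = rhs(C, A, a, beta, x, y)
        C = [c + h * d for c, d in zip(C, dC)]
        traj.append(C)
    return traj


# Hypothetical 3-node chain 1 -> 2 -> 3 under static controls.
A = [[0, 1, 0], [0, 0, 1], [0, 0, 0]]
traj = simulate([0.1] * 3, A, a=[0.1] * 3, beta=0.001,
                x=[0.4] * 3, y=[0.4] * 3, T=20.0, steps=2000)
print([round(c, 3) for c in traj[-1]])
```

For an isolated node with constant controls, setting the right-hand side to zero gives the steady compromise probability $(a_i / x_i) / (a_i / x_i + y_i)$, which is a convenient sanity check for any implementation.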
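The APT attack-defense model can be written in matrix-vector notation as
$$\frac{d\mathbf{C}(t)}{dt} = \operatorname{diag}(\mathbf{1} - \mathbf{C}(t)) \operatorname{diag}(\mathbf{x}(t))^{-1} \left[ \mathbf{a} + \beta \mathbf{A}^T \mathbf{C}(t) \right] - \operatorname{diag}(\mathbf{y}(t)) \mathbf{C}(t), \quad 0 \le t \le T,$$
where $\mathbf{1}$ is the all-ones vector and $\operatorname{diag}(\mathbf{v})$ is the diagonal matrix with the entries of $\mathbf{v}$ on its diagonal.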

The Effectiveness of an APT Defense Strategy.
The defender's goal is to find the most effective APT defense strategy. To achieve the goal, we have to quantify the effectiveness of an APT defense strategy. For this purpose, let us introduce an additional set of hypotheses as follows.
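(H4) The prevention cost per unit time is bounded from above by $\overline{x}$ and from below by $\underline{x} > 0$, and the recovery cost per unit time is bounded from above by $\overline{y}$ and from below by $\underline{y} > 0$. That is, the admissible set of APT defense strategies is given by
$$U = \left\{ \mathbf{u} \in (L^2[0, T])^{2N} : \underline{x} \le x_i(t) \le \overline{x}, \ \underline{y} \le y_i(t) \le \overline{y}, \ 0 \le t \le T, \ 1 \le i \le N \right\},$$
where $L^2[0, T]$ denotes the set of all Lebesgue square-integrable functions defined on the interval $[0, T]$ [50].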
(H5) The amount of losses caused by a compromised node $i$ in the infinitesimal time interval $[t, t + dt]$ is $d_i \, dt$, where $d_i = \sum_{j=1}^{N} a_{ij}$ stands for the out-degree of node $i$ in the network. The rationality of this hypothesis lies in that the more nodes a node has access to, the more serious the consequence when it is compromised [51,52].
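According to the hypotheses, the expected loss of the organization in the time horizon $[0, T]$ when implementing an APT defense strategy $\mathbf{u} = [\mathbf{x}, \mathbf{y}]$ is
$$\mathrm{Loss}(\mathbf{u}) = \int_0^T \sum_{i=1}^{N} d_i C_i(t) \, dt,$$
and the overall cost for implementing the APT defense strategy is
$$\mathrm{Cost}(\mathbf{u}) = \int_0^T \sum_{i=1}^{N} \left\{ x_i(t) [1 - C_i(t)] + y_i(t) C_i(t) \right\} dt.$$
Hence, the effectiveness of the APT defense strategy $\mathbf{u}$ can be measured by the quantity
$$J(\mathbf{u}) = \mathrm{Loss}(\mathbf{u}) + \mathrm{Cost}(\mathbf{u}).$$
Obviously, the smaller this quantity, the more effective the APT defense strategy. Let
$$F(\mathbf{C}, \mathbf{u}) = \sum_{i=1}^{N} \left[ d_i C_i + x_i (1 - C_i) + y_i C_i \right].$$
Then
$$J(\mathbf{u}) = \int_0^T F(\mathbf{C}(t), \mathbf{u}(t)) \, dt.$$
Security and Communication Networks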
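The quantities of this subsection can be evaluated numerically once a state trajectory is available. The sketch below assumes that the expected loss accrues at rate $d_i C_i(t)$ per node (from (H5)) and that the defense cost accrues at rate $x_i(t)[1 - C_i(t)] + y_i(t) C_i(t)$ per node, since prevention is paid for secure nodes and recovery for compromised ones; this cost form is an assumption read off the definitions of the controls. The function name `evaluate` and the constant test trajectory are hypothetical.

```python
def evaluate(traj, x, y, d, T):
    """Return (loss, cost, J) on a uniform time grid via the trapezoidal rule.

    traj[k][i] is C_i at grid point k; x[k][i] and y[k][i] are the
    prevention and recovery costs per unit time at the same grid point.
    ASSUMPTION: running cost = d_i*C_i + x_i*(1 - C_i) + y_i*C_i.
    """
    steps = len(traj) - 1
    h = T / steps
    loss = 0.0
    cost = 0.0
    for k in range(steps + 1):
        w = h / 2 if k in (0, steps) else h  # trapezoidal end-point weights
        for i, c in enumerate(traj[k]):
            loss += w * d[i] * c
            cost += w * (x[k][i] * (1.0 - c) + y[k][i] * c)
    return loss, cost, loss + cost


# Illustrative single node held at C = 0.5 over [0, 20].
grid = 5
loss, cost, J = evaluate([[0.5]] * grid, [[0.1]] * grid, [[0.7]] * grid,
                         d=[2], T=20.0)
```

Keeping the loss and the cost separate, rather than returning only their sum, makes it easier to see which of the two terms drives a change in effectiveness.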

The Modeling of the APT Defense Strategy.
Based on the previous discussions, the APT defense problem boils down to the following optimal control problem:
$$\min_{\mathbf{u} \in U} J(\mathbf{u}) \quad \text{subject to the APT attack-defense model on } [0, T] \text{ with } \mathbf{C}(0) = \mathbf{C}_0. \qquad (16)$$
Here, each control stands for an APT defense strategy, the objective functional stands for the effectiveness of an APT defense strategy, the set of constraints stands for the time evolution of the expected state of the organization, an optimal control stands for a most effective APT defense strategy, and the optimal value stands for the effectiveness of a most effective APT defense strategy.

A Theoretical Analysis of the Optimal Control Problem
For fundamental knowledge on optimal control theory, see [10,11]. This section is devoted to studying the optimal control problem (16).

The Existence of an Optimal Control.
As an optimal control to problem (16) represents a most effective APT defense strategy, it is critical to show that the problem does have an optimal control. For this purpose, we need the following lemma [11].
Lemma 1. Problem (16) has an optimal control if the following five conditions hold simultaneously.
(C1) $U$ is closed and convex.
(C2) There is $\mathbf{u} \in U$ such that the adjunctive dynamical system is solvable.
Next, let us show that the five conditions in Lemma 1 indeed hold.

Lemma 4. There is $\mathbf{u} \in U$ such that the associated adjunctive dynamical system is solvable.
Theorem 8. Problem (16) has an optimal control.
This theorem guarantees that there is a most effective APT defense strategy.

The Optimality System.
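It is known that the optimality system for an optimal control problem offers a method for numerically solving the problem. This subsection is intended to present the optimality system for problem (16). For this purpose, consider the corresponding Hamiltonian
$$H(\mathbf{C}, \mathbf{u}, \boldsymbol{\lambda}) = F(\mathbf{C}, \mathbf{u}) + \sum_{i=1}^{N} \lambda_i f_i(\mathbf{C}, \mathbf{u}),$$
where $F$ is the integrand of the objective functional, $f_i$ is the right-hand side of the $i$th equation of the APT attack-defense model, and $\boldsymbol{\lambda} = (\lambda_1, \ldots, \lambda_N)^T$ is the adjoint.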
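Theorem 9. Suppose $\mathbf{u}^*$ is an optimal control to problem (16) and $\mathbf{C}^*$ is the solution to the adjunctive dynamical system with $\mathbf{u} = \mathbf{u}^*$. Then, there exists $\boldsymbol{\lambda}^*$ with $\boldsymbol{\lambda}^*(T) = \mathbf{0}$ such that, for $0 \le t \le T$, $1 \le i \le N$,
$$\frac{d\lambda_i^*(t)}{dt} = -\frac{\partial H(\mathbf{C}^*(t), \mathbf{u}^*(t), \boldsymbol{\lambda}^*(t))}{\partial C_i}, \qquad \mathbf{u}^*(t) \in \arg\min_{\mathbf{u}} H(\mathbf{C}^*(t), \mathbf{u}, \boldsymbol{\lambda}^*(t)),$$
where the minimum is taken over the admissible control values.
Proof. According to the Pontryagin Minimum Principle [10], there exists $\boldsymbol{\lambda}^*$ such that
$$\frac{d\lambda_i^*(t)}{dt} = -\frac{\partial H(\mathbf{C}^*(t), \mathbf{u}^*(t), \boldsymbol{\lambda}^*(t))}{\partial C_i}, \quad 0 \le t \le T, \ 1 \le i \le N.$$
Thus, the first $N$ equations in the claim follow by direct calculations. As the terminal cost is unspecified and the final state is free, the transversality condition $\boldsymbol{\lambda}^*(T) = \mathbf{0}$ holds. By using the optimality condition
$$\mathbf{u}^*(t) \in \arg\min_{\mathbf{u}} H(\mathbf{C}^*(t), \mathbf{u}, \boldsymbol{\lambda}^*(t)), \quad 0 \le t \le T,$$
we get the remaining equations in the claim.
Combining the above discussions, we get the optimality system (29) for problem (16), which consists of the state equations of the APT attack-defense model with the given initial state, the adjoint equations with the terminal condition $\boldsymbol{\lambda}(T) = \mathbf{0}$, and the pointwise minimization of the Hamiltonian over the admissible control values.
Applying the forward-backward Euler scheme to the optimality system, we can obtain an optimal control to problem (16), that is, a most effective APT defense strategy.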
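A forward-backward sweep of the kind mentioned above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation. It assumes the state equation $dC_i/dt = [a_i + \beta \sum_j a_{ji} C_j](1 - C_i)/x_i - y_i C_i$ and the running cost $\sum_i [d_i C_i + x_i (1 - C_i) + y_i C_i]$. Under these assumptions, minimizing the Hamiltonian pointwise gives $x_i = \sqrt{\lambda_i g_i}$ clipped to its bounds, where $g_i = a_i + \beta \sum_j a_{ji} C_j$, and $y_i$ at its upper bound when $\lambda_i > 1$ and at its lower bound otherwise. The names `pressure`, `sweep`, and `relax`, and the relaxed control update, are hypothetical choices.

```python
import math


def pressure(A, a, beta, C, i):
    """a_i + beta * sum_j a_ji * C_j: total attack pressure on node i."""
    return a[i] + beta * sum(A[j][i] * C[j] for j in range(len(C)))


def sweep(A, a, beta, d, C0, T, xb, yb, steps=200, iters=30, relax=0.5):
    """Forward-backward sweep; xb and yb are (lower, upper) bound pairs.

    Returns C, lam, x, y on the time grid. C and lam correspond to the
    controls in force before the final relaxed update.
    """
    n, h = len(C0), T / steps
    x = [[xb[0]] * n for _ in range(steps + 1)]
    y = [[yb[0]] * n for _ in range(steps + 1)]
    for _ in range(iters):
        # Forward Euler pass for the expected state C.
        C = [list(C0)]
        for k in range(steps):
            Ck = C[k]
            C.append([Ck[i] + h * (pressure(A, a, beta, Ck, i) * (1 - Ck[i])
                                   / x[k][i] - y[k][i] * Ck[i])
                      for i in range(n)])
        # Backward Euler pass for the adjoint, starting from lambda(T) = 0.
        lam = [[0.0] * n for _ in range(steps + 1)]
        for k in range(steps, 0, -1):
            Ck, lk = C[k], lam[k]
            for i in range(n):
                g = pressure(A, a, beta, Ck, i)
                spill = beta * sum(A[i][j] * lk[j] * (1 - Ck[j]) / x[k][j]
                                   for j in range(n))
                dlam = (-d[i] + x[k][i] - y[k][i]
                        + lk[i] * (g / x[k][i] + y[k][i]) - spill)
                lam[k - 1][i] = lk[i] - h * dlam
        # Pointwise Hamiltonian minimization, relaxed to damp oscillation.
        for k in range(steps + 1):
            for i in range(n):
                g = pressure(A, a, beta, C[k], i)
                x_new = min(xb[1], max(xb[0],
                                       math.sqrt(max(lam[k][i], 0.0) * g)))
                y_new = yb[1] if lam[k][i] > 1.0 else yb[0]
                x[k][i] = relax * x[k][i] + (1 - relax) * x_new
                y[k][i] = relax * y[k][i] + (1 - relax) * y_new
    return C, lam, x, y


# Hypothetical 3-node chain 1 -> 2 -> 3; d_i is the out-degree of node i.
A = [[0, 1, 0], [0, 0, 1], [0, 0, 0]]
d = [sum(row) for row in A]
C, lam, x, y = sweep(A, [0.1] * 3, 0.001, d, [0.1] * 3, 20.0,
                     (0.1, 0.7), (0.1, 0.7))
```

The sweep is not guaranteed to converge; the relaxation factor and the iteration count are tuning knobs, and in practice one would monitor the change in the controls between iterations before trusting the result.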

Some Most Effective APT Defense Strategies
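In this section, we give some most effective APT defense strategies by solving the optimality system (29). For ease in observation, let us introduce two functions as follows. For an admissible control $\mathbf{u}$ to problem (16), the first is the cumulative effectiveness $CE(t; \mathbf{u})$, $0 \le t \le T$, that is, the objective functional $J$ with the integration restricted to $[0, t]$; the second is the superposed control of $\mathbf{u}$, which aggregates the components of $\mathbf{u}(t)$ into a single function of time. Obviously, we have $CE(T; \mathbf{u}) = J(\mathbf{u})$.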
For some optimal control problems, let us give the cumulative effectiveness and superposed control for an optimal control.
Example 10. Consider problem (16) in which $G$ is a scale-free network with $N = 100$ nodes generated by executing the algorithm given in [53], $T = 20$, $\beta = 0.001$, $\underline{x} = \underline{y} = 0.1$, $\overline{x} = \overline{y} = 0.7$, $a_i = 0.1$, $1 \le i \le N$, and $C_i(0) = 0.1$, $1 \le i \le N$. An optimal control to the problem is obtained by solving the optimality system (29). Figure 2 plots the cumulative effectiveness and superposed control for the optimal control. For comparison purposes, the cumulative effectiveness and superposed control for three admissible static controls are also shown in Figure 2.
Figure 3 depicts the cumulative effectiveness and superposed control for the optimal control. For comparison purposes, the cumulative effectiveness and superposed control for three admissible static controls are also shown in Figure 3.
Example 12. Consider problem (16) in which $G$ is a realistic network given in [55], $T = 20$, $\beta = 0.001$, $\underline{x} = \underline{y} = 0.1$, $\overline{x} = \overline{y} = 0.7$, $a_i = 0.1$, $1 \le i \le N$, and $C_i(0) = 0.1$, $1 \le i \le N$. An optimal control to the problem is obtained by solving the optimality system (29). Figure 4 exhibits the cumulative effectiveness and superposed control for the optimal control. For comparison purposes, the cumulative effectiveness and superposed control for three admissible static controls are also shown in Figure 4.
It is seen from the above three examples that a most effective APT defense strategy is significantly superior to any static APT defense strategy in terms of effectiveness. This observation justifies our method. Additionally, the superposed control drops rapidly to a lower value.

Further Discussions
This section is devoted to examining the influence of different factors on the optimal effectiveness of an admissible APT defense strategy. For ease in understanding these influences, let us introduce three quantities as follows. For an optimal control $\mathbf{u}^*$ to problem (16), let $OL^*$, $OC^*$, and $OJ^*$ denote the corresponding expected loss, overall cost, and effectiveness, respectively. That is, $OL^*$ and $OC^*$ are the expected loss and the overall cost evaluated at $\mathbf{u}^*$, and $OJ^* = J(\mathbf{u}^*) = OL^* + OC^*$.

The Bounds on the Admissible Controls.
Clearly, the four bounds on the admissible controls affect the optimal effectiveness of an admissible APT defense strategy. Now, let us examine these influences.
(a) With the increase of the lower bounds, $OL^*$ goes down, but $OC^*$ and $OJ^*$ go up. In practice, the two lower bounds should be chosen carefully so that a balance between the expected loss and the overall cost is achieved.
(b) The influence of the two upper bounds on $OL^*$, $OC^*$, and $OJ^*$ is almost negligible.

The Network Topology.
Obviously, the topology of the network in an organization affects the optimal effectiveness of an admissible APT defense strategy. Now, let us inspect this influence.
Example 16. Consider a set of problems (16) in which $G \in \{G_k : 1 \le k \le 7\}$, where $G_k$ is a scale-free network with $N = 100$ nodes and a power-law exponent of $\gamma_k = 2.7 + 0.1 \times k$, $T = 20$, $\beta = 0.001$, $\underline{x} = \underline{y} = 0.1$, $\overline{x} = \overline{y} = 0.7$, $a_i = 0.1$, $1 \le i \le N$, and $C_i(0) = 0.1$, $1 \le i \le N$. Figure 8 displays the influence of the power-law exponent on $OL^*$, $OC^*$, and $OJ^*$, respectively.
It is seen from this example that, with the increase of the power-law exponent of a scale-free network, $OL^*$ and $OJ^*$ decline, but $OC^*$ increases. It is well known that the heterogeneity of a scale-free network decreases as its power-law exponent increases. Therefore, a homogeneously mixed access network is better in terms of the optimal defense effectiveness.
It is seen from this example that, with the increase of the randomness of a small-world network, $OL^*$, $OC^*$, and $OJ^*$ rise rapidly. Hence, a randomly connected access network is better from the perspective of the optimal defense effectiveness.

Concluding Remarks
This paper has addressed the APT defense problem, that is, the problem of how to effectively defend against APTs. By introducing an APT attack-defense model and quantifying the effectiveness of an APT defense strategy, we have modeled the APT defense problem as an optimal control problem in which an optimal control represents a most effective APT defense strategy. Through theoretical study, we have presented the optimality system for the optimal control problem. This implies that an optimal control can be derived by solving the optimality system. The influence of some factors on the optimal effectiveness of an APT defense strategy has been examined.
Many relevant problems remain to be resolved. The expected loss and overall cost of an APT defense strategy should be appropriately balanced to adapt to specific application scenarios. In practice, implementing a recommended defense strategy takes considerable effort: the security level of all the systems in an organization must be labeled accurately [6], the defense budget must be made, and the robustness of the defense strategy must be evaluated. As the topology of the access network in an organization may well vary with time, the approach proposed in this work should be adapted to time-varying networks [56][57][58][59]. It is of practical importance to deal with the APT defense problem in the game-theoretical framework, where the attacker is strategic [60][61][62][63].

Figure 1: The diagram of state transitions of a node under the hypotheses (H1)-(H3).

Figure 2: The cumulative effectiveness and superposed control for the optimal control and a few static controls in Example 10.

Figure 3: The cumulative effectiveness and superposed control for the optimal control and a few static controls in Example 11.

Figure 4: The cumulative effectiveness and superposed control for the optimal control and a few static controls in Example 12.