Modeling Coupled Nonlinear Multilayered Dynamics: Cyber Attack and Disruption of an Electric Grid

,


Introduction
A recent survey [1] of cyberattack in complex systems with both physical and cyber components in which the cyber system enables functionality of the physical system emphasizes the importance of improving our ability to understand this class of systems, and that it is critical to go beyond the attack-defense dynamics in cyberspace, but to include how changes in the cyber domain generate changes in physical outcomes (and possibly vice versa). at is, one wishes to assure that the physical system remains in the desired operating regime or out of undesired ones and needs to discover methods to achieve this goal. Modern power and communication systems are canonical examples of such systems [2,3] and in [1], the particular case of a smart grid subject disruption and stabilization through a Distributed Denial of Service (DDOS) or deception attacks that alters sensor information [4,5] is treated.
In this paper, we extend the ideas in [1] in a new direction by explicitly coupling the models of attack dynamics and the physical functional response of the system under attack. is allows us to analytically explore the system dynamics of the coupled nonlinear systems, in particular how parameterized attacks of certain types and behaviors result in effects observed in performance and functionality of the resulting system (in this case, a smart power grid). In doing so, we develop a modeling approach that creates the ability for attackers to optimize the timing and pace of their attacks, and for defenders and designers to select defense strategies, deploy defensive assets, and make architectural and design choices tuned to the coupled cyber-physical system.
To do this, we build on existing nonlinear dynamic models of compromise of cyber systems that import ideas from the population biology of disease [6][7][8][9][10][11], characterize performance using either a nonlinear metric for a generic physical system (cf. [6][7][8][9][10]) or the nonlinear dynamics of an electric grid, and introduce counter measures having nonlinear dynamics based on the immune systems [12,13].
Our goal is to develop a systems dynamics model to explore the roles of attack and maintenance rates, detection capability, and other design parameters on the dynamics, particularly how compromise of the cyber system propagates, and performance of the linked physical system responds. Since our goal is to understand the important variables and how they affect performance, we build a heuristic model [14] that is not specific to any particular situation but has commonalities with many complex systems. Using the model will help identify what to measure to be able to assess vulnerability to cyberattack, the consequences of attack on performance of the physical system, and to identify design tradeoffs and routes to defense.
We link the physical and cyber systems through a metric of performance of the physical system that depends upon the state of the cyber system, using (i) a generic nonlinear relationship between the state of the cyber system and the performance of the physical system and (ii) the synchronous motor model of an electric grid in which the load is a utility having consumers that used Advanced Metering Infrastructure (AMI; smart meters) that can be compromised.
In the next section, we develop the model for compromise of the cyber system, after which we introduce two metrics of performance of the physical system. We then illustrate the dynamics of the model for the cyber system, after which we treat two examples. First we show how the timing of attack when the attacker relies on stealth emerges from the nonlinear nature of both compromise and performance. We explore how performance of the physical system depends on parameters of compromise and performance and show how the optimal time of attack depends upon the parameters of the physical system and rates of external compromise and internal co-compromise.
Second, we explore compromise of Advanced Metering Infrastructure (AMI) and failure of the electric grid. AMI is an example of the Internet of ings, which increases the risk of cyber compromise because it increases the number of points of access for attackers [14][15][16][17][18][19]. In this example, we couple the model of compromise to the synchronous motor model of an electric grid to illustrate Electric Power Research Institute scenario AMI.27 [20] for reverse engineering of AMI, and a specific case of how compromising smart meters can lead to load-side failure of an electric grid. Because power grids are now considered critical infrastructure, data about them are often treated in a confidential manner [21,22]. By using a generally accessible model for the grid, we are able to clearly see how compromise propagates and interacts with performance of the grid.

Materials and Methods
We envision a complex system consisting of a physical system that is enabled by a cyber system that takes commands, exchanges information, and more generally interacts with the external world. Attacks on the cyber system lead to compromise of its components, which has an effect on the physical system. Our goal is to provide a framework for modeling compromise in the cyber system, linking the cyber and physical systems (both generically and specifically [the electric grid]), and use the model to explore the dynamics of compromise, attack, and recovery of the cyber system and the related performance of the physical system.
We first describe the conceptual framework for the model of the cyber system (Section 2.1), after which we derive the equations for the dynamics of compromise and recovery (Section 2.2). We then show how the dynamics simplify when the attacker relies on stealth (Section 2.3), in which compromise is built until either it is discovered during regular maintenance or an attack is executed. We consider two metrics of performance (Section 2.4): (i) a generic metric that depends upon the number of uncompromised cyber components, the number of compromised cyber components, and the level of Defensive Counter Measures (DCM) and (ii) a synchronous motor model for the electric grid for which the metric of performance is the stability of the grid.

Conceptual Description of the Cyber
Subsystem. An attack on a cyber system first requires external compromise: gaining access to components that interface with the external world. Once "inside" the cyber system, the second stage of internal co-compromise can commence, in which compromised components infect noncompromised components. Compromised components may immediately reduce the performance of the physical system or the adversary may hide compromise until ready to execute the attack.
In many cases, cyber components that interface with external world can be protected by external hardness [14] in which case antimalware prevents attackers from entering the cyber system. Cyber components that do not interface with the external world can similarly be protected from cocompromise by internal hardness [14]. External and internal hardness rely on a variety of mechanisms [21,23,24] that we do not model explicitly. However, it is now clear that external and internal hardness are necessary but not sufficient for both a reliable and resilient cyber system [21] and that resilience of the cyber system (and thus performance of the physical system) requires some form of Defensive Counter Measure (DCM) that returns the system to a state closer to the one before the attack. Protection from external compromise and internal co-compromise may not be effective (e.g., the installing antimalware does not defend) or may lose its effectiveness over time (e.g., the attacker discovers a way to circumvent the antimalware currently installed).
us, at any time, the cyber system consists of five classes of components ( Figure 1): First, uncompromised and vulnerable components can be compromised either externally or internally. We allow a fraction η of these components to be decoys, with no functionality, but instrumented to detect compromise with high probability. Second, uncompromised cyber components that are currently invulnerable to either external or internal compromise are temporarily protected against malware, but as time progresses their antimalware software ages and is no longer effective. ird, compromised cyber components are infected by malware. Fourth, once compromised components are discovered, they are temporarily removed from the cyber system to be restored or reset later [21]. ese components do not contribute to performance of the physical system. After some amount of time, components that are being reset return to the cyber system temporarily invulnerable (effective antimalware installed) or still vulnerable (either no antimalware installed or the installed antimalware is ineffective). Fifth, once compromise is detected, DCM are activated to discover and send compromised components into the resetting phase. Since DCM use cyber resources, we assume that their presence reduces the functionality of the physical system.

Dynamics of the Cyber System Components.
We assume that the cyber system has N components, with dynamics characterized by mass action in a mean field approximation [25][26][27][28][29].
Uncompromised and vulnerable cyber components, denoted by x 1 (t) transition to become compromised components (i) because of external compromise at a rate proportional to their numbers and (ii) because of internal co-compromise by previously compromised cyber components at a rate proportional to the numbers of both kinds of components, accounting for the decoy cyber components. In addition, temporarily invulnerable cyber components lose their protection and compromised cyber components that are reset without any protection increase the number of uncompromised and vulnerable components. We assume that the rate of these transitions is proportional to their number. Hence the dynamics are On the right hand side of equation (1), η is dimensionless; c is the rate of external compromise so that the fraction of uncompromised cyber components surviving external compromise from t to t + Δt is e − cΔt ; c s is the rate of co-compromise so that when there are y compromised cyber components at time t, the fraction of uncompromised cyber components surviving cocompromise to t + Δt is e − c s (1− η)yΔt ; g is the rate at which temporarily invulnerable cyber components become vulnerable; and f 1 is the rate at which compromised components being reset return vulnerable to compromise.
Uncompromised and temporarily invulnerable cyber components, denoted by x 2 (t), transition to the uncompromised and vulnerable state as they lose protection. Compromised cyber components that reset with temporary protection from external or internal compromise transition to the uncompromised and temporarily invulnerable states. We assume that the rates of these transitions are proportional to the number of components so that the dynamics are In this equation, f 2 is the rate at which compromised cyber components are reset and returned to the cyber system temporarily invulnerable. If f 1 ≠ 0 and f 2 � 0, then all cyber components that are reset return vulnerable to attack. On the other hand, if f 1 � 0 and f 2 ≠ 0, then all cyber components return temporarily invulnerable to attack. In the most general case, both f 1 and f 2 are nonzero, so that returned cyber components have a mixture of vulnerabilities.
Compromised cyber components, denoted by y(t), increase in number due to the processes in the first term on the right hand side of equation (1). Compromised cyber components transition to the resetting state due to regular maintenance during which compromised components may be discovered and when removed by DCM, denoted by z(t). We assume that the rates of these transitions are proportional to their number for regular maintenance and to the their number and the level of DCM when DCM are activated. We let Complexity I DCM (t) denote an indicator function that is 1 if DCM are activated at time t and is 0 otherwise. e dynamics of compromised cyber components are then where μ m is the rate of transition of compromised cyber components to resetting during regular maintenance and μ DCM is the rate of transition of compromised cyber components to resetting by DCM. Defensive Counter Measures are intensive malware eradication efforts. We do not specify the mechanism of the DCM, since they are situation dependent [23]. DCM are active only after compromise is detected, which may occur during regular maintenance, through recognition of external compromise, or anomalous performance of the physical system [30][31][32]. After compromise is detected, we model the dynamics of DCM, denoted by z(t), in analogy to T cells of the immune system [12,13].
If α � 0 in equation (4), DCM are active only when compromised cyber components are present and in absence of compromised cyber components, the only steady state is z � 0.
is is the situation we assume for computations. When α > 0, there is a positive steady-state level of DCM even when there is no compromise in the system, providing ready defense of the cyber system at the possible expense of performance of the physical system (see below). e parameter β controls the rate of increase of DCM; in particular, when β > 0, the rate of increase of DCM declines as the level of DCM increases.
Because cyber components are neither created nor destroyed, the numbers of uncompromised and vulnerable, uncompromised and temporarily invulnerable, compromised, and resetting components sums to N. is is captured by the second terms on the right hand sides of equations (1) and (2).

e Detection of Compromise.
A variety of methods have been proposed for detecting compromise [30], but in general the detection of compromise will be imperfect, because regardless of the specific process (e.g., signal matching or anomalous behavior) there will be both false positives and false negatives [14]. We let U(t) denote the probability that compromise remains undetected at time t.
We assume that the probability of discovering compromise in the next Δt units of time when the rate of external compromise is c and y cyber components are compromised is ([ε 1 (1 − η) + ε 2 η]y + ε c c)Δt + o(Δt) where ε 1 and ε 2 are the rates of detection of compromised nondecoy and compromised decoy cyber components, respectively, and ε c is the rate of detection of compromising activity when the rate of external compromise is c. In general, we anticipate that ε 2 > ε 1 . For simplicity, for computations here we set ε c � 0.
Assuming that compromise is undetected at the start, Equation (5) generates a trajectory U(t) for the probability that compromise is not detected by time t. We use this trajectory to characterize the stochastic process of detection.
at is, from U(t), we generate realizations of the time at which compromise is detected, and explore the consequences of different times of detection on the performance of the physical system.  Figure 1: e cyber system contains components that are uncompromised and vulnerable, uncompromised and currently invulnerable (hardened), compromised, and resetting (and thus temporarily unavailable system; metaphorically "in the shop"). A fraction η of the cyber components (shown only for uncompromised and vulnerable components) are decoys that do not contribute to functionality of the physical system and are unable to co-compromise [20]. Dotted lines represent transitions from one stage to another and solid lines represent either external compromise or co-compromise. 4 Complexity

Summary: a Mixed Deterministic-Stochastic Nonlinear
Model. e full model for dynamics of the cyber system consists of equations (1)- (5). Because of equations (1)-(4), the model is inherently nonlinear. Because of equation (5), it is stochastic, even though the rest of the dynamics are deterministic. We implement this mixed deterministic-stochastic model by solving equations (1)-(5) with I DCM ≡ 0 to generate the trajectory of U(t). Given this trajectory, we generate K times of detection, denoted by t d (k), k � 1, 2, . . . , K by drawing uniformly distributed random variables and comparing them with U(t). en, for every t d (k), we solve equations (1)-(5) to determine the entire trajectory of compromise, detection, and recovery from compromise.
For the base case, we used these parameters: All computations were done in R Studio Version 1.0.143 with underlying R version 3.6.1.

Simplification When the Attacker Relies on Stealth.
When an adversary relies on stealth, compromise is hidden until the attack is executed. at is, cyber compromise may be ongoing for a long period of time and undetected [14,21]. In general, the longer an attacker waits to initiate the attack, the more likely it is that compromise will be detected and removed [33][34][35], suggesting that an optimal time of attack will emerge from the dynamics of compromise and detection.
To model this situation, we assume that compromise (both external and internal) increases until (i) it is detected through regular maintenance or (ii) the attacker decides to launch an attack at time t A . We assume that if compromise is detected before the attack is launched, the value to the attacker is 0. Otherwise, when the hidden compromise becomes active, assessing the value of the attack requires a metric of utility for the attack, which we consider to be (1) the number of compromised cyber components or (2) the reduction in performance of the physical system.

Dynamics before Attack or Detection.
For simplicity, we assume all cyber components are initially vulnerable to compromise. Since compromise is hidden until attack or discovery, x 2 � 0 always and no cyber components are being reset. Consequently before attack or detection, only equations (1), (3), and (5) are relevant. at is, cyber components are either un-compromised and vulnerable or compromised, but compromised cyber components are not recognized until either detected or the attack is executed. We let x denote the number of un-compromised and vulnerable cyber components so that equations (1) and (3) reduce to Since which has solution from which y(t) follows directly.

e Value of Attack.
e probability that compromise remains hidden until the time of attack t A is U(t A ), determined by the solution of equation (5). e value of an attack requires that we introduce a utility function for the attacker. We will consider two choices. First, the value of attack may be measured in terms of the number of compromised cyber components at the time of attack. Second, the value of attack may be measured in terms of reduced performance of the physical system after the attack is executed.
e value of attack measured in terms of the expected number of compromised cyber components is Since y(t) is an increasing function of time and U(t) is a decreasing function of time, their product will have a peak, leading to an optimal time of attack and times of attack around that optimum that are "pretty" good, in the sense of giving nearly the same value as the value at the optimal time of attack.
Assessing the value of attack measured in reduced performance of the physical system due to compromised cyber components requires a metric ϕ(x, y, z) for the performance of the physical system when the cyber system has x uncompromised and y compromised components, and level of DCM z. We describe such a performance function in the next section.
Given ϕ(x, y, z), the value to the attacker is the difference between ϕ(N, 0, 0) -performance of the physical system in the absence of compromised cyber components and DCM-and ϕ(x(t A ), y(t A ), 0)-the performance of the physical subsystem when there are x(t A ) un-compromised cyber components and y(t A ) compromised cyber components at the time of attack, and no active DCM. e expected value of this performance function is Other choices of utility functions are possible, although as will be seen in the next section, a generic metric of performance of the physical system can capture a wide range of utility functions, from threshold relationships to linear ones. We also note that the value of attack to the adversary may not be the same as the cost to the owner of the system.

Metrics of Performance of the Cyber Physical System.
In general, the cyber and physical systems are coupled nonlinearly. We first consider a generic measure of Complexity performance in which performance of the physical system is a function of the number of uncompromised cyber components, the fraction of decoy cyber components, the number of compromised cyber components, and the level of DCM. Performance increases as the total number of uncompromised, nondecoy cyber components increases, and decreases as both the total number of compromised cyber components y increases and the level of DCM increases.
We then turn to the synchronous motor model of the electric grid, with stability as the metric [21,32], although we will show that both reliability and resilience of the grid emerge from the model.

A Generic Metric of Performance.
We model performance function of the physical system as follows. First, we assume that performance of the physical system increases with the number of uncompromised cyber components according to where x 50 is the number of uncompromised cyber components at which performance of the physical system is 50% of its maximum value (set to 1 without loss of generality) and σ x captures the dispersal in performance of the physical system ( Figure 2). Equation (11) can accommodate a variety of assumptions about performance of the physical system. For example, if performance of the physical system is approximately a linear function of the total number of uncompromised cyber components, then parameters giving the dotted line in Figure 2(c) are appropriate and x 50 can be chosen to give the 50% point of performance. On the other hand, if the physical system is a communications system and only one or a few cyber components need to be uncompromised for a message to get through, then a small value of x 50 is appropriate. In communications systems, performance will be determined by bandwidth, connectivity, and message accuracy, all of which can be compromised; mission performance metrics are accuracy, timeliness, and completeness of the information. Mapping these onto the sigmoidal functions requires detailed knowledge of the particular system and is thus beyond the scope of this paper.
We use a similar kind of sigmoid to characterize the reduction in performance due to compromise of the cyber system and assume that DCM use computational bandwidth, thus also reducing performance of the physical system. We assume that when the level of DCM is z, performance of the physical system is reduced by a factor e − ωz , where ω > 0.
A generic metric of performance of the physical system when a fraction η of the cyber components are decoys, and there are X � x 1 + x 2 uncompromised cyber components, y compromised cyber components, and the level of DCM is z, is then If the attacker needs to compromise many cyber components to degrade performance of the physical system, we set y 50 to a moderately large value. On the other hand, if the attacker can shut down the physical system by compromising a few cyber components, then y 50 will be small. e 1 − η terms in equation (12) capture that decoy components in the cyber system are not part of the functionality of the physical system.

2.4.2.
e One Generator-One Load-Many Consumers Electric Grid. One approach to modelling complex electric grids is to use second-order oscillator equations derived from Kirchoff's laws that balance power at each node of the grid . at is, a power grid is modeled as a system of electrically coupled devices that deliver power from generators to machines/loads via transmission lines and a key objective of management of electrical grids is that generators and loads are properly synchronized. When a load is too strong, loads are unevenly distributed, or a major disruption occurs, and a generator or load may lose synchrony [20]. If this is sufficiently strong, it may lead to grid failure [50][51][52][53][54][55].
In the synchronous motor model, the i th generator or load is characterized by a power balance equation [53] of the form P i � P i,diss + P i,acc + P i,transmitted where the terms on the right hand side are, respectively, the rate at which the motor dissipates energy, the rate at which it accumulates energy, and the rate at which it transmits energy. e state of each generator or load is described by the rotor angle, measured relative to a reference device rotating at a standardized frequency, and satisfying the swing equation [50][51][52][53][54] where Θ i is the deviation of the i th rotor from the reference frequency, I i is the moment of inertia of the rotor times the reference frequency, D i is the coefficient of damping/friction as the rotor revolves, and P i,mech and P i,elec are, respectively, the mechanical power generated and the net electrical power transmitted through transmission lines. A power grid is modeled as a coupled collection of equations similar to equation (13); and the deviation from the reference frequency for the i th generator or load, denoted by θ i , satisfies [53].
where δ i is the dissipation occurring in the i th generator or load, λ is the coupling strength of the transmission line, and A ij is a matrix whose entries are 1 if generators/loads i and j are connected, and 0 otherwise. P i > 0 for generators and 6 Complexity P i < 0 for loads. e power network will be in a steady state when consumption and generation of energy balance, so that if there are in total N generators and loads N i�1 P i � 0. In order to focus on compromise in the demand-side cyber subsystem, we use the simplest grid, as in [56], consisting of a single generator (P g � P 0 ) and a single load (P l � − P 0 ) (Figure 3) envisioned as a utility with many customers having AMI. e physical system consists of generators, substations, transformers, and towers and transmission lines; there are two cyber systems coupled to it. e first cyber system is the Supervisory Control And Data Acquisition (SCADA) [2,[37][38][39] system that collects measurements from substations and sends out control signals to equipment, substation automation or protection systems, and energy management systems.
e second is the cyber system of consumers with AMI. To simplify the analysis, we focus on compromise of the latter cyber system and how compromise in it can destabilize the grid.
We let θ g and θ l denote the phase deviations for the generator and utility. In this case, A ij reduces to a single value; we replace λA ij by K, for simplicity set the dissipation parameters δ g � δ l � δ, replace P i by P 0 for the generator and − P 0 for the load, so that the equations for generator and load are Subtracting the second equation from the first equation, we obtain a single second-order equation for the phase difference ϕ(t) � θ g (t) − θ l (t), written as the first-order system  50 and σ x . When X � x 50 , the fraction is 0.5 so that x 50 is the number of uncompromised cyber components at which performance of the physical system is 50% of its maximum value. e parameter σ x captures the dispersal in performance. As σ x declines, performance becomes more knife-edged; in the limit that σ x � 0 performance is a step function that is 0 for x < x 50 , 1/2 at X � x 50 , and 1 for X > x 50 . (b) Holding σ x � 10, we show the sigmoid for x 50 � 5, 40, or 80 (dashed, solid, and dotted lines, respectively). (c) Holding x 50 � 40, we show the sigmoid for σ x � 1, 10, or 30 (dashed, solid, and dotted lines, respectively).
For this model, the transmitted power is P trans ∝ K sin(ϕ) [53], and a natural metric of performance of this grid model is that the transmitted power is in a steady state. If P 0 ≤ K, this grid has a steady state at v � 0, ϕ � arcsin(P 0 /K). If P 0 > K, there is no steady state; instead the solution cycles. However, when P 0 < K but close to K, if the dissipation coefficient is small enough, a periodic solution (and therefore unstable operation of the grid) coexists with a steady state. In particular [57][58][59], for each value of P 0 , there is a critical value of the dissipation coefficient, which we denote by δ c (P 0 ) so that if δ > δ c (P 0 ) then there is no periodic solution and the grid will be globally stable. However, when 0 < δ < δ c (P 0 ), there is also a periodic solution ( [52], Figure 2) so that the operation of the grid will be unstable; Rohden et al. [52] identify this situation with a power outage. Major power grids are often operated in the region in which K is on the order of P 0 to reduce the cost of overcapacity in transmission lines [53], which makes the possibility of transition from a steady state to oscillatory or chaotic behavior more likely.

Results and Discussion
We begin (Section 3.1) by illustrating the basic dynamics of the model and associated performance of the physical system using the generic performance metric. ese show that a steady-state level of compromise is reached, even with active defense, consistent with the observation that one should assume that cyber systems have been penetrated and focus on determining the consequences of the penetration [14]. Because of the mixed deterministic-stochastic nature of the model of compromise, we report distributions for the time of detection and level of compromise at the time of detection and then show realizations of the dynamics. As a sensitivity analysis (Section 3.2), we explore how the dynamics of compromise and performance of the physical system depends upon the fraction of decoy cyber components η. We then (Section 3.3) show results when an attacker relies on stealth, particularly how the optimal time of attack emerges and how it depends upon parameters of the system and the choice of metric for value to the attacker. We begin the section on the grid model by briefly summarizing (Section 3.4.1) the dynamics of the one generator-one load grid in the absence of compromise and then (Section 3.4.2) couple the model of compromise to the model of the grid. We illustrate EPRI scenario AMI.27 [20] for reverse engineering AMI by showing how smart meters sending misleading signals about power demand can lead to load-side failure of the electric grid and derive a canonical condition for failure of the grid in terms of the number of compromised components at the time of detection and the dissipation parameter of the synchronous motor model.

Dynamics of Compromise and Recovery.
For these results, we set η � 0. Absent DCM, compromise is removed only through regular maintenance. In Figure 4, we show the dynamics of the number of cyber components (Figure 4(a)), and performance of the physical system and the probability that compromise is detected (Figure 4(b)). We use the dynamics of the probability of detection to create stochastic realizations of the time of detection (Figure 4(c)) and the number of compromised cyber components at the time of detection of compromise (Figure 4(d)).
e cyber system reaches a steady state in which compromised cyber components co-exist with vulnerable and temporarily invulnerable cyber components. Consequently, performance of the physical system declines to a steady state. Performance in Figure 4(b) can be compared directly to Figure 3.14 in [21], showing the time course of the fraction of the load delivered before, during, and after a cyberattack or Figures 2.1-2 Figure 3: An electric grid consisting of one generator (with rotor angle θ g ) and one load (with rotor angle θ l ), in this case envisioned as a utility company that has N consumers, indexed by n � 1, 2, 3, . . . , N, with AMI. Our analysis focuses on the consequences of compromise of the AMI rather than attacks on the cyber system of the power generating system. 8 Complexity physical system may increase because of the removal of compromise, performance of the physical system need not return to its previous state following disruption/attack (see below). Conditioned on time of detection, the behavior of the full system is captured by equations (1)-(5), which we illustrate by choosing 8 times of detection that capture most of the range in Figure 4(c).
e cyber system recovers to more than 90% of its initial state (left panels in Figure 5). at is in the steady state, and regardless of the time of detection of compromise, there are uncompromised and vulnerable, uncompromised and currently hardened, and compromised cyber components (with the remainder resetting). Compromised components of the cyber system persist in the steady state because both external attack and internal co-compromise continue. e number of compromised components remains at a low level because DCM are maintained at a nonzero level ( Figure 5, right hand panels, blue lines). e cost of a nonzero steady state of DCM is reduced performance. To illustrate this point, in Figure 6, we plot performance for the same times of detection as in Figure 5. Performance drops as compromise builds before detection of compromise and the minimum level of performance depends upon the time of detection of compromise. However, in the steady state, performance only returns to about 70% of its initial value.

e Role of Decoy Cyber Components.
To explore the role of decoy cyber components, we swept over eight values of η, ranging from 0 to 0.25 (in the spirit of [60]). For each value of η, we determined 300 times of detection. Conditioned on both η and the time of detection, we computed the dynamics of the cyber system and the performance of the physical system.
Because of the formulation (Equations (1), (3), and (5)) as η increases, the probability of detecting compromise by a given time will increase, and the number of compromised cyber components at a given time will decline (Figures 7(a)  and 7(b)). What cannot be predicted is that the standard deviation of both the time of detection and the number of compromised components at the time of detection of compromise decline as η increases.
In light of the results shown in Figures 5 and 6, we anticipate once DCM are activated, the system will recover and reach a steady state that is independent of both the time at which compromise is detected (the stochastic component) and varies in a deterministic way with η. e consequence (Figure 7(c)) is that although the mean of the steady-state performance of the physical system declines with η, the standard deviation of steady-state performance is virtually 0.
On the other hand, the minimum performance of the physical system depends upon the time at which compromise is detected (as in Figure 6), and thus on the value of η. Minimum performance is determined both by the level of compromise of cyber components at the time that compromise is detected and the response of DCM. In

Stealth and the Optimal Time of Attack.
When the attacker relies on stealth, the results become deterministic because the value to the attacker is an expectation over the time of detection In Figure 8(a), we show the time course of the two value functions. An optimal time of attack clearly emerges from these figures, as does a range of times of attack for which the value of attack is "pretty good" [61]. Conditioned on the other parameters, we predict the attack will occur earlier when value to the attacker is measured in terms of loss of performance of the physical system rather than the number of compromised cyber components. In Figures 8(b) and 8(c) we show the optimal time of attack across a range of values of c and c s .

Load Side Failure of the One Generator-One Load Grid due to Compromise in AMI.
We begin by summarizing the properties of the grid model in the absence of any compromise and then link it to the model of compromise.

Properties of the Grid Model in the Absence of
Compromise. In the absence of compromise, the parameters K � 0.4, δ � 0.15, and P 0 � 0.25K, 0.35K or 0.55K and initial conditions ϕ(0) � π/4 and v(0) � 0 lead to a steady state for equation (16). After transient initial dynamics, the steady state given by v � 0, ϕ � arcsin(P 0 /K) is reached. Since transmitted power is P trans ∝ K sin(ϕ) and the steady-state value of ϕ increases with P 0 , transmitted power also increases with P 0 .
As discussed above, as P 0 approaches K from below, instability may occur. For example, with all other parameters as above, for P 0 equal to 0.94K or 0.95K, the grid reaches a steady state, but for P 0 � 0.96K an oscillatory solution ese results suggest that one route to the instability of the electric grid (which we explore in detail below) is increased demands of power. To illustrate the concept without the complexity of the model for compromise and detection, we solved equation (16) for δ � 0.15, K � 0.4 and P 0 � 0.5K(1 + ε p ) where ε p � 0.75, 0.8, 0.85, 0.9, and 0.95 characterizes the increase in demanded power. In this case, P 0 /K � 0.5(1 + ε p ) < 1 so that a steady state exists. However, we expect instability before ε p reaches 1. is is indeed the case-for a value of ε p between 0.831 and 0.832, the grid loses stability. Furthermore, increasing δ makes the system more stable: when δ � 0.2, the value of ε p leading to instability falls between 0.8588 and 0.8589 and when δ � 0.25 the value of ε p leading to instability falls between 0.8844 and 0.8845. is is an example of criticality in physical systems as they operate closer to their capacity and become more fragile to perturbations. One of the roles of models such as we develop here is to help recognize and characterize potential modes of failure of the physical system.

Load
Side Failure of the One Generator-One Load Grid due to Compromise of AMI. We couple the models for compromise and its detection and for the electric grid by replacing P 0 by P 0 (1 + ε d y(t)), where ε d is how much one compromised AMI increases the demand for power, and y(t), computed from equation (3), is the number of compromised AMI at time t. We assumed ε d � 0.04 for computations. In Figure 9, we show transmitted power assuming that the dynamics of compromise and times of detection are the same as in shown in Figure 5. e grid shows a signal of compromise with a secular, almost linear increase in the demand for power (the portion of the trajectories before the time of detection represented by the vertical dotted line). If detection occurs early enough (as in the first six panels), DCM are activated, compromise is reduced, and the grid returns to a stable steady state. However, as shown in the bottom two panels, if detection occurs too late, then the grid enters a region of instability even after DCM are activated.
To compute the risk to the grid due to compromise of AMI in the cyber system, we note that when power demanded by the load is P 0 (1 + ε d y(t)) and compromise is detected at time t d , the condition for instability of the grid is which also can be written as Equation (18) is a canonical relationship linking compromise and load-side failure of the grid. Since detection of compromise is a stochastic process, the time of detection and  (16) is replaced by P 0 (1 + ε d y(t)) with ε d � 0.04 representing how much one compromised AMI increases the demand for power, and y(t) is the number of compromised cyber components. e dynamics of compromise and times of detection are the same as in shown in Figure 5-the vertical dotted line denotes the time at which compromise is detected.

Complexity 13
level of compromise at the time of detection vary. Because detection of compromise is a random process, the level of compromise on the left-hand side of Equation (18) is a random variable. e risk of grid failure is the probability that the level of compromise on the left-hand side of Equation (18) exceeds the ratio on the right-hand side of that equation. In Figure 10, we show risk to the grid as a function of the per-compromised AMI increase in demanded power. If the per-compromised AMI increase in demanded power ε d is small enough, the risk to the grid from compromise is virtually 0. ere is a range in which the risk to the grid is a linear function of ε d . For large enough values of ε d , failure of the grid is guaranteed. us, knowing the increase in demanded power by a compromised cyber component can be useful for predicting grid failure or not. For example, drawing a horizontal line in Figure 10(a) or 10(b) at the level of risk to the grid that can be tolerated and locating the value of ε d at which the line intersects the curve will provide information on the value of per AMI increased demand in power that can be tolerated. e increase in demand is an anomalous operating condition, which may be used for detection of compromise ( [21], Recommendation 4.10, pg 92).

Conclusions
Our work involves coupling of a set of nonlinear differential equations for the dynamics of compromise and defense of the cyber system with either (i) a nonlinear generic performance function to understand cyberattack via stealth or (ii) the nonlinear rotor equations for an electric grid to understand the role of load-side compromise on the stability of the grid. Doing so allowed us to explore how an optimal time of attack when using stealth emerges (Figures 8 and 9) and to derive a canonical condition (Equation (17) or (18)) for stability of the electric grid in the face of load-side compromise.

Novel Contributions of this Paper and Connection to Recent Other Work.
e innovations of our work include (i) a fraction of decoy components in the cyber system, which are not connected to the rest of the cyber system or the physical system and thus do not spread compromise but increase the probability of detection of compromise, (ii) allowing components of the cyber system to return to the un-compromised state either temporarily invulnerable or immediately vulnerable, (iii) adaptive Defensive Counter Measures that respond adaptively to attack and compromise, (iv) a generic metric of performance of the physical system that depends upon the state of the cyber system, and (v) coupling a model of the electric grid to the model of compromise of the cyber system that leads to a condition for failure of the grid in terms of parameters of both compromise and the synchronous motor model.

Directions for Future Work.
We discuss future work that is related to the model of cyber compromise and the generic performance function, the model of cyber compromise and the electric grid, and how the methods and results presented in our paper relate to recent studies [1,62] on the general topic of security in systems with cyber and physical components.

Related to Cyber Compromise and the Generic Performance Function.
Topics emerging from the model for cyber compromise and generic performance function warranting future investigation include: (i) e dynamics of compromise and recovery of the cyber system show that there is an opportunity for further investigation of operational resilience by planning for degradation of the cyber system when designing the linked physical system and (ii) when the attacker relies on stealth, a natural extension of our results is to explore how Figure 8 changes when the probability of detection depends upon the rate of external compromise, which sometimes generates a "wake" that is detectable.

Related to Cyber Compromise and the Electric Grid.
We have shown how load-side compromise can destabilize an electric grid. If the metric of performance for an electric grid is the fraction of the load delivered, then equation (11)  does not need any kind of rescaling to capture performance of the physical system, as can be seen by comparison of our Figure 2 with Figure 3.14 in [21]. Although defending the grid by disconnecting the utility from the generator is one route to defense, it is an extreme solution. On the other hand, developing a mechanism to recognize when individual consumers are demanding an anomalous amount of power and then disconnecting those consumers from the utility is a potential defense. Smart meters report on regular schedule with a defined message structure in a predictable way [21]. Signs of compromise or malfunction could include another reporting structure, reports at unusual times, or reported values that are outside of the usual range [18]. An implication of these results is that it is important to match the detection strategy to the willingness to accept risk when the underlying demand for power has a temporal pattern.
at is, at times of high demand for power, compromise in the cyber components of AMI may lead to instability of the grid, suggesting that the strategy for detection of compromise should be adjusted to the pattern of demands for power. Models such as we have developed here can help in the allocation of resources for detection. To make this idea fully operational, one would include a detection model [30] with false positives (consumers are not requesting additional power but are seen to be doing so), false negatives (consumers with compromised AMI are not detected), and an economic model to assess the costs of power disruptions [2]. Doing so is beyond the scope of this paper.
At the level of the utility, one defense is to have a mechanism, via generalized governors or electric springs [63], that increases the value of δ as the perceived demand for power increases. In this case, the cost is that the utility operates less efficiently as δ increases. As in the case of disconnecting consumers, developing the requisite models is feasible but beyond the scope of this paper.
Another future step is to expand the number of generators and loads. e Zealand grid [53] is an excellent candidate for such a next step. Alternatively, one of the recommendations concerning resilience of electric grids is to rely on various types of micro-grids [21]. By their very structure, such grids will have many points of access that interface with the external world [51], thus raising an entirely new set of issues for which the models in this paper can be applied. Similarly, the development of consumer-generated solar power that can be moved to the electric company (thus changing the historical one way distribution of electricity from the power company to the consumer to bidirectional transfer of power) raises issues for which the models developed here can be adapted. Furthermore, as the electric grid becomes more and more dispersed, resistance to and recovery from cyberattacks will increasingly depend upon rapid or even real-time measurements and responses [22].

Connections to Other Recent Work.
We now briefly discuss how the ideas in our paper link to [1] and an additional recent survey [62] of filtering in networked nonlinear systems. In [1], the authors describe approaches to tolerate cyberattacks (based approaches such as game theory or control theory); these can be explicitly modeled and tested for functional effectiveness of the cyber system using our approach. Our work complements that in [1] and in the future, our work can be used to explore the impact of cyberattack on a smart grid, examining the potential for access-and-maneuver types of attacks to disrupt system control and place the power system into ineffective and, in some cases, destructive states. e survey in [62] raises a number of potential extensions of our work and directions for future exploration including.
(i) Our methods can be used for the analysis of particular communications and contract protocols, investigating within the system context which protocols are more or less resilient than others and the performance and effectiveness of different communications protocols/policies under different network conditions. (ii) Using our methods, one can directly experiment with computational representation of more granular network-induced complexities, and validate the generality of the analytic method and associated assumptions. (iii) A natural extension of our analytic methods is to treat other types or sources of volatility that may propagate or affect nodes in different ways. (iv) Because of the explicit representation of the compromise of the cyber system, our methods can be used to explore how filtering methods can be used to improve response in the face of loss of nodes when nodes play both sensing and communication roles and to explore the consequences when sensor and communications functions are impacted asymmetrically [64]. (v) At a more theoretical level, our methods can be used to examine applicability of set-membership filtering to deal with system information censored or otherwise unavailable due to cyberattack [65].

Final Conclusion.
In this paper, we provided a framework based on the population biology of disease for the analysis of compromise in complex systems with cyber and physical systems, gave examples of how the framework could be used for both a generic metric of performance and for a metric of performance involving the electric grid, and suggested opportunities for further explanation. Much remains to be done using this framework.
other individuals to access the ideas directly. us, efficiency of computation has been sacrificed for clarity of development and presentation. In general, we adapt the form of the pseudocode from [65] and continue to use the mathematical notation from the paper, which is modified in clear ways in the actual code. Rscript can be obtained from the first author at marcmangel@protonmail.com.
A.1. Create the Generic Performance Function. To create the generic performance function in equation (12), use these steps.

A.3. Run the Model of Compromise in the Absence of Defensive
Countermeasure. In order to produce the results shown in Figure 4(a), solve equations (1)- (3) and (5) in the absence of DCM (that is, set z ≡ 0). We used the 4 th order Runge Kutta scheme in the package deSolve in R. e steps are as follows: (i) Specify the time increment dt when solving equations (1)- (3) and (5). (ii) Dimension the dynamic variables x 1 , x 2 , y, and U as vectors of length T/dt. (iii) Specify the initial conditions; if all components are initially uncompromised (as in this paper) these are x 1 � N, x 2 � 0, y � 0 and U � 1. (iv) To produce Figure 4(b), solve the differential equations and link to the generic performance function from Section A.1.

A.4. Create the Distribution of Times at which Compromise is Detected and the Level of Compromise at the Time of Detection.
In order to produce the results shown in Figures 4(c) and 4(d), follow the steps given below: (i) Specify the number of replicates N sim of the time of detection and create vectors t dec and y dec of length N sim (ii) Cycle n sim from 1 to N sim (iii) For each n sim , draw a random variable U uniformly distributed on [0, 1] and determine the time t for which U(t) > U and U(t + dt) ≤ U. Set t dec (n sim ) equal to this time and y dec � y(t dec ), where y(t) is computed from Section A. 3.
A.5. Run the Model of Compromise when Defensive Countermeasures are Activated. In order to produce the results shown in Figures 5 and 6 in which a range of times of detection is systematically evaluated, we now solve the full deterministic-stochastic model, which requires the indicator function I DCM (t), which will be 0 for times less than the time of detection t dec and 1 for times greater than it. We model this as the cumulative Gaussian distribution function, which is essentially a step function for small enough standard deviations. us, in the code, we compute by I DCM (t) by adding one more equation to equations (1)- (5): where σ I is the standard deviation (for computations, we used σ I � 0.1) and t dec is the time at which compromise is detected. Once equation (A.1) is appended to equations (1)-(5), one proceeds as follows: (i) Specify the value of σ I and the range of times of detection; for the results shown in the paper, we used the vector t dec � (10,15,20,25,30,35,40,45). (ii) Cycle over detection times. (iii) Follow the same steps as in Section A.3 with equation (A.1) appended. (iv) For each value of t dec , confirm that the I DCM (t) so generated is essentially a step function at t dec .
A.6. Sweeping Over the Fraction of Decoy Components. In order to explore the role of decoy components (Figure 7), we convert η from a scalar to a vector; for computations we used η � (0, 0.025, 0.05, 0.0075, 0.1, 0.15, 0.2, 0.25) and then proceed as follows: (i) Convert the fraction of decoy components to a vector with the range to be explored. (ii) Cycle over η (iii) For each value of η repeat the steps in Sections A.3-A.5; doing so generates all of the data needed for Figure 7.
A.7. When the Attacker Relies on Stealth. When the attacker relies on stealth (Equation (8), Figure 8), we have an explicit solution for the number of uncompromised cyber components (x(t) in equation (8)) and the number of compromised cyber components (y(t) � N − x(t). is allows rapid exploration by sweeping over the rate of external compromise c and rate of co-compromise c s , implemented with these steps.
(i) Specify the range over which the rates of compromise and co-compromise are explored and create or replace the scalars c and c s by vectors. Specify a vector for the time t A at which attack is initiated. (ii) Cycle over the rate of external compromise c and the rate of co-compromise c s (iii) Cycle over time from t � 0 to t � T. (iv) For each time, compute x(t) from equation (8)  A.8. e One Generator-One Load Model for the Electric Grid in the Absence of Compromise. As noted in the text, the one generator-one load model may show instability depending on the parameter values in equation (16), for which the transmitted power is proportional to K sin(ϕ). In order to explore the nature of this instability, proceed with these steps.
(i) Specify the scalars K and δ and a vector for the values of P 0 in equation (16). (ii) As above, dimension vectors for v and ϕ of length T/dt and specify their initial values. Solve equation (16) (we used deSolve in R). (iii) Repeat the previous step with different values of δ to numerically determine the value of δ at which an instability develops for given values of K and P 0 . (iv) In order to initially explore the failure of the grid due to increasing demand, choose values of K, δ, and P 0 for which the solution of equation (16) is stable. Create a vector ε p that increases the demanded power. (v) Cycle over ε p and for each value of it, solve equation (16) as above to determine whether the grid has stable or oscillatory behavior.

A.9. Couple the Model of Compromise and the Mode of the Electric Grid.
One is now in a position to couple the models for compromise and the electric grid, in which P 0 in equation (16) is replaced by P 0 (1 + ε d y(t)) where y(t) is determined during the solution of equations (1)-(5) and proceeds with these steps.
(i) Specify the value of ε d (ii) Meld Sections A.5 and A.8 above, with P 0 in equation (16) being replaced by P 0 (1 + ε d y(t)). is will be sufficient to produce the results shown in Figure 9. To produce the results shown in Figure 10, one adds the steps. (iii) Specify a vector for the increase in demanded power ε d (which provides the x-axis in Figure 10) and critical values for ε p determined from Section A.8; denote this vector by ε c .
(iv) For each combination of time of detection t dec , ε d , and ε c , find the fraction of y(t dec ) that exceeds the threshold given in equation (18).

Data Availability
All the results in this paper were generated using R version 3.6.1 (available at the CRAN website https://cran.r-project. org/) in RStudio version 1.0.143. R script for the models used in this paper can be obtained directly from the first author.

Conflicts of Interest
e authors have no conflicts of interest.