Security against various attacks is a core topic in cyberphysical systems (CPSs). In this paper, optimal control theory, reinforcement learning (RL), and neural networks (NNs) are integrated to provide a brief overview of optimal robust control strategies for a benchmark power system. First, benchmark power system models with actuator and sensor attacks are considered. Second, we investigate the optimal control problem for the nominal system and review state-of-the-art RL methods along with their NN implementation. Third, we propose several robust control strategies for different types of cyberphysical attacks via optimal control design, and stability proofs are derived through Lyapunov theory. Furthermore, the stability analysis with the NN approximation error, which is rarely discussed in previous works, is studied in this paper. Finally, two simulation examples demonstrate the effectiveness of the proposed methods.

With the development of cloud computing, artificial intelligence, and fifth-generation (5G) communication technology, power systems, which are regarded as primary infrastructure in society, have become typical CPSs [

Generally, the security of CPSs is threatened by attacks from the perception layer, cyberlayer, and decision layer. In particular, attacks at the perception layer and cyberlayer, known as cyberattacks, severely disrupt the system. In recent years, many scholars have presented reliable control strategies against various cyberattacks, such as false data injection attacks, time-delay switch attacks, and denial-of-service attacks. Denial-of-service attacks, which can jam information transmission channels, are an aggressive threat to CPS security [

With the integration of multiple energy sources, the control platform and information transmission are extremely complicated [

This paper concentrates on an optimal robust control strategy, in which a unified control method makes the power system resilient to actuator and sensor attacks. We use optimal control theory, reinforcement learning (RL), and neural networks (NNs) to design the controller under assumed attacks with multiple characteristics. The main contributions can be summarized as follows:

Optimal control theory, RL, and NNs are integrated to address the security issue of a benchmark power system.

A unified way is proposed to deal with the sensor and actuator attacks via the optimal control design.

The stability analysis with the NN approximation error, which is rarely discussed in the previous works, is studied in this paper.

The rest of this paper is arranged as follows.

First, the benchmark power system models with actuator and sensor attacks are formulated. Second, the optimal control issue for the nominal system is investigated, and the state-of-the-art RL methods along with the NN implementations are reviewed. Third, several robust control strategies are proposed for different types of cyberphysical attacks via the optimal control design, and stability proofs are derived through Lyapunov theory. Then, two different simulation examples demonstrate the effectiveness of our proposed methods. Finally, a brief conclusion is given.

Let us consider the following benchmark power system:

Let

However, the attacks on the system are generally inevitable, which may affect the control performance. System dynamics (

Due to the existence of unknown attacks, it is difficult or even impossible to investigate the systems (

Define the performance index function as

The optimal value function can be defined as

According to the stationarity condition [

Thus, the key point to obtain the optimal control policy is to solve the HJB equation.

ADP is a powerful tool to solve the optimal control problems. Traditional ADP methods include two iterative algorithms: policy iteration (PI) and value iteration (VI). Afterwards, two noniterative RL methods are developed.
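To make the policy iteration idea concrete, the following minimal sketch applies it to a linear-quadratic special case, where the HJB equation reduces to an algebraic Riccati equation. The matrices `A`, `B`, `Q`, and `R`, the helper `solve_lyapunov`, and the zero initial gain are illustrative assumptions, not the benchmark power system or the algorithm of this paper. Each pass evaluates the current policy by solving a Lyapunov equation and then improves the policy from the resulting value matrix.

```python
import numpy as np

def solve_lyapunov(Ac, M):
    """Solve Ac.T @ P + P @ Ac + M = 0 for P by Kronecker vectorization."""
    n = Ac.shape[0]
    I = np.eye(n)
    # Row-major vec identities: vec(Ac.T P) = kron(Ac.T, I) vec(P),
    # vec(P Ac) = kron(I, Ac.T) vec(P).
    L = np.kron(Ac.T, I) + np.kron(I, Ac.T)
    P = np.linalg.solve(L, -M.reshape(-1)).reshape(n, n)
    return 0.5 * (P + P.T)  # symmetrize against round-off

def policy_iteration(A, B, Q, R, K0, iters=20):
    """Policy iteration for u = -K x; K0 must be stabilizing."""
    K = K0
    for _ in range(iters):
        Ac = A - B @ K
        P = solve_lyapunov(Ac, Q + K.T @ R @ K)  # policy evaluation
        K = np.linalg.solve(R, B.T @ P)          # policy improvement
    return P, K

# Hypothetical stable second-order plant (assumption, not from the paper).
A = np.array([[0.0, 1.0], [-1.0, -2.0]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.eye(1)

# A is already stable, so the zero gain is an admissible initial policy.
P, K = policy_iteration(A, B, Q, R, K0=np.zeros((1, 2)))

# Riccati residual: should be close to zero at convergence.
residual = A.T @ P + P @ A - P @ B @ np.linalg.solve(R, B.T @ P) + Q
print(P, K, np.linalg.norm(residual))
```

Policy iteration needs an admissible (stabilizing) initial policy, which is why the sketch starts from a plant that is already stable. Value iteration relaxes this requirement, typically at the cost of slower convergence.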

The aforementioned iterative ADP methods belong to the offline learning field because the value function and control policies are updated with the iteration index. Quite different from offline algorithms, online RL methods [

In the online RL methods, the update and delivery of information must be continuous, which causes a waste of communication resources. For this phenomenon, the event trigger-based RL methods [

By using the aforementioned ADP methods, we can obtain the optimal control form of the nominal system, which will be employed in the following sections.
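The communication saving behind event-triggered schemes can be illustrated with a small simulation. This is only a sketch under stated assumptions: the plant matrices, the feedback gain `K`, the relative threshold `sigma`, and the forward-Euler integration are all hypothetical choices, not taken from this paper. The controller holds the last transmitted state `x_hat` and refreshes it only when the gap between the true state and `x_hat` exceeds a fraction of the current state norm.

```python
import numpy as np

# Hypothetical plant and stabilizing gain (assumptions, not from the paper).
A = np.array([[0.0, 1.0], [-1.0, -2.0]])
B = np.array([[0.0], [1.0]])
K = np.array([[0.4142, 0.4142]])

sigma = 0.1             # relative triggering threshold (assumed)
dt, steps = 0.01, 2000  # forward-Euler step size and horizon (assumed)

x = np.array([1.0, 0.0])
x_hat = x.copy()        # last state sent to the controller
updates = 1             # the initial transmission counts as one update

for _ in range(steps):
    # Trigger a transmission only when the held state has drifted too far.
    if np.linalg.norm(x - x_hat) > sigma * np.linalg.norm(x):
        x_hat = x.copy()
        updates += 1
    u = -K @ x_hat                # control computed from the held state
    x = x + dt * (A @ x + B @ u)  # one Euler step of the plant

final_norm = float(np.linalg.norm(x))
print(updates, final_norm)
```

In this sketch the state is transmitted far fewer times than there are integration steps, while the state still converges toward the origin. A time-triggered implementation would transmit at every step.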

To implement the proposed algorithms, a critic NN and an actor NN are employed to approximate the iterative value function and control policy:

Hence, the optimal value function and control policy have NN representation as

In previous works, the NN approximation error was rarely discussed. In this paper, we attempt to consider its effect in the stability analysis.
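As a rough illustration of how a critic can represent a value function, the sketch below fits the weights of a single-layer critic with fixed quadratic activations to the policy evaluation equation of a linear system. The plant `A`, the cost weight `Q`, the basis `phi(x) = [x1^2, x1*x2, x2^2]`, the zero control policy, and the batch least-squares fit are assumptions made for this example; they are not the network structure or the tuning law used in this paper.

```python
import numpy as np

# Hypothetical stable plant under the zero policy (assumption).
A = np.array([[0.0, 1.0], [-1.0, -2.0]])
Q = np.eye(2)

def grad_phi(x):
    """Jacobian of the basis phi(x) = [x1^2, x1*x2, x2^2] (shape 3 x 2)."""
    x1, x2 = x
    return np.array([[2.0 * x1, 0.0],
                     [x2, x1],
                     [0.0, 2.0 * x2]])

# Sample states and build the linear regression implied by
# W^T grad_phi(x) (A x) + x^T Q x = 0.
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(50, 2))
rows = np.array([grad_phi(x) @ (A @ x) for x in X])
targets = np.array([-(x @ Q @ x) for x in X])

# Least-squares critic weights.
W, *_ = np.linalg.lstsq(rows, targets, rcond=None)
print(W)
```

Because the true value function of this example is exactly quadratic, the basis can represent it without error. For a nonlinear system or a poorer basis, the fit leaves a residual, which is the kind of approximation error that the stability analysis has to account for.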

In Figure

Block diagram of proposed robust control.

First, let us consider the system (

If the positive definite matrices

Choose the Lyapunov function candidate as follows:

Substituting (

To guarantee

The proof is completed.

By using ADP methods, one can obtain the approximate optimal control policy. However, these ADP methods are finally implemented by NNs or other universal approximators, which will bring approximation errors. In the previous works, NN approximation errors were rarely discussed. In this paper, we attempt to present the corresponding error analysis.

When NNs finish learning, NN weights will achieve convergence. Based on (

By means of (

If the positive definite matrices

Utilizing the Lyapunov function candidate (

Through the result of (

It can be observed that if the NN weight approximation error

In this section, the proposed robust control schemes are modified and extended to deal with sensor attacks [

Consider the system (

The robust controller for (

If the matrix

Choose the Lyapunov function candidate as (

To guarantee

The proof is completed.

Note that the robust load frequency control problem is a special case of sensor attacks.

Let (

If the matrix

According to (

Let

From (

Consider the system (

Let

If the positive definite matrices

Construct a Lyapunov function candidate as follows:

After some mathematical derivation, equation (

To ensure

This completes the proof.

When NNs finish learning, the approximate optimal value function can be acquired:

Based on (

If the positive definite matrices

Employing the Lyapunov function candidate (

From (

In this section, to verify the proposed robust control strategy, two simulation examples of power systems are presented for two different types of attacks, respectively.

In this case, an actuator attack affecting the controller is considered in the power system. The values of system parameters for this simulation are given in Table

Values of system parameters.

Parameters | | | | | |
---|---|---|---|---|---|
Values | 1 | 1 | 1 | 0.2 | 2 |

Let the initial system state values be

System states under actuator attacks.

2D plot of

In this case, the parameters are selected as

In this case, numerical simulation results show that the designed controller can effectively resist sensor attacks. The values of system parameters for this simulation are given in Table

Values of system parameters.

Parameters | | | | | |
---|---|---|---|---|---|
Values | 0.5 | 0.5 | 0.5 | 1 | 1 |

Then, we can obtain system matrices

First, we present the simulation result without the attack compensator in Figure

System states without the attack compensator.

System states with the attack compensator.

Dynamics of the attack compensator.

This paper has integrated optimal control theory, RL, and NNs to address the robust control issues of a benchmark power system. The optimal control theory for nominal systems and state-of-the-art RL methods, along with their NN implementations, have been reviewed. Multiple types of attacks in power systems, such as actuator attacks, nonlinear sensor attacks, and constant sensor attacks, have been discussed. Several robust control schemes have then been designed for the respective types of attacks, and the control parameters have been derived through Lyapunov stability theory. Furthermore, the stability analysis with the NN approximation error, which is rarely discussed in previous works, has been presented. Simulation results have demonstrated the effectiveness of the proposed schemes.

Data are available upon request to the corresponding author.

The authors declare that there are no conflicts of interest regarding the publication of this paper.

This work was supported by the Science and Technology Foundation of SGCC (SGLNDK00DWJS1900036), the Liaoning Revitalization Talents Program (XLYC1907138), the Doctoral Scientific Research Foundation of Liaoning Province (2020-BS-181), the Natural Science Foundation of Liaoning Province (2019-MS-239), the Key R&D Program of Liaoning Province (2020JH2/10300101), the Technology Innovation Talent Fund of Shenyang (RC190360), and the Science and Technology Project of State Grid Liaoning Electric Power Company Limited (SGLNSY00HLJS2002775).