An Improved Approach for Robust MPC Tuning Based on Machine Learning

A robust tuning method based on artificial neural networks is proposed in this work for model predictive control (MPC) of industrial systems with parametric uncertainties. First, an efficient approach is established to characterize the mapping between the controller parameters and the robust performance indices. As there are normally multiple conflicting robust performance indices to be considered in MPC tuning, a neural network is further used to fuse the indices into a single label representing the acceptance level of the robust performance. Finally, an automated algorithm is proposed to tune the MPC parameters of the considered uncertain system to achieve the desired robust performance. The regulation of the pH value of a sewage treatment system is used to verify the effectiveness of the proposed robust tuning algorithm.


Introduction
Model predictive control (MPC) has been widely used in industry due to its robustness and its ability to handle safety constraints and inaccurate models [1,2]. As is well known, PID control is normally applied at the base layer of the system, while MPC controllers are usually employed at the supervisory layer [3]. In MPC applications, the prediction horizon, control horizon, and weighting matrices in the cost function significantly affect the closed-loop performance of the controlled system, and thus, the selection of these parameters is one of the most important tasks in MPC design [4]. As control systems become more and more complex, factors such as input and output coupling, external interference, and time delay make effective MPC tuning even more difficult.
The MPC tuning methods in existing industrial applications are mainly based on engineering experience or numerical methods, which makes the controller design largely a matter of trial and error and, at the same time, consumes a lot of computation time [5]. Besides, since the model is only an approximation of the real process, a certain level of uncertainty is inevitable, and robust MPC tuning becomes a necessity [6]. In [7], the authors proposed a robust MPC tuning method based on min-max optimization. Such an approach handles the model uncertainty explicitly and, meanwhile, precludes MPC controllers from choosing large prediction and control horizons so that the online calculation time is reduced. In [8], the authors reduced the number of effective tuning parameters by modifying the controller structure and redesigned the MPC cost function accordingly. In [9], two robust tuning strategies were put forward for SISO uncertain paper-making processes, which incorporated the total variation specification into user-friendly performance indices. In [10], the authors further proposed a rapid tuning strategy, based on the closed-loop system structure, for the MPC parameters of MIMO paper-making systems with first-order-plus-dead-time subsystems and uncertain model parameters. In [11], by adopting a sequential procedure, the authors developed a tuning method that took the reachable trajectories of each operating point of the controlled system as the reference to pursue an improved robust performance. Then, they applied the method to an electrical system to verify its feasibility. In [12], the authors proposed a tuning method using the worst-case control scenario, which is characterized by the Morari resiliency index and the condition number, and a nonlinear multiobjective performance criterion. The resulting constrained nonlinear optimization problem is solved with PSO.
In [13], based on the measured data in various operating conditions, a novel approach of the real-time compensation of the asymmetric behavior was investigated, which leads to an improved control performance.
With the rapid development of AI technology lately, researchers have made some attempts in the application of machine learning techniques in the process of MPC tuning. In [14], the authors put forward a framework using machine learning to approximate the tuning experience of human experts along with a gradient-free optimization algorithm to tune the MPC parameters. In [15], the authors proposed an online tuning method for the MPC parameters using particle swarm optimization (PSO) and online sequential extreme learning machine (OS-ELM). In [16], the authors used the artificial neural network for the tuning of FS-MPC for power electronic converters and verified the effectiveness of a practical FS-MPC regulated voltage source converter (VSC) for uninterruptible power supply (UPS) system. In [17], a dynamic system based on the projection neural network (PNN), which is well known for the parallel computational capability, is established to optimize the objective function of MPC; thereby, the computational efficiency is improved significantly which makes the proposed control design more practical.
Note that although there already exist MPC parameter tuning methods using neural network and PSO, none of them focus on the parameter tuning of an uncertain system based on robust time-domain performance indices. As model-plant mismatch is unavoidable in industrial applications, it is of great importance to develop a tuning method to handle this uncertainty. Besides, compared with the frequency-domain indices, their time-domain counterparts are more familiar to the site engineers, and thus, it is also necessary to incorporate the time-domain indices in the MPC tuning.
Given such a new problem, this paper uses machine learning to develop an MPC tuning method that deals with both parametric uncertainty and robust time-domain performance indices. The contributions can be summarized as follows:
(i) A novel approach to characterize the relationship between the MPC parameters and the robust time-domain performance indices, e.g., worst-case overshoot, is established based on an RBF neural network. According to the definition of parametric uncertainty and robust performance indices, the parameters of the neural network and the corresponding data acquisition method are also specifically designed.
(ii) As there normally exist conflicts between different time-domain robust indices, it is difficult to specify a suitable target for MPC tuning, and thus, a BP neural network is employed to fuse the indices into a scalar label representing the acceptance level of the robust performance, such that the MPC tuning problem can be efficiently solved via the PSO algorithm.
The paper is organized as follows. Section 2 describes the structure of the model predictive controller and states the tuning problem. Section 3 provides a tuning algorithm for MPC parameters via machine learning. Then, in Section 4, a real system is used to verify the effectiveness of the proposed algorithm. Finally, the concluding remarks are given in Section 5.

Nominal Model and Model Uncertainty.
In this paper, we consider multiple-input multiple-output systems which can be expressed by the following discrete-time transfer function model:

$$G(s) = \begin{bmatrix} G_{11}(s) & \cdots & G_{1n}(s) \\ \vdots & \ddots & \vdots \\ G_{m1}(s) & \cdots & G_{mn}(s) \end{bmatrix},$$

in which $G_{ij}(s)$ denotes the transfer function between the $i$th output and the $j$th input. Following industrial experience [9,10], the common FOPDT model structure is adopted for each subsystem, which can be described as follows:

$$G_{pij}(s) = \frac{k_{pij}}{\tau_{pij} s + 1} e^{-\theta_{pij} s},$$

where $k_p$, $\tau_p$, and $\theta_p$ denote the process gain, time constant, and time delay. Since $G_{pij}(s)$ is hard to know accurately, a nominal model $G_{0ij}(s)$ is defined to approximate it, which can be expressed as

$$G_{0ij}(s) = \frac{k_{0ij}}{\tau_{0ij} s + 1} e^{-\theta_{0ij} s}.$$

The model parameters $k_0$, $\tau_0$, and $\theta_0$ are identified from the input and output data of the real process and are used for prediction in the MPC controller. However, it is inevitable that $G_{0ij}(s)$ differs from $G_{pij}(s)$, and to account for such a model mismatch, the parametric uncertainty is used, which refers to the difference in the model parameters:

$$k_{pij} \in [\underline{k}_{pij}, \overline{k}_{pij}], \quad \theta_{pij} \in [\underline{\theta}_{pij}, \overline{\theta}_{pij}], \quad \tau_{pij} \in [\underline{\tau}_{pij}, \overline{\tau}_{pij}],$$

where $i = 1, 2, 3, \ldots, m$ and $j = 1, 2, 3, \ldots, n$. Note that compared with other types of uncertainty specifications, parametric uncertainty is easier for site engineers to specify based on their knowledge of the controlled system and thus is employed in this work. Based on the considered parametric uncertainty, a set of possible models can be denoted as follows:

$$\Pi := \left\{ G_p(s) : k_{pij} \in [\underline{k}_{pij}, \overline{k}_{pij}],\ \theta_{pij} \in [\underline{\theta}_{pij}, \overline{\theta}_{pij}],\ \tau_{pij} \in [\underline{\tau}_{pij}, \overline{\tau}_{pij}] \;\middle|\; i = 1, 2, \ldots, m,\ j = 1, 2, \ldots, n \right\}.$$
Furthermore, the state-space model can be obtained as

$$x(k+1) = A x(k) + B u(k), \qquad y(k) = C x(k),$$

where $u \in \mathbb{R}^n$ is the input variable, $y \in \mathbb{R}^m$ is the output variable, $x$ is the state vector, and $k$ is the sampling instant.
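To illustrate how one FOPDT channel might be brought into a discrete-time state-space form, the sketch below applies a zero-order-hold discretization and represents the dead time by a chain of delay states. The function names, the sampling period `ts`, and the rounding of the delay to a whole number of samples are assumptions made for this example and are not taken from the paper.

```python
import math

def fopdt_to_state_space(k, tau, theta, ts):
    """Discretize k*exp(-theta*s)/(tau*s + 1) with a zero-order hold.

    State layout (an assumption of this sketch): x[0] is the first-order
    lag state and x[1:] store the last d inputs, d = round(theta / ts).
    """
    d = int(round(theta / ts))          # dead time in whole samples
    a = math.exp(-ts / tau)             # discrete pole of the lag
    b = k * (1.0 - a)                   # discrete input gain of the lag
    n = d + 1
    A = [[0.0] * n for _ in range(n)]
    B = [0.0] * n
    C = [1.0] + [0.0] * d
    A[0][0] = a
    if d == 0:
        B[0] = b                        # no delay: input drives the lag directly
    else:
        A[0][1] = b                     # lag is driven by the oldest stored input
        for i in range(1, d):
            A[i][i + 1] = 1.0           # shift the delay chain by one sample
        B[d] = 1.0                      # newest input enters the end of the chain
    return A, B, C

def simulate_step(A, B, C, steps):
    """Return y(0), ..., y(steps-1) for a unit step applied at k = 0."""
    n = len(B)
    x = [0.0] * n
    y = []
    for _ in range(steps):
        y.append(sum(C[i] * x[i] for i in range(n)))
        x = [sum(A[i][j] * x[j] for j in range(n)) + B[i] for i in range(n)]
    return y
```

With this layout the output stays at zero for the dead time plus one sample and then approaches the gain `k`, which matches the continuous FOPDT step response sampled at multiples of `ts`.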

MPC Formulation.
In industrial applications, the MPC cost function with constraints can usually be designed in the following way: where $Q$ and $R$ are controller parameters to be tuned, $y_{ss}$ is the reference signal of $y$, and $P$ and $M$ are the prediction and control horizons. Therefore, the predicted output of the model can be expressed via the following matrix expression: in which $T$ indicates the matrix transpose, and $x(0)$ is the current state vector of the system. Then, the MPC can be represented as the following quadratic programming problem, from which the control signals are obtained:

$$Y_{ss} = \begin{bmatrix} y_{ss} & y_{ss} & y_{ss} & \cdots & y_{ss} \end{bmatrix}^T.$$

Mathematical Problems in Engineering

Tuning Problem.
In this work, we need to tune $Q$ and $R$ so that the system remains robustly stable against the parametric uncertainty and each output tracks its target with a fast and stable response. However, these goals conflict with each other. For example, a small overshoot often causes a large settling time, while a small settling time can be associated with a large overshoot. To strike a balance, we need to find the most appropriate set of parameters.
We choose overshoot and settling time as the main performance measures for the MPC tuning because they are simple and well suited for end users to evaluate the control effect. In the future application, other control performance indices can be directly added according to the specific needs following a similar procedure.

Controller Tuning Framework Based on Machine Learning.
In this paper, an approach for adjusting the aforementioned MPC design parameters for systems with parametric uncertainties is developed based on machine learning, the overall framework of which is shown in Figure 1.

Robust Performance Calculation Based on Machine Learning.
Many actual industrial control applications obtain the desired closed-loop performance through a tuning process. That is essentially an optimization problem, which, however, is normally solved by a human (see Figure 2). In this work, we propose a method to obtain the desired controller parameters in a similar way to the human expert. As the MPC cost function captures a cost-benefit relationship between multiple, competing objectives, the performance of the system being considered often cannot be analyzed explicitly. Therefore, a common approach is to approximate it by exploiting the innate human ability to recognize different patterns. More specifically, the human expert acceptance level (denoted as $\psi_h$) of a given closed-loop performance (denoted as $\chi$) is considered as the tuning objective. Then, the tuning problem can be expressed as

$$\min_{\omega \in \Omega} \psi_h(\chi(\omega)),$$

where $\omega = [Q \;\; R]^T$ represents the MPC tuning parameter and $\Omega$ indicates a set of admissible controller parameters. $\chi(\omega)$ is the robust performance of the uncertain system given a selected $\omega$. The explanation of $\psi_h$ is given in Figure 3. For the two curves with the same reference output, the blue curve, rather than the red curve, has the performance we hope to obtain in the process of controller parameter tuning. So its $\psi_h$ value is less than that of the red curve.
Note that as there is no explicit relationship between $\omega$ and $\psi_h$, machine learning is adopted in this work to characterize the robust performance and the corresponding acceptance level. Then, the tuning problem is approximated by

$$\min_{\omega \in \Omega} \psi_{ml}(\Gamma(\chi(\omega))),$$

where $\Gamma$ is a feature extractor function which transforms outputs of the system (i.e., time-domain signals) into a vector consisting of relevant performance indices, and $\psi_{ml}$ is an approximation to $\psi_h$. Then, the tuning framework can be expressed as the diagram shown in Figure 4. Now, there are three major steps in the tuning of MPC parameters: (i) extract the robust performance of the system with parameter $\omega$, (ii) obtain the human expert's acceptance level of the obtained performance, and (iii) optimize the MPC tuning parameter for the desired performance.

Robust Time-Domain Performance Calculation.
Given the perturbed systems in $\Pi$, the time-domain robust performance indices are employed to characterize the robust performance as they are more intuitive to end users in the industry. More specifically, the worst-case overshoot and settling time are considered, which are defined as follows.
Definition 1 (worst-case overshoot). The worst-case overshoot of a set of responses with the same final value is the maximum value of all the responses minus the final value, divided by the final value.
Definition 2 (worst-case settling time). The worst-case settling time of a set of responses with the same final value is the maximum time required for all the responses to arrive and stay within a predetermined percentage range of the final value.
The calculation of the worst-case performance is shown in Algorithm 1. The abovementioned indices are illustrated in Figure 5.
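Definitions 1 and 2 can be made concrete with a few lines of code. The following is a minimal sketch, assuming each response is a list of samples taken every `ts` time units, that the common final value is positive, and that the settling band is ±2% of the final value; the function names and the band width are illustrative choices, not values from the paper.

```python
def overshoot(response, final_value):
    """Peak excursion above the final value, relative to the final value."""
    return max(0.0, (max(response) - final_value) / final_value)

def settling_time(response, final_value, ts=1.0, band=0.02):
    """Time after which the response stays within +/- band of final_value."""
    tol = band * abs(final_value)
    last_outside = -1
    for k, y in enumerate(response):
        if abs(y - final_value) > tol:
            last_outside = k            # remember the latest sample outside the band
    return (last_outside + 1) * ts

def worst_case_indices(responses, final_value, ts=1.0, band=0.02):
    """Apply Definitions 1 and 2 to a set of responses with a common final value."""
    wc_overshoot = max(overshoot(r, final_value) for r in responses)
    wc_settling = max(settling_time(r, final_value, ts, band) for r in responses)
    return wc_overshoot, wc_settling
```

Taking the maximum of the per-response overshoots is equivalent to Definition 1 because all responses share the same final value.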
Note that there may exist a certain relationship between the system model parameters and the robust time-domain performance indices, but such a relationship is implicit and cannot be expressed by a definite formula. Thus, we use an artificial neural network to establish the mapping between the MPC controller parameters and the robust time-domain indices (i.e., $\Gamma$ in equation (11)) for the system with parametric uncertainty.
Due to the large dimensionality of the inputs and outputs, a local approximation network, the radial basis function (RBF) network, is selected to ensure fast learning convergence, and according to the sample size, either a standard RBF network or a generalized RBF network can be selected. Here, the standard RBF network is taken as an example, assuming that the problem is not ill-conditioned.
In this paper, each sample of the RBF neural network contains a set of controller parameters q and r and the corresponding robust time-domain performance indices of uncertain system. In practice, if the overshoot or settling time is too big, such control is considered meaningless. In order to ensure the representativeness of the samples, the grid method is used to sample the performance parameters within the acceptable range, and then, the corresponding controller parameters are derived as the training database of the RBF network. If each index takes Δ numbers within its allowable range, there are 2mΔ training set samples in total. Finally, Δ * groups of controller parameters and their corresponding robust performance indices are randomly generated as test sets. If the training set samples cannot reach the required network accuracy, the segmentation of the above interval can be further refined.
Note that every group of worst-case performance indices needs $2^d$ curves ($d$ is the number of uncertain parameters in the system model) to generate a reasonable result. Since the $2m\Delta$ groups of data obtained for training the RBF network are also used directly for the BPNN in this work, at least $2m\Delta \times 2^d$ curves are required in total. The reason to consider $2^d$ curves is that, in order to characterize the worst-case performance, the polyhedron system representation [3] in robust control theory is employed, which indicates that the worst performance of the uncertain system mostly appears at the vertex systems of the polyhedron. Therefore, the largest and smallest possible values of each model parameter of the uncertain system need to be considered, resulting in $2^d$ curves for each group of robust indices. Note that, compared with the existing methods for evaluating the worst-case time-domain performance (e.g., brute-force search), the aforementioned method is much simpler, since the required number of curves is significantly reduced, which helps to build the network more efficiently. The input layer of the RBF network has two inputs, which represent the two controller parameters $q$ and $r$, respectively. The number of neurons in the hidden layer is $2m\Delta$, which is the same as the number of samples in the training set; the number of neurons in the output layer is $2m$, which represents the worst-case dynamic time-domain performances of the uncertain system. The Gaussian radial basis function is used in the network.
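The vertex argument above amounts to evaluating the closed loop only at the corners of the parameter box. A minimal sketch of that enumeration is given below; the parameter names, the interval values, and the helper `worst_case_over_vertices` are hypothetical and serve only to show that $d$ uncertain parameters produce $2^d$ vertex models.

```python
from itertools import product

def vertex_models(bounds):
    """Yield every vertex of the parameter box.

    `bounds` maps a parameter name to its (lower, upper) interval; with d
    uncertain parameters this yields 2**d parameter dictionaries.
    """
    names = list(bounds)
    for corner in product(*(bounds[name] for name in names)):
        yield dict(zip(names, corner))

def worst_case_over_vertices(bounds, index_fn):
    """Evaluate a scalar performance index at each vertex and keep the largest."""
    return max(index_fn(params) for params in vertex_models(bounds))

# Hypothetical intervals for a single FOPDT channel (d = 3).
bounds = {"k": (0.8, 1.2), "tau": (80.0, 120.0), "theta": (20.0, 40.0)}
vertices = list(vertex_models(bounds))
```

In a full implementation `index_fn` would simulate the MPC closed loop for the given vertex model and return, for example, its overshoot; here it is left as a caller-supplied function.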

Figure 1: The overall framework of the proposed tuning method.

(4) Obtain the weight matrix of the output layer by the recursive least squares method. More specifically, the output function is given by in which $\rho^*$ is the output variable, $\rho_\delta$ ($\delta = 1, 2, 3, \ldots, 2m\Delta$) are the output variables, $\varepsilon_\delta$ is the weight from the hidden layer node to the output layer, and $f_{RBF}$ is the basis function of the RBF network. The basis function of the RBF network in this paper is the Gaussian function. The specific expression is as follows:

$$f_{RBF}(\omega) = \exp\left(-\frac{\|\omega - c_i\|^2}{2\sigma^2}\right),$$

in which $\sigma^2$ is the variance of the Gaussian function and $c_i$ is the center of the Gaussian function.
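A standard RBF network with one Gaussian hidden neuron per training sample can be sketched as follows. This example handles a single output for brevity and obtains the output weights by solving the interpolation system directly rather than by the recursive least squares method mentioned in the text; the function names, the direct solve, and the shared width `sigma` are assumptions of the sketch.

```python
import math

def gaussian(x, c, sigma):
    """Gaussian basis exp(-||x - c||^2 / (2 sigma^2))."""
    d2 = sum((xi - ci) ** 2 for xi, ci in zip(x, c))
    return math.exp(-d2 / (2.0 * sigma ** 2))

def solve(M, b):
    """Solve M w = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(M)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    w = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(aug[r][c] * w[c] for c in range(r + 1, n))
        w[r] = (aug[r][n] - s) / aug[r][r]
    return w

def fit_rbf(inputs, targets, sigma):
    """Centers are the training inputs; weights interpolate the targets."""
    Phi = [[gaussian(x, c, sigma) for c in inputs] for x in inputs]
    return solve(Phi, targets)

def predict_rbf(x, centers, weights, sigma):
    """Weighted sum of the Gaussian hidden-layer activations."""
    return sum(w * gaussian(x, c, sigma) for w, c in zip(weights, centers))
```

For the network described in the text, `inputs` would hold the sampled $(q, r)$ pairs and one such weight vector would be fitted for each of the $2m$ worst-case indices.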

Performance Label Calculation.
As there may exist a conflict between different robust performance indices, we employ a performance label $\psi_h$ to characterize the acceptance of a given pair of robust indices based on the experience of human experts. Note that although the $\psi_h$ of different experts is likely to be personalized to some extent, the codomain of each human cost function that produces $\psi_h$ is a set of elements representing a quantitative assessment of the relevant data. In this work, these data consist of the worst-case time-domain robust performance indices of the system, while the codomain elements are the nonnegative real numbers, referred to as performance labels.

Algorithm 1: Calculation of the worst-case performance.
(1) Input: the uncertainty intervals $k_{pij} \in [\underline{k}_{pij}, \overline{k}_{pij}]$, $\theta_{pij} \in [\underline{\theta}_{pij}, \overline{\theta}_{pij}]$, $\tau_{pij} \in [\underline{\tau}_{pij}, \overline{\tau}_{pij}]$, nominal system $G_{0ij}(s)$ ($i = 1, 2, 3, \ldots, m$, $j = 1, 2, 3, \ldots, n$), controller parameter sets $Q_{data}(\delta)$ and $R_{data}(\delta)$ ($\delta = 1, 2, 3, \ldots, 2m\Delta$), output reference $Y_{ref}$;
(2) Output: $\rho$;
(3) Divide the uncertainty intervals of $k_{pij}$, $\theta_{pij}$, $\tau_{pij}$ into $\eta$ equal parts;
(4) for $\delta = 1 : 1 : 2m\Delta$ do
(5)   for $l = 1 : 1 : 2^d$ do
(6)     Transform the transfer function $(k_{pij}(l), \theta_{pij}(l), \tau_{pij}(l))$ into state space $(A(l), B(l), C(l))$;
(12)   end for
(14) end for

For the purpose of demonstration, the performance label belongs to the interval [0, 1], capturing the "acceptable" level of the closed-loop performance. More specifically, label 0 denotes the best possible performance, a label greater than 0 denotes worse performance, and 1 means the least "acceptable" performance. Naturally, any label greater than 1 captures the "unacceptable" closed-loop performance. The label assignments corresponding to these worst-case robust performance indices are obtained by collecting expert experience.
Given the dataset above, approximating the human cost function is a typical supervised learning problem. Note that the label values may differ from expert to expert because each of them may have different preferences and concerns when characterizing the performance. Therefore, a regression method is used to approximate the human cost function. In this work, a BP neural network is used to establish the mapping between the system performance indices and the label. The training data of the BP network are obtained by a survey. First of all, we generate a number of output curves to illustrate the considered robust time-domain indices (e.g., Figure 5), and these outputs are given performance labels $\psi$ by experts. Then, the required BP neural network can be established. The number of nodes in the input layer is $2m$, which is determined by the output dimension of the system. The output layer has just one node, indicating the value of the label. We choose the number of nodes in the hidden layer by the empirical formula $\sqrt{2m + 1} + z$, where $z$ usually takes a value in $\{1, 2, 3, \ldots, 10\}$.
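The BP regressor described above can be sketched in a few lines. This is a minimal illustration, assuming one hidden layer, sigmoid activations in both layers (the labels lie in [0, 1]), per-sample steepest-descent updates on the squared error, and rounding the empirical hidden-layer size up to an integer; the function names, the initial weight range, and the learning rate are choices made for the example only.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def hidden_size(m, z):
    """Empirical rule sqrt(2m + 1) + z, rounded up to a whole node count."""
    return math.ceil(math.sqrt(2 * m + 1) + z)

def init_net(n_in, n_hid, rng):
    """Random initial weights, as in the first BP iteration."""
    return {
        "W1": [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hid)],
        "b1": [0.0] * n_hid,
        "W2": [rng.uniform(-0.5, 0.5) for _ in range(n_hid)],
        "b2": 0.0,
    }

def forward(net, x):
    """Return hidden activations and the scalar label estimate."""
    h = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
         for row, b in zip(net["W1"], net["b1"])]
    o = sigmoid(sum(w * hi for w, hi in zip(net["W2"], h)) + net["b2"])
    return h, o

def train_step(net, x, g, eta):
    """One steepest-descent update on the squared error 0.5 * (g - o)^2."""
    h, o = forward(net, x)
    delta_o = (o - g) * o * (1.0 - o)
    for j, hj in enumerate(h):
        # Back-propagate through the old output weight before updating it.
        delta_h = delta_o * net["W2"][j] * hj * (1.0 - hj)
        net["W2"][j] -= eta * delta_o * hj
        net["b1"][j] -= eta * delta_h
        for i, xi in enumerate(x):
            net["W1"][j][i] -= eta * delta_h * xi
    net["b2"] -= eta * delta_o

def sse(net, data):
    """Sum of squared errors between targets and network outputs."""
    return sum((g - forward(net, x)[1]) ** 2 for x, g in data)
```

In use, each training pair would consist of a (normalized) vector of the $2m$ worst-case indices and the label assigned by an expert, and `train_step` would be repeated until `sse` falls below the chosen threshold.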
There are $2m\Delta$ training samples, whose input and output vectors are indexed by $\delta$ ($\delta = 1, 2, 3, \ldots, 2m\Delta$). The input vector is $x_\delta = (x_{\delta 1}, \ldots, x_{\delta\varrho})^T$, $\varrho = 1, \ldots, I_{BP}$, the output vector of the network is $o_\delta = (o_{\delta 1}, \ldots, o_{\delta\varrho})^T$, $\varrho = 1, \ldots, J_{BP}$, and the target output vector is $g_\delta$. Note that $w_{\varrho\epsilon}$ is the weight of the $\varrho$th component of the input vector mapped to the $\epsilon$th component of the output vector, which is randomly allocated in the first calculation. The BP network modifies the weight $w_{\varrho\epsilon}$ by the steepest gradient descent method through the feedback of the output result, so that the sum of squared errors between the output value and the target value is minimized, that is,

$$E = \frac{1}{2} \sum_{\delta} \| g_\delta - o_\delta \|^2, \qquad \Delta w_{\varrho\epsilon} = -\eta_{BP} \frac{\partial E}{\partial w_{\varrho\epsilon}},$$

in which $\eta_{BP}$ is the learning rate of the network. The update is repeated until the error is less than the set threshold. The activation function of the BP network is the sigmoid function, i.e.,

$$f(x) = \frac{1}{1 + e^{-x}}.$$

Tuning Process.
As mentioned above, using the two trained neural networks, the performance label $\psi^*$ can be quickly obtained in the process of MPC parameter optimization. The specific algorithm for robust performance labeling is shown in Algorithm 2.
Based on Algorithm 2, an effective robust MPC tuning method is proposed in this section, which can be summarized in Algorithm 3.
Interpretations: the position $\mu$ of each particle contains all the information of the $Q$ and $R$ matrices. The greater the particle swarm size $N$ is, the larger the search range will be and the more easily the global optimal solution will be obtained, but the longer the running time will be. $\lambda^*$ is the maximum number of iterations, and $\xi$ is the precision of the optimization target. $c_1$, $c_2$, and $w$ are the local (cognitive) learning factor, the global (social) learning factor, and the inertia weight of the particles, respectively. The faster the particles fly, the faster the optimization is, but the easier it is to miss the optimal location. The final $pg^*$ is the optimal set of MPC control parameters given the considered model uncertainty; $\psi^*(pg^*)$ is the corresponding performance label.
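The roles of these PSO parameters can be seen in a compact sketch of the search loop. The code below is an illustrative global-best PSO, not the paper's implementation: `label_fn` stands in for the performance-label evaluation (Algorithm 2), the default swarm size is arbitrary, and clipping positions to the bounds is an added assumption. The defaults `c1 = 2`, `c2 = 2`, `w = 0.2`, `max_iter = 50`, and `tol = 0.1` mirror the initial values listed in Algorithm 3.

```python
import random

def pso_minimize(label_fn, bounds, n_particles=20, max_iter=50, tol=0.1,
                 c1=2.0, c2=2.0, w=0.2, seed=0):
    """Minimize label_fn over a box; returns (best_position, best_label)."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]                 # personal bests (pi)
    best_val = [label_fn(p) for p in pos]
    g = min(range(n_particles), key=lambda i: best_val[i])
    g_pos, g_val = best_pos[g][:], best_val[g]     # global best (pg*)
    it = 0
    # Stop when the iteration budget is used up or the label is small enough.
    while it < max_iter and g_val > tol:
        it += 1
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (best_pos[i][d] - pos[i][d])
                             + c2 * r2 * (g_pos[d] - pos[i][d]))
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = label_fn(pos[i])
            if val < best_val[i]:
                best_pos[i], best_val[i] = pos[i][:], val
                if val < g_val:
                    g_pos, g_val = pos[i][:], val
    return g_pos, g_val
```

A call such as `pso_minimize(lambda p: my_label(p[0], p[1]), [(0.1, 10.0), (0.1, 10.0)])`, with `my_label` a user-supplied function of $(q, r)$, would return the best pair found together with its label.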

Industrial Example
This section applies the developed tuning algorithm to the pH adjustment process of a sewage treatment system, shown in Figure 6, to illustrate the efficiency of the tuning algorithm. The pH neutralization process of sewage is composed of the inlet wastewater flow, the buffer fluid flow, and the acid neutralizer flow in the neutralization tank to obtain the height of wastewater flow at the outlet and the liquid level of the storage tank. Among them, the flow rate of the acid neutralizer and the flow rate of the sewage at the inlet are taken as control variables, and the pH value of the sewage at the outlet and the height of the liquid level in the storage tank are taken as the outputs. A model predictive controller is used, and the proposed method is employed to tune the relevant parameters of the controller to achieve the purpose of adjusting the pH value of the sewage. The process can be characterized by a two-input two-output system: Considering potential model-plant mismatch, the real model parameters are considered to be within the following ranges:

Algorithm 2: Robust performance labeling.
(2) Output: worst-case performance label $\psi^*$ of the uncertain system;
(3) Obtain the worst-case system robust performance matrix $\rho^*$ under MPC control via the RBFNN;
(4) Obtain the performance label $\psi^*$ of the matrix $\rho^*$ via the BPNN.

Algorithm 3: Robust MPC tuning.
(1) Input: the uncertainty intervals $k_{pij} \in [\underline{k}_{pij}, \overline{k}_{pij}]$, $\theta_{pij} \in [\underline{\theta}_{pij}, \overline{\theta}_{pij}]$, $\tau_{pij} \in [\underline{\tau}_{pij}, \overline{\tau}_{pij}]$, and nominal system $G_{0ij}(s)$ ($i = 1, 2, 3, \ldots, m$, $j = 1, 2, 3, \ldots, n$);
(2) Output: the optimal tuning results $pg^*$;
(3) Initialize the PSO optimizer parameters: $N \leftarrow 2m\Delta$, $c_1 \leftarrow 2$, $c_2 \leftarrow 2$, $w \leftarrow 0.2$, $narvs \leftarrow m^2 + n^2$, $t \leftarrow 0$, $\lambda^* \leftarrow 50$, $\xi^* \leftarrow 0.1$;
(4) Initialize the position $\mu$ and flying speed $v$ of the particles;
(5) $t \leftarrow 0$, $\psi^*(pg^*) \leftarrow \infty$;
(6) while $t < \lambda^*$ and $\psi^*(pg^*) > \xi^*$ do
(7)   $t \leftarrow t + 1$;
(8)   for $e = 1 : 1 : N$ do
(9)     $\psi^*(e) \leftarrow$ Algorithm 2 ($\mu(e)$);
(10)     if $\psi^*(e) \le \psi^*(pi(e))$ then
(11)       $pi(e) \leftarrow \mu(e)$;
(12)     end if
(13)     $pg^* = \min(pt)$;
(14)     if

Note that the model is identified via an advanced industrial control software package for the use of MPC. The prediction and control horizons are set to $P = M = 5$, and the initial operating conditions are as follows: and the references in (8) For illustration purposes, the weighting matrix is simplified as follows: In our experiment, according to the requirements of the actual system for these indices, the range of each index is divided into $\Delta = 10$ performance intervals, and one sample is selected in each interval, giving a total of $10 \times 4 = 40$ samples as the training set. Then, $\Delta^* = 15$ groups are randomly selected from the remaining samples as the test set. The parameters of supervised machine learning are set as follows: The root mean square error (RMSE), which reflects the prediction stability of the networks, is introduced to evaluate the prediction performance of the network over $N_t$ samples:

$$\mathrm{RMSE} = \sqrt{\frac{1}{N_t} \sum_{i=1}^{N_t} \left(H_{r,i} - H_{p,i}\right)^2},$$
in which $H_r$ is the actual output and $H_p$ is the reference output of the network. In our experiment, the training accuracy of the RBF network and BP network is 0.001, and the error curves from the testing are shown in Figures 7-8. The RMSE of the RBF network and BP network is 0.8073 and 0.0951, respectively. Note that although the obtained error value of the RBF network is a bit high, such accuracy is acceptable for the robust tuning problem at hand because, given the high level of model uncertainty considered in this work, an average error of 0.1448 for worst-case overshoot and worst-case settling time would not affect the overall tuning performance from a robust control point of view. Furthermore, as the number of data samples of the industrial control system is normally limited, it is reasonable that the construction of the network is stopped once the accuracy meets the design requirement. As for the BP network, since the relationship between the robust indices and the performance label is relatively simple, the accuracy becomes higher, as shown in Figure 7.
Then, the effectiveness of the proposed robust time-domain performance calculation method is tested, and the results are shown in Figure 9, in which the red curves indicate the worst-case performance obtained by the RBF network from Algorithm 1, and the blue curves indicate the outputs generated with system parameters randomly selected in the region shown in equation (13). More specifically, the robust time-domain performance from the brute-force search is $[0.11 \;\; 0.0602 \;\; 33 \;\; 28]^T$, while that from the proposed network is $[0.1477 \;\; 0.0756 \;\; 33.9222 \;\; 29.1417]^T$.
Now, we test the proposed tuning method. Figure 10 shows the change of the performance label while Algorithm 3 is searching for the optimal MPC parameters. As the PSO iterations proceed, the label value decreases until it converges. The solution is 0.3426, the corresponding tuning results are $q = 8.57$ and $r = 4.54$, and the optimization time is 23 s. Figure 11 shows the optimal control parameters obtained by the brute-force search. The global optimal solution is $q = 8.57$ and $r = 4.55$, the corresponding label value is 0.3450, and the optimization time is 1.18 h. Compared with the brute-force search method, the tuning method based on machine learning requires only 0.5% of the running time while still obtaining a similar tuning result, which verifies the effectiveness of the method described in this paper.
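As a quick check of the reported running-time ratio, converting 1.18 h to seconds and dividing gives:

```latex
\frac{23\ \mathrm{s}}{1.18\ \mathrm{h} \times 3600\ \mathrm{s/h}}
  = \frac{23}{4248} \approx 0.0054 \approx 0.5\%
```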
In order to further verify the effectiveness of the method proposed in this paper, we selected three groups of controller parameters according to tuning guides utilized in industry and compared them with those obtained by Algorithm 3, and the tuning parameters of each case are shown in Table 1.
Assume that the actual parameters of the system are $k_{11} = -0.55$, $k_{12} = 0.55$, $k_2 = 1.05$, $\tau_1 = 90$, $\tau_2 = 194$, and $\theta_1 = 30$, which are different from the nominal system in equation (14). The performance indices and the corresponding $\psi^*$ of each group of controller parameters are shown in Table 2. It can be seen that the controller parameters corresponding to group (d), that is, the parameters obtained through the algorithm described in this paper, have the best control effect for the uncertain system. In order to intuitively display the control effect of each group of controller parameters in Table 1, the output response curves are shown in Figure 12.

As far as we know, this is the first work to use machine learning to solve the MPC tuning problem considering parametric uncertainty and robust time-domain indices. Therefore, we can only compare with the most similar literature, [14], which is a machine learning-based MPC tuning method developed for systems with no model uncertainty. According to [14], the MPC parameters are $q^* = 1.18$ and $r^* = 8.43$ when the controller is adjusted based on the nominal system without considering the model uncertainty. Assume that the actual model parameters of the system are $k_{11} = -0.60$, $k_{12} = 0.44$, $k_2 = 1.05$, and the nominal system is expressed by equation (19). The two controllers are applied to the system, respectively, and the system outputs are shown in Figure 13. The black curve represents the system output under the control of $q^*$ and $r^*$, and its dynamic time-domain performance indices, i.e., the overshoot and settling time of the two outputs of the system, are $[1.17\% \;\; 3.02\% \;\; 40\ \mathrm{s} \;\; 35\ \mathrm{s}]^T$. The red curve represents the system output under the control of the controller parameters $q = 8.57$ and $r = 4.54$, which are tuned based on the worst-case time-domain performance indices of the uncertain system discussed in this paper, and its dynamic time-domain performance indices are $[1.08\% \;\; 2.86\% \;\; 22\ \mathrm{s} \;\; 18\ \mathrm{s}]^T$. It can be seen that the method proposed in this paper greatly enhances the robustness of parameter tuning of model predictive control for uncertain systems.

Conclusion
In this paper, an artificial neural network-based model predictive control (MPC) parameter tuning method for uncertain systems is proposed, which can effectively solve the problem of low control accuracy caused by model mismatch and external disturbance, so as to improve the robustness of the controlled system. To achieve such an objective, a novel method to compute the worst-case output performance in the time domain is developed through machine learning, and the potentially conflicting time-domain robust performance indices are transformed into a scalar performance label via a neural network built on expert experience. At the same time, the parallel search ability of PSO is adopted to search for the optimal tuning parameters efficiently. Finally, the method is verified on the pH regulation process of a sewage treatment system.
At the present stage, the potential shortcoming of the method is that the two neural networks considered can only be obtained offline, and for some industrial systems, it is more desirable that the offline training stage can be avoided and the learning can be realized in an online manner. Thus, the main future research direction is to employ online learning to develop an efficient robust tuning algorithm considering model uncertainty as well as robust time-domain performance.

Data Availability
No data were used to support this study.

Figure 13: The comparison between the method proposed in this paper and reference [14]. Legend: output with control of q* and r*; output with control of q and r; output reference.