Neural Network Supervision Control Strategy for Inverted Pendulum Tracking Control

This paper presents several control methods and achieves stable tracking for the inverted pendulum system. Building on the advantages of the RBF neural network and the traditional PID controller, a novel PID controller based on the RBF neural network supervision control method (PID-RBF) is proposed. This method adaptively adjusts the control so that the system stably tracks the reference signal. Furthermore, an improved PID controller based on the RBF neural network supervision control strategy (IPID-RBF) is presented. This control strategy adopts a supervision control method that combines feed-forward and feedback. The response speed of the system is further improved, and the overshoot of the tracking signal is further reduced. Tracking control simulations of the inverted pendulum system under three different signals are given to illustrate the effectiveness of the proposed method.


Introduction
The inverted pendulum system has been widely investigated over the past few decades because of two important characteristics, high order and strong coupling, which pose important problems in the control field. It is also an unstable, nonlinear, and multivariable system. Inverted pendulum control methods have a wide range of applications in military, aerospace, robotics, and general industrial processes, such as balancing during robot walking, verticality control during rocket launch, and attitude control during satellite flight. The RBF neural network learning control algorithm has been a hot topic in current academic research.
This algorithm can handle nonlinear problems, tracking problems, and external interference problems. Therefore, using this method to study the tracking problem of the inverted pendulum is of great significance. RBF (radial basis function) neural network controllers for nonlinear systems have been designed based on proportional-integral-derivative (PID) control, and these methods give good control results. Some related research results are as follows. In [1], an RBF network is used to estimate complex and precise dynamics, mainly to address uncertainty and external interference in a complex space environment; the method is used to deal with model uncertainty and input error. In [2], a neural network adaptive control algorithm with PID is proposed. Its self-learning ability and its ability to adapt to uncertain system dynamics are used to significantly reduce the impact of resistance disturbance on speed.
The system has strong robustness under parameter variations and external disturbances. In [3], a scheme combining proportional-derivative (PD) control with an RBF neural network adaptive control algorithm is used. PD control tracks the trajectory of the end effector of the wire-driven parallel robot (WDPR), and the RBF neural network control algorithm is applied to approximate parameters. The combination of these two methods reduces the approximation error, enhances the robustness, and improves the accuracy of the WDPR. In [4], a fuzzy logic-based offline control strategy for a single-wheeled inverted pendulum robot (SWIPR) is presented to study the error, settling time, and rise time, and a good control effect is achieved. In [5], an RBF network is applied to reduce chattering and increase stability. In [6], a single-layer nonlinear controller is proposed that brings the inverted pendulum to a stable state from any initial position and achieves four degrees of freedom. Hossein [7] put forward an improved PID control method based on the RBF neural network and showed that the algorithm has a better control effect than traditional PID for tension control. In [8,9], adaptive control methods are applied to the input or output of nonlinear systems. They are applicable, respectively, to a class of nonsmooth nonlinear systems and a class of multi-input multi-output nonlinear time-delay systems with input saturation. In [10,11], two different adaptive methods are proposed to solve the unknown and stochastic nonlinear tracking problems of nonlinear systems. Other scholars have studied the stabilization and convergence of the system [12][13][14]. Kumar et al. [15] use adaptive control technology to deal with tracking problems. At present, many researchers [16][17][18][19][20] have conducted in-depth studies on various inverted pendulum models using different control methods.
In fact, whatever the control system is, it will tend to be stable under the action of the controller. Therefore, in [21], an RBF neural network is used as a model for approximating uncertainty in order to study error convergence, and good convergence results are achieved. In [22], the RBF algorithm is used to estimate the residual error; the controller is designed to reduce the error and increase the control signal so that the heating requirements are met. In [23], the dynamic characteristics of a machine system are explored, and the characteristics of the RBF neural network are used to study the tracking problem. In [24], different basis functions are selected within the general structure of the RBF neural network for a comparative study aimed at eliminating chattering. In [25], the RBF neural network implements self-feedback control, accurate prediction, and real-time control of reasonable data. In [26], tracking accuracy is improved, and unmodeled dynamics and external interference are estimated. As more researchers understand the approximation characteristics of RBF, they apply RBF neural networks in various fields to study the dynamic characteristics of different systems. These works mainly address nonlinearity, uncertainty, and external interference, and they use a Lyapunov function to ensure the effectiveness of the algorithm, so that system errors are reduced and a stable state is reached [27][28][29][30][31][32][33][34].
In recent years, PID has been applied in many different fields. From simple PID control algorithms to complex ones, it has played a role in a variety of control problems. In [35,36], the PID control algorithm and its operation rules are studied, respectively. The characteristics of the PID algorithm are explored, and simulations show that PID has a certain degree of adaptability. Based on these characteristics, some scholars have studied the tuning of the PID controller [37,38]. PID parameters are tuned to optimize system performance according to actual conditions. As PID parameter tuning technology becomes more mature, some scholars use PID as a controller to study system stability and tracking issues [39][40][41][42]. To adapt PID to more situations, some scholars have studied fuzzy PID control [43,44] and fractional PID control [45,46]. PID control is thus a classic and widely used control method.
According to the references, the RBF neural network control algorithm generally uses a Lyapunov function to determine the stability conditions. In [47], an RBF neural network and PD parallel control algorithm is proposed based on the nonlinear U model. The Lyapunov function determines the conditions for system stability, and under these conditions the tracking effect is improved. However, the error between the system output and the tracking signal remains large, and tracking under external interference is not considered. Therefore, in this article, we address these problems for the inverted pendulum model and study its tracking problem.
In short, the control of the inverted pendulum model involves three major performance criteria, namely, stability, accuracy, and rapidity. For the tracking problem of the inverted pendulum, all three also need to be considered. Therefore, we design a supervision control method, PID-RBF, and then a further supervision control strategy, IPID-RBF. Stable tracking of the signal is achieved by the supervision control strategy. The main innovations of this paper are the following:
(1) The PID-RBF strategy ensures the stability of the system. The overshoot of the system is reduced and the robustness of the system is enhanced. In the presence of interference, the parameters are adjusted adaptively to control signal tracking.
(2) The IPID-RBF strategy further solves the problem of large overshoot in the control process. The adjustment time of the system is further reduced. This strategy has strong anti-interference ability, settles quickly, and has a small error with respect to the tracking curve.
(3) In the control process, the PID-RBF strategy can replace the traditional PID control strategy. This makes the system overshoot smaller and the system stability better. The IPID-RBF strategy further improves the overall performance of the system: it gives a faster response speed, better stability, and better robustness.
The rest of the paper is organized as follows. The control objective is presented in Section 2. The neural network supervision control design is presented in Section 3. The simulation study is discussed in Section 4. Finally, the conclusions are given in Section 5.

Control Objective
The inverted pendulum system becomes difficult to control as its order increases. The system itself is also complex, unstable, and nonlinear. It is often used as an experimental platform in practice, and the effectiveness of some of the control methods mentioned in the introduction has been verified by controlling an inverted pendulum system. Research on the inverted pendulum is therefore of significant importance. To study the signal tracking problem, we consider the inverted pendulum model under PID and RBF neural network control. The inverted pendulum model is similar to that in [48]. The force analysis of the inverted pendulum system is shown in Figure 1.
In Figure 1, the external force exerted on the trolley and its displacement are denoted by u and x, respectively, and $\theta$ is the angle between the pendulum and the vertical direction. Differentiating the displacement gives the velocity of the trolley, $\dot{x}$, and b is the friction coefficient between the trolley and the guide rail. The resistance of the guide rail to the trolley in the horizontal direction is then $b\dot{x}$.
In addition, the interaction between the trolley and the pendulum is decomposed into two mutually perpendicular forces in the vertical plane, where $f_V$ and $f_H$ denote the components in the vertical and horizontal directions, respectively.
We describe the motion of the pendulum rod with three motions: the horizontal motion of the center of gravity, the vertical motion of the center of gravity, and the rotation around the center of gravity. According to Newton's laws of mechanics, we obtain three equations of motion, (1) to (3), where $\phi = \theta + \pi$ represents the angle between the pendulum and the vertical downward direction. Equation (1) can be rewritten as equation (4), and equation (2) as equation (5). The resultant force on the trolley in the horizontal direction is given by equation (6). Substituting (4) into (6), the external force u can be written as equation (7). Similarly, substituting (4) and (5) into (3) yields equation (8). Equations (7) and (8) are the nonlinear equations of motion of the vehicle-mounted inverted pendulum system. To facilitate control, we linearize the system. Suppose that $\theta \le 20^\circ$ is within the error range for maintaining stability. Because $\phi = \theta + \pi$ and $\theta$ is small, $\cos\phi \approx -1$, $\sin\phi \approx -\theta$, and $\dot{\phi}^2 = \dot{\theta}^2 \approx 0$. After linearization, the system is transformed into the mathematical model (9). Taking the Laplace transform of (9) gives the equation set (10). Eliminating X(s) from (10) yields the transfer function from the trolley input to the pendulum angle, equation (11), where $q = (M + m)(I + ml^2) - (ml)^2$ is a constant.
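For orientation, the widely used textbook form of the linearized cart-pendulum model under the same small-angle approximations is sketched below. It is an assumed standard form and may differ in notation or sign convention from equations (9) to (11); the symbol g (gravitational acceleration) is an assumption, while M, m, I, l, b, and q are used as in the text.

```latex
% Assumed textbook linearization about the upright position (small \theta).
% g is an assumed symbol for gravitational acceleration.
\[
\begin{aligned}
(I + ml^2)\,\ddot{\theta} - mgl\,\theta &= ml\,\ddot{x}, \\
(M + m)\,\ddot{x} + b\,\dot{x} - ml\,\ddot{\theta} &= u .
\end{aligned}
\]
% Laplace transform with zero initial conditions, then eliminate X(s):
\[
\frac{\Theta(s)}{U(s)}
  = \frac{\dfrac{ml}{q}\,s}
         {s^3 + \dfrac{b\,(I + ml^2)}{q}\,s^2
              - \dfrac{(M + m)\,mgl}{q}\,s
              - \dfrac{b\,mgl}{q}},
\qquad q = (M + m)(I + ml^2) - (ml)^2 .
\]
```

The positive-real-part pole implied by the negative denominator coefficients reflects the open-loop instability of the upright equilibrium.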

PID-RBF and IPID-RBF Control Design.
Here, we introduce the PID-RBF control and the IPID-RBF control. A PID controller has three important parameters: the proportional regulation coefficient $k_p$, the integral regulation coefficient $k_i$, and the differential regulation coefficient $k_d$. The proportional coefficient $k_p$ changes the response speed of the system and improves its regulation precision. The integral coefficient $k_i$ eliminates the residual error. The differential coefficient $k_d$ improves the dynamic performance of the system. As shown in Figure 2, different PID parameters give different response speeds and stability. When the response curve oscillates significantly, $k_p$ should be increased, $k_i$ should be increased, and $k_d$ should be made smaller. When the error of the response curve is large, $k_p$ should be reduced, $k_i$ should be reduced, and $k_d$ should be increased appropriately. Following this method, the best parameters are selected to achieve the best control effect for the inverted pendulum system.
The RBF neural network is a three-layer feed-forward neural network in which the mapping from the hidden layer to the output is linear, which greatly speeds up learning and avoids the problem of local minima. RBF neural network supervision control learns from the traditional controller, adjusts the weights of the network online, and drives the feedback control input $u_p(k)$ toward zero. The structure of the RBF neural network supervision control system is shown in Figure 3.
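The role of the three PID coefficients can be illustrated with a minimal discrete positional PID sketch. The class name `DiscretePID`, the rectangular integration, and the backward-difference derivative are illustrative assumptions, not the implementation used in the simulations.

```python
from dataclasses import dataclass


@dataclass
class DiscretePID:
    """Positional discrete PID (hypothetical helper, not the authors' code)."""
    kp: float  # proportional coefficient k_p
    ki: float  # integral coefficient k_i
    kd: float  # differential coefficient k_d
    ts: float  # sampling time (assumed uniform)
    integral: float = 0.0
    prev_error: float = 0.0

    def step(self, error: float) -> float:
        """Return the control output for the current error sample."""
        # Rectangular (forward Euler) accumulation of the error.
        self.integral += error * self.ts
        # Backward-difference approximation of the error derivative.
        derivative = (error - self.prev_error) / self.ts
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative
```

For example, `DiscretePID(kp=2.0, ki=1.0, kd=0.5, ts=0.1).step(1.0)` combines a proportional term of 2.0, an integral term of 0.1, and a derivative term of 5.0.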
In the RBF network structure, the input signal of the network is taken as r(k), $H = [h_1, \ldots, h_m]^T$ is the radial basis function vector, and $h_j$ is the Gaussian basis function

$$h_j = \exp\left(-\frac{\|r(k) - C_j\|^2}{2 b_j^2}\right),$$

where $j = 1, \ldots, m$, $b_j > 0$ is the base width parameter of node j, and $C_j$ is the center vector of node j. The weight vector of the network is $W = [w_1, \ldots, w_m]^T$. The output of the RBF network is

$$u_n(k) = h_1 w_1 + \cdots + h_j w_j + \cdots + h_m w_m,$$

where m is the number of hidden layer neurons in the network. The control law is given by

$$u(k) = u_p(k) + u_n(k). \qquad (15)$$

The performance indicator for the neural network adjustment is

$$E(k) = \frac{1}{2}\left(u_n(k) - u(k)\right)^2. \qquad (16)$$

The error caused by the approximation in (17) is compensated by weight adjustment.
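The forward pass of such a network can be sketched in a few lines. This is a minimal illustration assuming the common Gaussian form $\exp(-\|x - C_j\|^2 / (2 b_j^2))$; the function names and the list-based representation are hypothetical.

```python
import math


def gaussian_hidden_layer(x, centers, widths):
    """Return [h_1, ..., h_m] for input vector x.

    centers[j] is the center vector C_j and widths[j] is the base width
    b_j > 0 (assumed Gaussian form, see lead-in).
    """
    h = []
    for c_j, b_j in zip(centers, widths):
        sq_dist = sum((xi - ci) ** 2 for xi, ci in zip(x, c_j))
        h.append(math.exp(-sq_dist / (2.0 * b_j ** 2)))
    return h


def network_output(h, w):
    """Linear output layer: u_n = h_1*w_1 + ... + h_m*w_m."""
    return sum(hj * wj for hj, wj in zip(h, w))
```

Because the output is linear in the weights, only the weights need to be adapted online once the centers and widths are fixed.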
The gradient descent method is adopted to adjust the weights of the network, where $\eta$ is the learning rate and $\alpha$ is the momentum factor. The two supervision control schemes are as follows.
(1) The PID controller based on the RBF neural network supervision control method (PID-RBF) processes the error signal $e_1(k)$, the cumulative error signal $\sum_{s=0}^{k} e_1(s)$, and the difference between the current and previous errors, $\Delta e_1(k) = e_1(k) - e_1(k-1)$. These three terms form the PID-RBF control law (20). In the network input vector of the PID-RBF control, $T_1$ represents the sampling time, $e_1(k) = r(k) - y_1(k)$, r(k) is the input signal, $y_1(k)$ is the output sequence of the system response, and $s = 0, 1, \ldots, k$. The network output of the PID-RBF control is $u_n(k) = h_1 w_1 + \cdots + h_j w_j + \cdots + h_m w_m$ (22). The total control law follows from (15), (20), and (22), and the weights are adjusted by the gradient descent rule with momentum.
(2) The improved PID controller based on the RBF neural network supervision control strategy (IPID-RBF) processes the error signal $e_2(k)$, the cumulative error signal $\sum_{s=0}^{k} e_2(s)$, and the difference between the current and previous errors, $\Delta e_2(k) = e_2(k) - e_2(k-1)$.
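A gradient-descent weight update with momentum for supervision control can be sketched as follows. The sketch assumes the usual derivation in which u(k) is treated as a fixed target in $E(k) = \frac{1}{2}(u_n(k) - u(k))^2$, so that $\partial E/\partial w_j = (u_n - u) h_j = -u_p h_j$; the function name and argument layout are hypothetical, and the default values of `eta` and `alpha` follow the simulation parameters stated in the paper.

```python
def update_weights(w, w_prev, h, u_p, eta=0.30, alpha=0.05):
    """One weight-adjustment step (illustrative, assumed update form).

    w      : current weights w(k-1)
    w_prev : weights one step earlier, w(k-2), used by the momentum term
    h      : hidden-layer outputs h_j(k)
    u_p    : feedback (PID) control input, equal to u(k) - u_n(k)
    """
    return [
        # Gradient step eta*u_p*h_j plus momentum alpha*(w(k-1) - w(k-2)).
        wj + eta * u_p * hj + alpha * (wj - wpj)
        for wj, wpj, hj in zip(w, w_prev, h)
    ]
```

With this rule the weights stop changing only when $u_p(k)$ is zero, which matches the stated goal of driving the feedback control input toward zero.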

Control Algorithm Stability Analysis.
For the stability analysis of the control algorithm, we consider the performance indicator adjusted by the neural network, equation (16): $E(k) = \frac{1}{2}(u_n(k) - u(k))^2$. The analysis starts from the state equation for the free motion of a linear time-invariant discrete system and the corresponding discrete Lyapunov equation, where G is the system matrix and P and Q are positive definite real symmetric matrices. A Lyapunov function is then defined; when $V(0) = 0$, x is the solution of the state equation in [47]. From the increment of the Lyapunov function (34), $\Delta V_1[x(k)]$ can be expressed. Combining the previous error and weight adjustment methods, we can express $e_1(k+1)$, and the gradient descent weight adjustment of the network can be rewritten accordingly. Combining equations (36) and (37) with the performance indicator $E(k) = \frac{1}{2}(u_n(k) - u(k))^2 = \frac{1}{2}e_1^2(k)$ gives the resulting expression, where $G_1 = \partial e_1(k)/\partial w(k)$.
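As a reminder of the standard discrete-time result that this analysis builds on, the textbook relations are sketched below. They are stated in their usual general form, which is assumed here and may differ in notation from the numbered equations of the paper.

```latex
% Standard discrete-time Lyapunov relations (textbook form, assumed).
\[
x(k+1) = G\,x(k), \qquad V[x(k)] = x^{T}(k)\,P\,x(k),
\]
\[
\Delta V[x(k)] = V[x(k+1)] - V[x(k)]
             = x^{T}(k)\left(G^{T} P G - P\right)x(k)
             = -\,x^{T}(k)\,Q\,x(k),
\]
\[
G^{T} P G - P = -Q .
\]
```

If, for a given positive definite Q, this equation has a positive definite solution P, then $\Delta V < 0$ for all $x(k) \neq 0$ and the equilibrium $x = 0$ is asymptotically stable.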

Simulation Studies
This section provides simulations that show the inverted pendulum tracking performance of PID-RBF supervision control and IPID-RBF supervision control. We consider the inverted pendulum model and take the swing angle of the pendulum rod as the controlled variable. Under zero initial conditions, the transfer function of the inverted pendulum system is given by (40). This transfer function is discretized by the z transformation. The simulation parameters are $k_p = 300$, $k_d = 100$, $k_i = 500$, $ts = 0.001$, $\eta = 0.30$, and $\alpha = 0.05$.
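Once a transfer function has been discretized, the plant can be simulated sample by sample as a difference equation. The following is a minimal sketch of that step; the class name `DiscretePlant` and the coefficient convention are assumptions, and the coefficients in the usage note are hypothetical values, not the discretized pendulum model.

```python
from collections import deque


class DiscretePlant:
    """y(k) = -a1*y(k-1) - ... - an*y(k-n) + b1*u(k-1) + ... + bn*u(k-n).

    Hypothetical helper: a and b are the denominator and numerator
    coefficients of a z-domain transfer function with leading
    denominator coefficient normalized to one.
    """

    def __init__(self, a, b):
        self.a = list(a)
        self.b = list(b)
        # Zero initial conditions; index 0 holds the most recent sample.
        self.y_hist = deque([0.0] * len(self.a), maxlen=len(self.a))
        self.u_hist = deque([0.0] * len(self.b), maxlen=len(self.b))

    def step(self, u):
        """Apply input u(k) and return output y(k)."""
        y = (-sum(ai * yi for ai, yi in zip(self.a, self.y_hist))
             + sum(bi * ui for bi, ui in zip(self.b, self.u_hist)))
        self.y_hist.appendleft(y)
        self.u_hist.appendleft(u)
        return y
```

For instance, `DiscretePlant(a=[-0.5], b=[1.0])` realizes the hypothetical first-order system $y(k) = 0.5\,y(k-1) + u(k-1)$.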
In Figure 4, a square wave signal with an amplitude of one is tracked using neural network supervision control and traditional PID control. The figure shows that the amplitude oscillation under neural network supervision control is smaller than that under the traditional PID controller. The IPID-RBF supervision control settles fastest and most smoothly. PID-RBF and IPID-RBF supervision control both stabilize faster and track more accurately than pure PID control.
Figure 5 shows the control curves of the RBF neural network supervision control when tracking the input square wave signal: $u_n$ is the online learning adjustment curve of the RBF network, $u_p$ is the PID adjustment curve, and the superposed curve u is the sum of $u_n$ and $u_p$. Comparing the three curves, when the square wave signal switches from positive to negative, the online learning adjustment curve $u_n$ changes from zero to positive, the PID adjustment curve $u_p$ is negative at the jump instant, and the variation of the superposed curve u is relatively smooth.
In Figure 6, a step signal with an amplitude of one is tracked using RBF neural network supervision control and traditional PID control. The amplitude oscillation of the RBF network supervision control is smaller than that of the conventional PID control. PID-RBF supervision control is stable after 3.8 s, and IPID-RBF supervision control is stable after 2.5 s. IPID-RBF supervision control is therefore more stable and accurate.
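Settling times like those quoted for the step response can be read off a sampled response with a small helper. The sketch below assumes a 2% tolerance band around the target, which is a common convention; the criterion used to obtain the values in the text is not stated, and the function name is hypothetical.

```python
def settling_time(t, y, target, tol=0.02):
    """Return the first time after which y stays within tol*|target| of target.

    t and y are equally long sequences of sample times and outputs.
    Returns None if the last sample is still outside the band.
    The 2% default band is an assumption.
    """
    band = tol * abs(target)
    last_outside = -1
    for i, yi in enumerate(y):
        if abs(yi - target) > band:
            last_outside = i
    if last_outside == len(y) - 1:
        return None
    return t[last_outside + 1]
```

Applying the same helper to the PID, PID-RBF, and IPID-RBF responses gives a like-for-like comparison of how quickly each controller settles.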
Figure 7 shows the control curves of the RBF neural network supervision control when tracking the step input signal. The online learning adjustment curve $u_n$ and the superposed curve u follow essentially the same trend.
In Figure 8, a sine wave signal r with an amplitude of one is tracked using RBF neural network supervision control and traditional PID control. The amplitude oscillation of the RBF network supervision control is smaller than that of pure PID, and the RBF neural network supervision control performs better than pure PID control over the period from 0 s to 20 s. The RBF neural network supervision control curves ($y_1$ or $y_2$) nearly coincide with the r curve. The RBF network supervision control has less error and better accuracy than pure PID control. The partially enlarged view clearly shows that the IPID-RBF control has the highest coincidence and the best tracking effect.
Figure 9 shows the control curves of the RBF neural network supervision control when tracking the input sine wave signal. The change of the PID regulating curve $u_p$ is small, and the change of u is gentle.
From Figures 8 and 9, the online learning adjustment curve $u_n$ of the RBF neural network supervision control and the input signal r show opposite trends under the sine wave input. As the input signal increases, the adjustment curve $u_n$ decreases; as the input signal r decreases, the adjustment curve $u_n$ increases. The adjustment trend of $u_n$ is related to the change of the weight w. During adjustment, the value of h is positive, and $u_n = \sum_{j=1}^{m} h_j w_j$. When the input signal is positive, the weight w changes so that $\sum_{j=1}^{m} h_j w_j$ is negative; when the input signal is negative, the weight w changes so that $\sum_{j=1}^{m} h_j w_j$ is positive. In the figures, r denotes the input signal, y the PID output, $y_1$ the PID-RBF output, and $y_2$ the IPID-RBF output.
Figure 10 shows the tracking of a step signal with an amplitude of one under a pulse-type disturbance with an amplitude of one and a duration of 0.5 s. Pure PID and RBF neural network supervision control are each used for tracking. The disturbance is added at 5 s. When the disturbance appears, the amplitude oscillation of the RBF network supervision control is smaller than that of pure PID control. The system then returns quickly to the stable state after the disturbance disappears and resumes tracking the step signal. Figure 11 shows the control curves of the RBF neural network supervision control with a known disturbance signal and a step signal input.
There are three large-scale adjustments in total: the adjustment toward system stability before the disturbance is added, the adjustment that compensates for the error caused by the disturbance, and the readjustment toward system stability. In RBF network supervision control, the curves $u_n$ and u tend to be stable after online learning.
In Figure 12, the input is a square wave signal with an amplitude of one. The pulse-type disturbance has an amplitude of two and a duration of 0.5 s, and it is added at 5 s. When the disturbance appears, the amplitude oscillation of the RBF network supervision control is smaller than that of pure PID. The figure shows that, under the IPID-RBF control, the system quickly returns to the stable state without oscillation and tracks the square wave signal.
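The test signals used in these experiments are simple to generate. The sketch below is illustrative only: the function names are hypothetical, and the period of the square wave is an assumed parameter because the period used in the simulations is not stated.

```python
def square_wave(t, amplitude=1.0, period=4.0):
    """Square wave that is +amplitude in the first half of each period.

    The default period is a hypothetical value.
    """
    return amplitude if (t % period) < period / 2.0 else -amplitude


def pulse_disturbance(t, amplitude, start, duration):
    """Rectangular pulse active on the half-open interval [start, start + duration)."""
    return amplitude if start <= t < start + duration else 0.0
```

For the scenario described for Figure 12, the disturbance would be `pulse_disturbance(t, amplitude=2.0, start=5.0, duration=0.5)`, evaluated at each sample time; how it enters the loop (for example, added to the plant output) is a modeling choice not fixed by this sketch.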
Figure 13 shows the control curves of the RBF neural network supervision control with a square wave input and a known disturbance signal. There are three large-scale adjustments between 0 and 10 seconds: the adjustment toward system stability before the disturbance is added, the adjustment that compensates for the error caused by the disturbance, and the readjustment toward system stability. As in the step input case, both traditional PID and RBF neural network supervision control compensate for the disturbance during its 0.5 s duration. After these 0.5 s, the system is readjusted until it stably tracks the signal. Among the controllers, the IPID-RBF control requires the least adjustment time.
In Figure 14, the input is a sine wave signal with an amplitude of one. The pulse-type disturbance has an amplitude of two and a duration of 0.5 s, and it is added at 5 s. The deviation of $y_1$ or $y_2$ from r is smaller than that of pure PID, and the RBF neural network supervision control curve coincides more closely with the given sine signal after the disturbance disappears.
Figure 15 shows the control curves of the RBF network supervision control with a sine wave input and a disturbance of amplitude one and duration 0.5 s. Compared with the undisturbed case, the value of the online learning adjustment curve $u_n$ is larger. There are three large pulse-type jumps in $u_n$, $u_p$, and the superposed curve u. At other times, the online learning adjustment curve $u_n$ and the superposed curve u follow essentially the same trend.
Taken together with Table 1, the results show that the IPID-RBF control reaches a stable state in a short time, with or without interference.

Conclusions
The simulation results show that the IPID-RBF control applied to the model gives better performance and tracks the input signal well. Under three different input signals, real-time tracking of the input signal is achieved through online learning, and the error between input and output is continuously adjusted so that the system error eventually approaches zero. Compared with traditional PID control, the IPID-RBF control has the best tracking effect under all three input signals and overcomes the low accuracy of traditional PID. Compared with the PID-RBF supervision control, the IPID-RBF control has a smaller oscillation amplitude during adjustment, reaches a stable state in a shorter time, and has strong anti-interference ability. The algorithm is simple to implement and has good tracking accuracy. Therefore, the improved control algorithms have good robustness, and the stability of the system is good. The simulation figures and data show that the IPID-RBF control regulates the controlled object through online learning to achieve online identification and control. It has high control accuracy, good dynamic characteristics, and good anti-interference ability.

Data Availability
No data were used to support this study.

Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this paper.

Discrete Dynamics in Nature and Society