Research Article A Multivariable Adaptive Control Approach for Stabilization of a Cart-Type Double Inverted Pendulum

This paper considers the design and practical implementation of linear-based controllers for a cart-type double inverted pendulum (DIPC). The system consists of two linked pendulums mounted on a sliding cart, giving a structure with three degrees of freedom and a single control input. The control objective is to keep both pendulums at the unstable up-up equilibrium point. Modeling is based on the Euler-Lagrange equations, and the resulting nonlinear model is linearized around the up-up position. First, the LQR method is used to stabilize the DIPC with a feedback gain matrix that minimizes a quadratic cost function, without using an observer to estimate the unmeasured states. In the next step, we use an LQG controller, which combines Kalman-Bucy filter estimation with LQR feedback control; it gives better steady-state performance but poor robustness. Finally, to cope with the unknown nonlinear model parameters, an adaptive controller is designed. This controller is based on the Model Reference Adaptive System (MRAS) method, which uses a Lyapunov function to eliminate the defined state error. It improves both the steady-state and disturbance responses.


Introduction
Nonlinear systems such as the classic inverted pendulum have been widely used as test beds in control laboratories to investigate the effectiveness of control methods on real systems [1, 2]. The cart-type double inverted pendulum (DIPC) is an extension of the single inverted pendulum (SIP) system. Its control problem is more difficult and challenging, because the controller must bring both pendulums from the stable hanging equilibrium point to the unstable up-up equilibrium point and keep the system state around this point.
The problem of controlling the DIPC is separated into two stages: a swing-up control and a balancing control strategy [3, 4]. Common approaches to balance control of the DIPC stabilize the system with a feedback gain matrix, which is also the approach used in this paper. Optimal and SDRE methods, neural networks, GA, and other model-based and intelligent algorithms are widely used to adjust this gain matrix [1, 2]. Most of these methods are computationally expensive and require perfect modeling or training processes. Some model-based robust approaches, such as $H_2$ and $H_\infty$ optimization, are also used to overcome the uncertainties due to modeling imperfections. In this paper we take the optimal solution as our baseline and try to keep the changes to a minimum when moving from simulation to the practical system, by using an adaptive method that adapts the optimal gain to the uncertainties encountered. The proposed approach is the so-called Lyapunov-based MRAS adaptive control. The design procedure in this method is simple and does not involve excessive computation, yet it readily tracks the uncertainties.
In order to design the adaptive controller, we first use the LQR method, in which a quadratic performance criterion is considered for designing an optimal controller. It has been proved that this performance index can be minimized by a constant feedback gain matrix, which is obtained from the solution of the Riccati equation [5]. In addition, a Kalman-Bucy filter is used as a modification of LQR to estimate the unmeasured states, yielding the so-called LQG method. In the next step, an adaptive controller is designed based on the MRAS approach: the stabilized closed-loop system produced by the LQR method is taken as the reference model, and a Lyapunov function is introduced to eliminate the state error and obtain a parameter adjustment law.
We consider the disturbance response and the steady-state behavior of the system as two factors for investigating the efficiency of the LQR and adaptive controllers. In the steady-state situation, the DIPC acts almost as a linear system; however, nonlinearities such as stick-slip friction cause unwanted behaviors such as limit cycles [6], which result in poor steady-state behavior. Under the deviated situation, on the other hand, the intrinsic nonlinearities of the DIPC are dominant and can make the closed-loop system unstable. These nonlinearities are not considered in designing the LQR controller, because it deals with linear models. To cope with these problems, we propose designing two separate adaptive controllers for the steady-state and deviated conditions, because the major nonlinearities are different in these two conditions.

Double Inverted Pendulum System
A schematic view of a mechanical DIPC system is depicted in Figure 1. The first pendulum is placed on a cart and the second pendulum pivots on the first one. The cart can move freely along a horizontal track, and a force $u$ is exerted on the cart in order to balance the whole system. Some usual assumptions have been made to simplify the modeling of the system: the masses of the pendulums and the cart are homogeneously distributed and concentrated in their centers of gravity, and friction is neglected. The latter assumption, though, causes some difficulties in the practical operation of the system.

Modeling
According to the schematic depiction of the DIPC system, the mathematical model is derived using the Lagrange method, assuming that there is negligible damping between the mechanical parts. The resulting equations of motion (2.1) are expressed in terms of the matrices $D(\theta)$, $F(\theta, \dot{\theta})$, and $G(\theta)$.
The centers of mass of the pendulums are assumed to lie at the geometrical centers of the links, which are solid rods (2.3). The nomenclature and parameters of $D(\theta)$, $F(\theta, \dot{\theta})$, and $G(\theta)$ are given at the end of the paper.

Control
The control system is designed to stabilize the pendulums in the up-up position. Therefore, the designed controllers are of the regulatory type, which force the states to remain near zero. The controllers discussed in this paper are based on a linear model.
To design a control law we introduce the state vector $x$, formed from the three generalized coordinates and their time derivatives. The Lagrange equations of motion (2.1) can then be reformulated into a sixth-order system of ordinary differential equations. The required linear model is obtained by linearizing this system around $x = 0$, which yields
\[ \dot{x} = Ax + Bu, \qquad y = Cx. \quad (2.7) \]
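The linearization step can be illustrated numerically. The sketch below is only an illustration under stated assumptions: it uses a hypothetical single inverted pendulum with direct torque-like input (not the DIPC model, whose matrices are not reproduced here), and the names `f`, `linearize`, and `LENGTH` are introduced for this example. It approximates the Jacobians $A = \partial f/\partial x$ and $B = \partial f/\partial u$ at the upright equilibrium by central differences.

```python
import numpy as np

G_ACC = 9.81   # gravitational acceleration, m/s^2
LENGTH = 0.5   # hypothetical pendulum length in m (not a DIPC parameter)

def f(x, u):
    """Toy nonlinear model: x = [angle, angular rate], upright at angle = 0."""
    theta, omega = x
    return np.array([omega, (G_ACC / LENGTH) * np.sin(theta) + u])

def linearize(f, x0, u0, eps=1e-6):
    """Central-difference Jacobians A = df/dx and B = df/du at (x0, u0)."""
    n = len(x0)
    A = np.zeros((n, n))
    for i in range(n):
        dx = np.zeros(n)
        dx[i] = eps
        A[:, i] = (f(x0 + dx, u0) - f(x0 - dx, u0)) / (2 * eps)
    B = ((f(x0, u0 + eps) - f(x0, u0 - eps)) / (2 * eps)).reshape(n, 1)
    return A, B

# Linearize around the upright equilibrium x = 0, u = 0.
A, B = linearize(f, np.zeros(2), 0.0)
open_loop_poles = np.linalg.eigvals(A)
```

For this toy model one open-loop pole lies in the right half-plane, which reflects the instability of the upright position that the feedback controllers have to remove.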

Linear Quadratic Regulator (LQR)
The linear system in (2.7) can be stabilized using a linear control law:
\[ u = -Kx. \quad (2.9) \]
In order to design an optimal control law, $K$ must be calculated such that a given performance criterion or cost function is minimized. A quadratic cost function is proposed for DIPC stabilization as
\[ J = \int_0^\infty \left( x^T Q x + u^T R u \right) dt, \quad (2.10) \]
where $Q$ and $R$ in (2.10) are the state and control weighting matrices and are chosen to be square and symmetric.
It is proved [5] that the minimization of the quadratic cost function for a linear system has a closed-form solution, which yields an optimal control law of the form (2.9) and is given by
\[ u = -R^{-1} B^T P x, \quad (2.11) \]
where $P$ is the solution of the Riccati equation
\[ A^T P + P A - P B R^{-1} B^T P + Q = 0, \quad (2.12) \]
and $K$ is the optimal linear feedback gain
\[ K = R^{-1} B^T P. \quad (2.13) \]
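As a minimal sketch of how such a gain can be computed, the code below solves the continuous-time algebraic Riccati equation through the stable invariant subspace of the Hamiltonian matrix and then forms $K = R^{-1}B^TP$ for the control law $u = -Kx$. The 2-state system and the weights are hypothetical stand-ins (a linearized single pendulum, not the DIPC model), and the helper names `solve_care` and `lqr` are introduced here for illustration only.

```python
import numpy as np

def solve_care(A, B, Q, R):
    """Solve A'P + PA - P B R^-1 B' P + Q = 0 via the Hamiltonian's stable subspace."""
    n = A.shape[0]
    Rinv = np.linalg.inv(R)
    H = np.block([[A, -B @ Rinv @ B.T], [-Q, -A.T]])
    eigvals, eigvecs = np.linalg.eig(H)
    stable = eigvecs[:, eigvals.real < 0]   # the n eigenvectors with Re(lambda) < 0
    X1, X2 = stable[:n, :], stable[n:, :]
    P = np.real(X2 @ np.linalg.inv(X1))
    return 0.5 * (P + P.T)                  # symmetrize against round-off

def lqr(A, B, Q, R):
    """Return the state-feedback gain K (for u = -K x) and the Riccati solution P."""
    P = solve_care(A, B, Q, R)
    K = np.linalg.inv(R) @ B.T @ P
    return K, P

# Hypothetical 2-state example, not the DIPC matrices.
A = np.array([[0.0, 1.0], [19.62, 0.0]])
B = np.array([[0.0], [1.0]])
Q = np.diag([10.0, 1.0])
R = np.array([[1.0]])

K, P = lqr(A, B, Q, R)
closed_loop_poles = np.linalg.eigvals(A - B @ K)
```

In practice a dedicated Riccati solver from a control library would normally be preferred; the eigenvector approach is shown only because it keeps the example self-contained.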

Linear Quadratic Gaussian (LQG)
An important issue for state feedback controllers is obtaining the states of the system in order to produce the input signal. Some states may not be available, so some kind of observer is required to estimate them. An observer may be designed by a pole placement procedure, but another problem arises when the measurements and input signals are corrupted by noise: an ordinary observer may then have convergence problems. In this case an optimal observer, called the Kalman-Bucy filter, is proposed. Suppose the noisy system is modeled as
\[ \dot{x} = Ax + Bu + w_1(t), \qquad y = Cx + w_2(t), \quad (2.14) \]
where $w_1$ and $w_2$ denote input and measurement noises, respectively, with covariance matrices $Q_0$ and $R_0$. Using an observer, the equation of the predicted states is
\[ \dot{\hat{x}} = A\hat{x} + Bu + L\,(y - C\hat{x}), \quad (2.15) \]
where $\hat{x}$ is the predicted state and $L$ is the correction gain of the prediction. The problem of designing an optimal observer then reduces to finding a gain $L$ that minimizes a quadratic cost function of the state error $e(t)$:
\[ e(t) = x(t) - \hat{x}(t). \quad (2.16) \]
The desired value of $L$ is given by
\[ L = P_0 C^T R_0^{-1}, \quad (2.17) \]
with $P_0$ satisfying the Riccati equation
\[ A P_0 + P_0 A^T - P_0 C^T R_0^{-1} C P_0 + Q_0 = 0. \quad (2.18) \]
Now, with the optimal state feedback gain $K$ and the optimal observer gain $L$ derived separately, the LQG controller can be constructed according to the so-called separation principle.
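The observer gain can be sketched numerically as well. The example below assumes the standard Kalman-Bucy forms $L = P_0 C^T R_0^{-1}$ and $A P_0 + P_0 A^T - P_0 C^T R_0^{-1} C P_0 + Q_0 = 0$, and obtains $P_0$ by integrating the filter Riccati differential equation with forward Euler steps until it settles (a simple substitute for a dedicated Riccati solver). The system matrices and covariances are hypothetical values chosen for illustration, not those of the DIPC, and `kalman_bucy_gain` is a name introduced here.

```python
import numpy as np

def kalman_bucy_gain(A, C, Q0, R0, dt=1e-3, tol=1e-10, max_steps=200_000):
    """Steady-state filter gain from Euler integration of the filter Riccati ODE."""
    n = A.shape[0]
    P0 = np.eye(n)                      # any positive definite starting covariance
    R0inv = np.linalg.inv(R0)
    for _ in range(max_steps):
        dP = A @ P0 + P0 @ A.T - P0 @ C.T @ R0inv @ C @ P0 + Q0
        P0 = P0 + dt * dP
        if np.max(np.abs(dP)) < tol:    # derivative is negligible: steady state
            break
    L = P0 @ C.T @ R0inv
    return L, P0

# Hypothetical 2-state example with a single position measurement.
A = np.array([[0.0, 1.0], [19.62, 0.0]])
C = np.array([[1.0, 0.0]])
Q0 = np.eye(2)             # assumed process-noise covariance
R0 = np.array([[0.1]])     # assumed measurement-noise covariance

L, P0 = kalman_bucy_gain(A, C, Q0, R0)
observer_poles = np.linalg.eigvals(A - L @ C)
```

The estimator dynamics are governed by $A - LC$, so the observer poles give a quick check that the estimation error decays even though the open-loop system is unstable.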

Adaptive Controller
The adaptive controller used in this paper for the DIPC system follows a model reference approach. Its goal is to modify a feedback gain matrix $K$ such that the behavior of the real system, which is affected by various kinds of uncertainties, tends to that of a desired, deterministic closed-loop system, so that the uncertainties are compensated by an appropriate feedback gain. Here the reference system is the deterministic mathematical model of the DIPC controlled with the feedback gain obtained by the LQR method.

Model Reference Adaptive System (MRAS)
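It is desirable that the system
\[ \dot{x} = Ax + Bu \quad (3.1) \]
behaves as the reference system
\[ \dot{x}_m = A_m x_m, \quad (3.2) \]
and this should be achieved with the state feedback law (3.3), in which $K_{ad}$ is the parameter to be adjusted. With this feedback, the systems (3.1) and (3.2) can be written in closed-loop form (3.4).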
Naturally, the next step is to define an error term that measures how well the adaptation performs, and then to introduce a Lyapunov function to stabilize the error dynamics [7]. The error is defined as
\[ e = x - x_m. \quad (3.5) \]
Differentiating $e$ gives
\[ \dot{e} = \dot{x} - \dot{x}_m = Ax + Bu - A_m x_m, \quad (3.6) \]
and some manipulation of (3.6) results in the error dynamics (3.7), in which the term $\Psi$ is defined by (3.8). To find a relation between the adjustment of $K_{ad}$ and the elimination of $e$, a quadratic Lyapunov function $V$ (3.9) is proposed, where $P$ and $\zeta$ are positive definite matrices and thus $V$ is also positive definite. Differentiating $V$ gives the condition (3.10) under which $V$ represents a Lyapunov function, where $Q$ is positive definite such that
\[ A_m^T P + P A_m = -Q. \quad (3.11) \]
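The error dynamics and Lyapunov function used in this derivation can be sketched as follows. This is a reconstruction under stated assumptions, not necessarily the exact form of (3.7) to (3.10): $K_{ad}$ is treated as a column vector of gains so that $u = -K_{ad}^T x$, and $K^{*}$ (a symbol introduced here) denotes the ideal constant gain for which $A - B K^{*T} = A_m$.

```latex
\begin{aligned}
\dot{e} &= A x - B K_{ad}^{T} x - A_m x_m
         = A_m e + \Psi\,(K_{ad} - K^{*}), \qquad \Psi = -B x^{T},\\
V &= e^{T} P e + (K_{ad} - K^{*})^{T} \zeta^{-1} (K_{ad} - K^{*}),\\
\dot{V} &= -e^{T} Q e
          + 2\,(K_{ad} - K^{*})^{T}\left(\Psi^{T} P e + \zeta^{-1}\dot{K}_{ad}\right).
\end{aligned}
```

The first term of $\dot{V}$ follows from $A_m^T P + P A_m = -Q$; the second term is the cross term that the parameter adjustment law has to cancel.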
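Since $A_m$ is stable, such a pair of positive definite matrices $P$ and $Q$ always exists. Now if we choose
\[ \dot{K}_{ad} = -\zeta \Psi^T P e, \quad (3.12) \]
then (3.10) reduces to $\dot{V} = -e^T Q e$, which is non-positive, and hence the error goes to zero with time.
Ultimately, combining (3.12) with (3.8), the parameter adjustment law becomes
\[ \dot{K}_{ad} = \zeta x B^T P e. \quad (3.13) \]
In discrete time this gives an iterative adaptation:
\[ K_{ad}(k+1) = K_{ad}(k) + T_s\, \zeta\, x(k) B^T P e(k), \quad (3.14) \]
where $T_s$ denotes the sampling period and $\zeta$ is a weighting factor whose effect will be addressed later.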
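The adaptation law can be tried out on a small simulation. The sketch below is a toy illustration under stated assumptions, not the DIPC experiment: a hypothetical 2-state plant `A_true` differs from the nominal model `A_nom` used to build the reference model, the control law is taken as $u = -K_{ad}^T x$, the initial gain `K0` is simply chosen by hand rather than by LQR, and the update $K_{ad} \leftarrow K_{ad} + T_s\,\zeta\,x\,B^T P e$ is applied with forward Euler integration. All numerical values and names are assumptions made for this example.

```python
import numpy as np

def lyapunov_P(A_m, Q):
    """Solve A_m' P + P A_m = -Q using the Kronecker (vectorized) form."""
    n = A_m.shape[0]
    I = np.eye(n)
    M = np.kron(I, A_m.T) + np.kron(A_m.T, I)
    p = np.linalg.solve(M, -Q.flatten(order="F"))
    return p.reshape((n, n), order="F")

A_true = np.array([[0.0, 1.0], [2.0, 0.0]])   # "real" plant, unknown to the designer
A_nom = np.array([[0.0, 1.0], [1.0, 0.0]])    # nominal model used for the design
B = np.array([0.0, 1.0])                      # single input, stored as a 1-D array
K0 = np.array([5.0, 4.0])                     # hand-picked initial gain (stands in for K_lqr)
A_m = A_nom - np.outer(B, K0)                 # reference model: nominal closed loop
P = lyapunov_P(A_m, np.eye(2))
zeta = 200.0 * np.eye(2)                      # adaptation gain (assumed value)
Ts = 1e-3                                     # sampling period in s (assumed value)

x = np.array([0.2, 0.0])
x_m = x.copy()
K_ad = K0.copy()
peak_error = 0.0
for _ in range(10_000):                       # 10 s of simulated time
    e = x - x_m
    peak_error = max(peak_error, float(np.linalg.norm(e)))
    u = -K_ad @ x
    K_ad = K_ad + Ts * (zeta @ x) * (B @ P @ e)   # discrete adaptation step
    x = x + Ts * (A_true @ x + B * u)             # plant (Euler step)
    x_m = x_m + Ts * (A_m @ x_m)                  # reference model (Euler step)
final_error = float(np.linalg.norm(x - x_m))
```

Larger entries in `zeta` make the gain move faster, which mirrors the role of the adaptation gain as a convergence-rate setting; very large values combined with a coarse `Ts` can make the discrete update unreliable.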

Practical Implementation
The last step is to apply the discussed controllers to the practical DIPC system. The constructed system is depicted in Figure 2 (this system is installed in the Robotic Lab of the electrical and computer faculty at Tabriz University). For the forced motion, a rubber belt is attached to the cart and driven by a 144 W PMDC motor, which is controlled via a PWM driver. Three shaft encoders are the only sensors used; they measure the positions of the cart and the pendulums. Since no gearbox is used, no backlash is observed, and the use of ball bearings allows friction in the rotating parts to be neglected. The horizontal displacement, however, is subject to some nondeterministic friction.
The structure of the DIPC control system is depicted in Figure 3. The interface between the hardware and software parts of the system is a National Instruments PCI-6601 data acquisition card, which provides encoder readers and I/O ports. The control software is implemented with the xPC Target toolbox from MATLAB. The controllers are not discretized; they are implemented in continuous form and run at a sampling frequency of 50 kHz.
A primary issue is obtaining the states of the system. Three of them are measured directly by the incremental shaft encoders, but the rest, namely the velocities of the parts, are not available. Without tacho sensors, a differentiation process is required to produce the remaining states, which is a challenging task because of the quantized output of the position sensors. Except for the LQG controller, which uses an optimal estimator to produce the states, a direct difference approximation combined with prefilters for smoothing (4.1) is used in this paper, where the cut-off frequency of the filters is obtained by trial and error.
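The velocity estimation described above can be sketched as a backward difference followed by a first-order low-pass filter. The exact filter structure and cut-off frequency used on the real system are not given here, so the filter form, the 10 Hz cut-off, the 1 kHz sampling rate, and the test signal below are all assumptions made for illustration; only the 0.1 degree encoder resolution is taken from the text.

```python
import math

def estimate_velocity(positions, Ts, cutoff_hz):
    """Backward difference followed by a first-order low-pass filter."""
    tau = 1.0 / (2.0 * math.pi * cutoff_hz)   # filter time constant
    alpha = Ts / (tau + Ts)                   # discrete smoothing factor
    raw, filtered = [0.0], [0.0]
    for k in range(1, len(positions)):
        d = (positions[k] - positions[k - 1]) / Ts
        raw.append(d)
        filtered.append(filtered[-1] + alpha * (d - filtered[-1]))
    return raw, filtered

def rms_error(estimate, reference):
    """Root-mean-square difference between two equally long sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(estimate, reference)) / len(estimate))

Ts = 1e-3                                     # assumed sampling period, s
q = math.radians(0.1)                         # encoder resolution: 0.1 degree
w = 2.0 * math.pi * 0.5                       # assumed 0.5 Hz test motion
t = [k * Ts for k in range(4000)]
true_pos = [0.05 * math.sin(w * tk) for tk in t]
true_vel = [0.05 * w * math.cos(w * tk) for tk in t]
measured = [q * round(p / q) for p in true_pos]   # quantized encoder reading

raw, filtered = estimate_velocity(measured, Ts, cutoff_hz=10.0)
rms_raw = rms_error(raw, true_vel)
rms_filtered = rms_error(filtered, true_vel)
```

The raw difference jumps by `q / Ts` whenever the encoder count changes, which is the spike mechanism discussed in this section; the filter trades those spikes for a small phase lag, so the cut-off frequency has to be tuned.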
To apply the LQR controller, the weighting matrices $Q$ and $R$ used in (2.12) and (2.13) are chosen as
\[ Q = \mathrm{diag}(1000, 50, 50, 20, 700, 700), \qquad R = 1, \quad (4.2) \]
which were initially taken from [1] and then modified by trial and error to satisfy the conditions of practical implementation, for example, fast disturbance rejection and appropriate steady-state behavior. The resulting feedback gain matrix $K_{lqr}$ then follows from (2.12) and (2.13).
For the application of the LQG controller, some remarks are needed on the effects of the shaft encoders and the process cycle on the system. Incremental position sensors, which use an on-off optical method to produce the position data, provide limited output resolution, namely 0.1 degree in our case. A similar limitation appears on the input side, with a resolution of approximately 0.01 V m. These discontinuous signals cause major difficulty in the differentiation process by producing spikes in the velocity signals, which are the main source of undesirable chattering in the steady-state behavior. The chattering problem can be solved by Kalman filter estimation of the states, treating the limited resolution discussed above as noise according to (2.14). The covariance matrices required in (2.17) and (2.18) are derived using "Quantizer" blocks in Simulink: subtracting the signals of simulations with and without this block gives the equivalent noise signal, from which the required covariances are easily calculated.
Finally, for the implementation of the adaptive controller, the closed-loop system obtained by the LQR method on the ideal mathematical model is taken as the reference model. The adaptive controller adjusts the feedback gain according to (3.14), starting from the initial value $K_{lqr}$, and eliminates the state error between the practical system and the ideal mathematical reference model. In (3.14), $\zeta$ is the adaptation gain, which determines the convergence rate of the adjusted parameters and is tuned as follows.
As discussed earlier, the DIPC shows different nonlinear characteristics in steady-state and deviated situations; consequently, using two separate adaptive controllers is reasonable. In the steady-state situation, the system states change slowly, which implies that the convergence rate must be small, so a small adaptation gain is chosen. In the deviated condition, however, the system states change quickly, so a higher adaptation gain is chosen.
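The covariance estimation described earlier in this section, in which a signal is compared with and without quantization, can be sketched in a few lines. The example is an illustration only: it uses a hypothetical sinusoidal angle signal in place of the Simulink simulation output, and compares the sample variance of the quantization error with the common uniform-noise model $q^2/12$. Only the 0.1 degree resolution is taken from the text.

```python
import math

q = math.radians(0.1)                         # encoder resolution: 0.1 degree
N = 20000
# Hypothetical "ideal" signal: a 0.3 rad sinusoid standing in for a simulated angle.
ideal = [0.3 * math.sin(2.0 * math.pi * k / 5000.0) for k in range(N)]
quantized = [q * round(v / q) for v in ideal] # same signal after the quantizer

# Equivalent noise: difference between the runs with and without quantization.
noise = [a - b for a, b in zip(quantized, ideal)]
mean = sum(noise) / N
variance = sum((n - mean) ** 2 for n in noise) / N

uniform_model = q * q / 12.0                  # variance of uniform noise on [-q/2, q/2]
```

For a single measured channel, `variance` would play the role of the corresponding diagonal entry of the measurement covariance; repeating the procedure per channel would fill in the remaining entries.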

Experimental Results
The practical results of applying the discussed controllers to the DIPC are depicted in Figure 4 (disturbance responses) and Figure 5 (steady-state behavior), which clearly illustrate the superiority of the adaptive controller in stabilizing the pendulums. The major problem with the LQG controller is its lack of robustness, which is why it shows a poor disturbance response. However, because it compensates for the shortcomings of the encoders, which were a major cause of chattering, the LQG method is an effective control strategy in the nondeviated working mode, as is evident in Figure 5. Comparing the adaptive controller with LQR, it is clear that the adaptive controller has a shorter settling time, especially for the pendulums. It is possible to make this time even smaller by adjusting the adaptation gain, but this may reduce the stability range of the system.
In Figure 5, as mentioned previously, the effect of spikes on the control input, and correspondingly on the position graphs, is clearly visible. Another observation is the semiperiodic behavior of the pendulums. Although an exact mathematical analysis is not given here, uncertainties caused by modeling deficiencies, mainly in friction, are the reason for such nonasymptotic but bounded results. A complete discussion is given in [6, 8].
Setting aside the LQG controller, which showed poor overall efficiency, the steady-state error of the LQR controller is approximately twice that of the adaptive controller for the pendulums. In general, taking into account both the disturbance and steady-state responses, the adaptive controller performs better than LQR.

Conclusion
Three linear model-based controllers were designed and implemented in this paper. The LQG method showed very good steady-state behavior, but in general it is not appropriate for controlling the DIPC. The LQR method, which showed average efficiency, was used as a baseline to demonstrate the advantages of the adaptive controller.
The DIPC is intrinsically a highly nonlinear system. Moreover, there are other unmodeled phenomena such as friction, motor nonlinearities, and belt elasticity. These affect the system behavior and are not considered in the design of the LQR controller; however, detailed modeling of the physical system is a laborious task. Therefore, a model reference adaptive controller (MRAS) is designed, which starts from the LQR gain and adapts itself over time. This controller shows a satisfactory response in both steady-state and deviated conditions. It also uses less energy than the LQR in the real system. Moreover, once the optimized feedback gain has been estimated, one may apply this gain directly to the system.
Finally, we remark that when the DIPC deviates from its linear region around the up-up position, the intrinsic nonlinearities are dominant, whereas within the linear region other nonlinearities such as friction become important. This leads to the development of two independent adaptation processes, one for the steady-state condition and one for the deviated condition. A further way to obtain better results is to apply a dynamic adaptation gain, which would give a faster convergence rate of the adjusted parameters and may be regarded as future work.

Nomenclature
$m_0$: Equivalent mass of the cart system, 0.71 kg
$m_1$: Mass of first pendulum, 0.35 kg
$m_2$: Mass of second pendulum, 0.2 kg
$l_1$: Distance from a pivot joint to the first pendulum center of mass, 0.277 m
$l_2$: Distance from a pivot joint to the second pendulum center of mass, 0.176 m
$L_1$: Total length of first pendulum, 0.4 m
$L_2$: Total length of second pendulum, 0.35 m
$I_1$: Moment of inertia of first pendulum, 0.0145 kg·m²
$I_2$: Moment of inertia of second pendulum, 0.007 kg·m²
$\theta_0$: Wheeled cart position
$\theta_1$: First pendulum angle
$\theta_2$: Second pendulum angle
$u$: Control force
$g$: Gravity constant