Optimal Bounded Control for Stationary Response of Strongly Nonlinear Oscillators under Combined Harmonic and Wide-Band Noise Excitations

We study stochastic optimal bounded control for minimizing the stationary response of strongly nonlinear oscillators under combined harmonic and wide-band noise excitations. The stochastic averaging method and the dynamical programming principle are combined to obtain the fully averaged Itô stochastic differential equations, which describe the original controlled strongly nonlinear system approximately. The stationary joint probability density of the amplitude and phase difference of the optimally controlled system is obtained by solving the corresponding reduced Fokker-Planck-Kolmogorov (FPK) equation. An example is given to illustrate the proposed procedure, and the theoretical results are verified by Monte Carlo simulation.


Introduction
The well-known tool for solving stochastic optimal control problems is the dynamical programming principle, which was proposed by Bellman [1]. According to this principle, a stochastic optimal control problem may be transformed into the problem of finding a solution to the so-called Hamilton-Jacobi-Bellman (HJB) equation. However, in most cases the HJB equation cannot be solved analytically [2], especially in the multidimensional case. A powerful technique for solving stochastic optimal control problems is to combine the so-called stochastic averaging method [3] with Bellman's dynamic programming principle. The idea is to replace the original stochastic system by the averaged one and then to apply the dynamic programming principle to the averaged system. It has been justified that the optimal control for the averaged system is nearly optimal for the original system [4]. This combination strategy has two notable advantages. Firstly, the dimension of the averaged system can be reduced remarkably, so the corresponding HJB equation is low-dimensional. Secondly, the diffusion matrix of the original stochastic system is usually singular, while that of the averaged system is usually nonsingular. This characteristic enables the HJB equation for the averaged system to have a classical rather than a viscosity solution. In recent years, a notable nonlinear stochastic optimal control strategy for quasi-Hamiltonian systems under random excitations has been proposed by Zhu, based on the stochastic averaging method for quasi-Hamiltonian systems and the stochastic dynamical programming principle [3]. Examples have shown that this strategy is very effective and efficient.
In practice, the excitations of dynamical systems can be classified as either deterministic or random. Random excitation is often modeled as Gaussian white noise, wide-band colored noise, or bounded noise. On the other hand, the magnitudes of control forces are usually limited due to saturation in actuators; that is, the control forces are bounded. The optimal bounded control of linear or nonlinear systems under random excitations has been studied by many researchers [5-8]. In all these studies, the systems are subject to random excitations alone. However, many physical or mechanical systems are subjected to both random and deterministic harmonic excitations. For example, such combined excitations arise in the study of stochastic resonance [9] or of the uncoupled flapping motion of the rotor blades of a helicopter in forward flight under the effect of atmospheric turbulence [10]. Due to the presence of the deterministic harmonic excitation, the response of the stochastic system is not time-homogeneous, and its stationary behavior is not easy to capture. The stochastic averaging method can approximate the original system by time-homogeneous stochastic processes. Using this method, many researchers have studied linear or strongly nonlinear systems under combined harmonic and wide-band random excitations [11-16]. On the other hand, the optimal bounded control of linear or nonlinear oscillators subject to combined harmonic and Gaussian white noise excitations has been studied [17-19]. However, Gaussian white noise is an idealized model; in most cases, the random excitation should be modeled as wide-band colored noise. So far, little work has been done on the optimal bounded control of a strongly nonlinear oscillator subject to combined harmonic and wide-band noise excitations [20].
In the present paper, a procedure is proposed for designing optimal bounded control to minimize the stationary response of a strongly nonlinear oscillator under combined harmonic and wide-band colored noise excitations. In Section 2, based on the stochastic averaging method, the equation of motion of a weakly controlled, strongly nonlinear oscillator under combined harmonic and wide-band noise excitations is reduced to partially averaged Itô stochastic differential equations. In Section 3, a dynamical programming equation for the control problem of minimizing the response of the system is formulated from the partially averaged Itô stochastic differential equations by applying the dynamical programming principle. The optimal control law is determined by the dynamical programming equation and the control constraint. In Section 4, the reduced FPK equation governing the stationary joint probability density of the amplitude and phase difference of the optimally controlled system is established. In Section 5, a nonlinearly damped Duffing oscillator under combined harmonic and wide-band noise excitations is taken as an example to illustrate the application of the proposed procedure. All theoretical results are verified by Monte Carlo simulation.

Partially Averaged Itô Stochastic Differential Equations
Consider a weakly controlled, strongly nonlinear oscillator subject to weak linear or nonlinear damping and weak external or parametric excitation by a combination of harmonic functions and wide-band colored noises. The equation of motion of the system has the following form, where the stiffness term g(X) is an arbitrary nonlinear function of X, ε is a small parameter, and εf(X, Ẋ, Ωt) accounts for light damping and weak harmonic excitation with frequency Ω; A, Φ, μ, Θ, and ν are all stochastic processes. P(x) is the potential energy of the system (2.1). The functions cos Φ(t) and sin Φ(t) are called generalized harmonic functions. Obviously, ν(A, Φ) is the instantaneous frequency of the system (2.1). When ε = 0, (2.1) degenerates to the following nonlinear conservative oscillator: Ẍ + g(X) = 0.

2.6
The average frequency of the conservative oscillator (2.6) can be obtained by the following formula:

2.7
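As a numerical illustration of this step, the average frequency of the conservative oscillator (2.6) can be computed from the standard quarter-period integral. This is a minimal sketch, assuming a symmetric potential P with P(0) = 0; the Duffing-type stiffness used below is an illustrative choice, not the paper's specific system:

```python
import numpy as np
from scipy.integrate import quad

# Sketch of the average frequency of the conservative oscillator (2.6).
# For a symmetric potential P with P(0) = 0, the quarter period at
# amplitude A is T(A)/4 = int_0^A dx / sqrt(2*(P(A) - P(x))), and the
# average frequency is omega(A) = 2*pi/T(A) -- a standard result; the
# stiffness below is an illustrative assumption.

def omega_avg(A, P):
    """Average frequency at amplitude A for the oscillator X'' + g(X) = 0."""
    quarter, _ = quad(lambda x: 1.0 / np.sqrt(2.0 * (P(A) - P(x))), 0.0, A)
    return 2.0 * np.pi / (4.0 * quarter)

# g(x) = x + x**3  ->  P(x) = x**2/2 + x**4/4 (hardening Duffing)
P = lambda x: 0.5 * x**2 + 0.25 * x**4
print(omega_avg(1e-6, P))   # ~1.0: small amplitudes recover the linear frequency
print(omega_avg(1.0, P))    # hardening stiffness: larger than 1.0
```

The integrable endpoint singularity at x = A is handled by the adaptive quadrature, so no change of variable is needed for this sketch.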
Then the following approximate relation exists: Treating (2.2) as a generalized van der Pol transformation from (X, Ẋ) to (A, Θ), one can obtain the following equations for A and Θ: (2.9) where

2.10
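The amplitude-phase variables of this transformation can be recovered numerically from a given state. The sketch below assumes the generalized van der Pol form X = A cos Φ, Ẋ = -A ν(A, Φ) sin Φ, with the amplitude obtained from the energy level; the Duffing-type potential is an illustrative assumption, not the paper's specific system:

```python
import numpy as np
from scipy.optimize import brentq

# Minimal sketch of the generalized van der Pol change of variables of (2.2):
# X = A cos(Phi), Xdot = -A nu(A, Phi) sin(Phi).  The amplitude A follows
# from the energy level P(A) = xdot**2/2 + P(x), where P is the potential of
# X'' + g(X) = 0.  The potential below is an illustrative assumption.

P = lambda x: 0.5 * x**2 + 0.25 * x**4      # potential energy, P(0) = 0

def amplitude(x, xdot, a_max=100.0):
    """Solve P(A) = xdot**2/2 + P(x) for the amplitude A >= |x|."""
    energy = 0.5 * xdot**2 + P(x)
    return brentq(lambda a: P(a) - energy, 0.0, a_max)

def phase(x, xdot, a):
    """Phase Phi with X = A cos(Phi); the branch is chosen so that
    Xdot = -A*nu*sin(Phi) has the correct sign."""
    phi = np.arccos(np.clip(x / a, -1.0, 1.0))
    return phi if xdot <= 0.0 else 2.0 * np.pi - phi

a = amplitude(0.6, 0.8)
print(a, phase(0.6, 0.8, a))   # amplitude and phase of the state (0.6, 0.8)
```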
According to the Stratonovich-Khasminskii limit theorem [22, 23], A and Θ converge weakly, as ε → 0 and over a time interval of order ε⁻¹, to a two-dimensional diffusive Markov process, which can be represented by the following Itô stochastic differential equations, where the B_k(t) are independent unit Wiener processes:

2.12
System (2.1) has a harmonic excitation, so two cases can be distinguished: the resonant case and the nonresonant case. In the nonresonant case, the harmonic excitation has no effect on the first approximation of the response. Thus, we are interested in the resonant case, namely, where m and n are relatively prime small positive integers and εσ is a small detuning parameter. In this case, multiplying (2.13) by t and utilizing the approximate relation (2.8) yields the relation below. Introduce a new angle variable Γ, which is a measure of the phase difference between the response and the harmonic excitation. Then (2.14) can be rewritten as

2.16
Using the Itô differential formula, one can obtain the following Itô stochastic differential equations for A, Γ, and Φ:

2.17
Obviously, A and Γ are slowly varying processes, while Φ is a rapidly varying process. Averaging the drift and diffusion coefficients in (2.17) with respect to Φ yields the following partially averaged Itô stochastic differential equations:

2.19
Herein, ⟨·⟩_Φ denotes averaging with respect to Φ from 0 to 2π, and the b_ij are diffusion coefficients.
Note that two averaging procedures are used in this section: stochastic averaging and deterministic time averaging. The procedure for obtaining S_i and b_ij in (2.12) is stochastic averaging; the procedure for obtaining m_i^(1) and b_ij is deterministic time averaging. To complete the averaging and obtain the explicit expressions of S_i and b_ij, the functions F_i^(1) and U_ik in (2.12) are expanded into Fourier series with respect to Φ, and the approximate relation (2.8) is used.
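The deterministic time-averaging step ⟨·⟩_Φ can be checked numerically: a coefficient that is 2π-periodic in Φ is replaced by its average over one period. The coefficient below is a made-up illustration, not one of the paper's drift coefficients:

```python
import numpy as np

# Numerical sketch of the deterministic time-averaging step <.>_Phi:
# a coefficient m(A, Phi), 2*pi-periodic in Phi, is replaced by its
# average over one period.  The coefficient below is illustrative only.

def phi_average(m, A, n=2048):
    """Average m(A, Phi) over Phi in [0, 2*pi) on a uniform grid."""
    phis = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
    return np.mean(m(A, phis))   # uniform mean is exact for trig polynomials

# <sin**2>_Phi = 1/2 and <cos>_Phi = 0, so this averages to -A/2 = -1.0:
m = lambda A, phi: -A * np.sin(phi)**2 + 0.3 * A * np.cos(phi)
print(phi_average(m, 2.0))   # -1.0 (to numerical precision)
```

In the paper this averaging is done analytically via the Fourier expansion mentioned above; the uniform-grid mean is its direct numerical counterpart.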

Dynamical Programming Equation and Optimal Control Law
For mechanical or structural systems, A(t) is the response amplitude and Γ(t) is the phase difference between the response and the harmonic excitation. Together they represent the averaged state of system (2.1). Usually, only the response amplitude is of concern, so it is meaningful to control A(t) over a semi-infinite or finite time interval. Consider the ergodic control problem of system (2.18) over a semi-infinite time interval with the following performance index: Herein, f is a cost function. Based on the stochastic dynamical programming principle [24], the following simplified dynamical programming equation can be established from the first equation of (2.18): where V(A) is called the value function, min_{u∈U}[·] denotes the minimum of [·] with respect to u, η is the optimal performance index, and u ∈ U indicates the control constraint.

Mathematical Problems in Engineering 7
The optimal control law can be determined by minimizing the right-hand side of (3.2) with respect to u under the control constraint. Suppose that the control constraint is of the following form, where u_0 is a positive constant representing the maximum control force. The control u enters the performance index (3.2) only through the term in (3.4). This expression attains its minimum at u = ±u_0, depending on the sign of the coefficient of u in (3.4). So the optimal control is as follows, where sgn denotes the sign function.
It is reasonable to assume that the cost function f is a monotonically increasing function of A(t), because the controller must do more work to suppress larger amplitudes A(t). Then dV(A)/dA is positive. Under the specified conditions [21], g A B and 1 h are both positive. Equation (3.5) then reduces to (3.6), which implies that the optimal control is a bang-bang control: u* has the constant magnitude u_0, acts in the direction opposite to Ẋ(t), and switches direction at Ẋ(t) = 0. For the optimal bounded control of system (2.18) over a finite time interval, the following performance index is taken, where t_f is the final control time, χ is the final cost function, and E[·] denotes the expectation operator. The following simplified dynamical programming equation can be established from the first equation of (2.18), where V = V(A, t) is the value function. Equation (3.8) is subject to the following final-time condition. Obviously, the same optimal control law as in (3.6) can be derived if the control constraint is of the form (3.3).
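The bang-bang law described above is simple enough to state as code. A minimal sketch of the control as a function of velocity; note that the control power u·Ẋ = -u_0|Ẋ| is never positive, so the controller always extracts energy:

```python
import numpy as np

# Sketch of the bang-bang law (3.6): the optimal bounded control has the
# constant magnitude u0, always opposes the velocity, and switches sign
# at Xdot = 0, so the control power u*Xdot = -u0*|Xdot| is never positive.

def u_star(xdot, u0):
    """Optimal bounded control as a function of the velocity."""
    return -u0 * np.sign(xdot)

print(u_star(0.7, 0.4))    # -0.4: opposes positive velocity
print(u_star(-1.2, 0.4))   # 0.4: opposes negative velocity
```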
Note that the optimal control for the averaged system (2.18) is nearly optimal for the original system (2.1). For simplicity, it is here called the optimal control for both the original and the averaged systems.

Stationary Response of Optimally Controlled System
Substituting u* from (3.6) into (2.18) to replace u and averaging F_i^(2), one obtains the following fully averaged Itô stochastic differential equations for A and Γ, where

4.2
The reduced FPK equation for the optimally controlled system is of the following form, where p = p(a, γ) is the stationary joint probability density of the amplitude A and the phase difference Γ. Since p is a periodic function of γ, it satisfies the periodic boundary condition p(a, γ) = p(a, γ + 2π).

4.4
The boundary condition at a = 0 is that p remains finite (4.5), which implies that a = 0 is a reflecting boundary. The other boundary condition is p, ∂p/∂a → 0 as a → ∞.

4.6
In addition to the boundary conditions, the stationary joint probability density p(a, γ) satisfies the normalization condition ∫₀^{2π} ∫₀^∞ p(a, γ) da dγ = 1.

4.7
Usually, the partial differential equation (4.3) can be solved only numerically.
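The paper's reduced FPK equation (4.3) is two-dimensional in (a, γ) and is solved by finite differences. As a simpler self-contained analogue, the sketch below takes a one-dimensional amplitude equation dA = m(A) dt + √b(A) dB with a reflecting boundary at a = 0, whose stationary density is the classical zero-probability-flux solution; the drift and diffusion are illustrative assumptions, not the paper's coefficients:

```python
import numpy as np
from scipy.integrate import cumulative_trapezoid, trapezoid

# Hedged one-dimensional analogue of the stationary reduced FPK problem:
# for dA = m(A) dt + sqrt(b(A)) dB with a reflecting boundary at a = 0,
# the zero-flux stationary density is
#     p(a) = (C / b(a)) * exp( int_0^a 2*m(s)/b(s) ds ).
# The drift and diffusion below are illustrative only.

def stationary_density(m, b, a_max=6.0, n=4000):
    a = np.linspace(0.0, a_max, n)
    integrand = 2.0 * m(a) / b(a)
    p = np.exp(cumulative_trapezoid(integrand, a, initial=0.0)) / b(a)
    p /= trapezoid(p, a)                 # normalization condition (4.7)
    return a, p

m = lambda a: 0.1 - 0.4 * a              # illustrative linear drift
b = lambda a: 0.1 * np.ones_like(a)      # illustrative constant diffusion
a, p = stationary_density(m, b)
print(trapezoid(p, a))                   # normalization check
print(a[np.argmax(p)])                   # most probable amplitude, ~0.25 here
```

The same zero-flux construction underlies the reflecting boundary condition (4.5); in two dimensions the periodic direction γ prevents a closed form, which is why (4.3) is solved by finite differences.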

Example
To illustrate the strategy proposed in the previous sections, take the following controlled, nonlinearly damped Duffing oscillator as an example. The Duffing oscillator is a typical model in nonlinear analysis. The equation of motion of the system is of the following form, where β₁, β₂, ω₀, α, E, and Ω are positive constants; u is the feedback control with the constraint defined by (3.3); and ξ_k(t) (k = 1, 2) are independent, stationary, and ergodic wide-band noises with zero mean and rational spectral densities (5.2). ξ_i(t) can be regarded as the output of the following first-order linear filter, where W_i(t) are Gaussian white noises in the sense of Stratonovich with intensities 2D_i. Note that the maximum value of S_i(ω) is D_i/(πω_i²). It is assumed that β₁, D_i/(πω_i²), and E are all small.
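A wide-band noise sample of this kind can be generated numerically by driving the filter with discretized white noise. The filter form assumed below, ξ' + ω_i ξ = W_i(t), has the rational spectrum S_i(ω) = D_i/(π(ω² + ω_i²)), whose peak D_i/(πω_i²) matches the value quoted above; the exact coefficients of the paper's filter (5.3) are not reproduced here, so treat them as an assumption:

```python
import numpy as np

# Sketch of generating a wide-band noise sample as the output of a
# first-order linear filter driven by Gaussian white noise, as in (5.3).
# Assumed filter: xi' + w_i*xi = W_i(t), spectrum D_i/(pi*(w**2 + w_i**2)).

def wideband_sample(w_i, D_i, dt=0.02, n_steps=200_000, seed=0):
    """Euler-Maruyama integration of xi' = -w_i*xi + W(t), intensity 2*D_i."""
    rng = np.random.default_rng(seed)
    xi = np.empty(n_steps)
    xi[0] = 0.0
    # white noise of intensity 2*D_i: increments have std sqrt(2*D_i*dt)
    dW = rng.normal(0.0, np.sqrt(2.0 * D_i * dt), n_steps - 1)
    for k in range(n_steps - 1):
        xi[k + 1] = xi[k] - w_i * xi[k] * dt + dW[k]
    return xi

xi = wideband_sample(w_i=1.0, D_i=0.1)
print(xi.var())   # close to the stationary variance D_i/w_i = 0.1
```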
For the system (5.1), the instantaneous frequency defined by (2.4) has the following form:

5.4
ν(A, Φ) can be approximated by the following finite sum, with a relative error of less than 0.03%, where

5.6
Then the averaged frequency ω(A) of the system (5.1) can be approximated by b₀(A). In the case of primary external resonance, introduce the new angle variable defined by (2.15) and complete the procedures of Sections 2-4; one then obtains the following fully averaged Itô stochastic differential equations for A and Γ, where m_i and b_ii (i = 1, 2) are the drift and diffusion coefficients, respectively, given in the appendix. The stationary joint probability density p(a, γ) of the optimally controlled system (5.1) is governed by the following reduced FPK equation. Solving the FPK equation (5.9) by the finite difference method under the conditions (4.4)-(4.7), the stationary joint probability density of the amplitude and the phase difference of the optimally controlled system (5.1) can be obtained. Furthermore, the stationary mean amplitude of the optimally controlled system (5.1) can be obtained as follows. To check the accuracy of the proposed method, Monte Carlo digital simulation of the original system (5.1) was performed. The sample functions of the independent wide-band noises ξ_i(t) were generated by inputting Gaussian white noises to the linear filter (5.3). The response of system (5.1) was obtained numerically by using the fourth-order Runge-Kutta method with time step 0.02. The long-time solution after 1,500,000 steps was regarded as the stationary ergodic response. 100 samples were used; for every sample, the amplitude A and angle variable Γ were calculated from step 1,500,001 to step 2,000,000 to obtain the statistical probability density p(a, γ). Figures 1 and 2 show a typical sample function of the wide-band noise ξ₁(t) and of the control force u, respectively. Figure 3 shows p(a, γ) of the optimally controlled system (5.1). It is seen that the theoretical result agrees very well with that from digital simulation. Figures 4(a) and 4(b) show sample functions of the displacement and velocity of system (5.1), respectively. It is obvious that the bounded control can reduce the displacement and velocity. Also,
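The Monte Carlo procedure just described can be sketched compactly. The simulation below uses fourth-order Runge-Kutta with step 0.02, holds the filtered noise constant within each step, and applies the bang-bang control of (3.6); the equation of motion written in the comment is an assumed form of (5.1) with external noise only, and all parameter values are illustrative, not the paper's:

```python
import numpy as np

# Hedged Monte Carlo sketch: RK4 (step 0.02), filtered noise frozen within
# each step, bang-bang control u = -u0*sign(xdot).  Assumed equation:
#     x'' + (b1 + b2*x'**2)*x' + w0**2*x + alpha*x**3
#         = E*cos(W*t) + xi(t) + u.

def simulate(T=200.0, dt=0.02, b1=0.05, b2=0.05, w0=1.0, alpha=1.0,
             E=0.3, W=1.0, w_f=1.0, D=0.1, u0=0.4, seed=1):
    rng = np.random.default_rng(seed)
    n = int(T / dt)
    x, xd, xi = 0.0, 0.0, 0.0
    xs = np.empty(n)
    for k in range(n):
        t = k * dt
        u = -u0 * np.sign(xd)                      # bang-bang control (3.6)
        def f(state, tau):
            x_, v_ = state
            acc = (-(b1 + b2 * v_**2) * v_ - w0**2 * x_ - alpha * x_**3
                   + E * np.cos(W * tau) + xi + u)
            return np.array([v_, acc])
        s = np.array([x, xd])
        k1 = f(s, t)
        k2 = f(s + 0.5 * dt * k1, t + 0.5 * dt)
        k3 = f(s + 0.5 * dt * k2, t + 0.5 * dt)
        k4 = f(s + dt * k3, t + dt)
        x, xd = s + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)
        # advance the first-order noise filter xi' + w_f*xi = W(t) one step
        xi += -w_f * xi * dt + rng.normal(0.0, np.sqrt(2.0 * D * dt))
        xs[k] = x
    return xs

xs = simulate()
print(np.std(xs[len(xs) // 2:]))   # spread of the controlled displacement
```

Freezing the noise within an RK4 step is the common pragmatic choice for wide-band (not white) excitation; for truly white noise a stochastic integrator would be needed instead.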
the amplitude of the original system (5.1) is reduced by the control, as verified by Figure 5. Note that system (5.1) is strongly nonlinear. Increasing the nonlinearity coefficients β₁ or α in (5.1), one sees that the agreement between the theoretical results and Monte Carlo digital simulation remains acceptable (see Figures 6 and 7). This demonstrates that the proposed method can handle strongly nonlinear problems even when the nonlinearity is very strong.

Conclusions
In the present paper, a procedure combining the stochastic averaging method and Bellman's dynamic programming principle has been proposed for designing the optimal bounded control that minimizes the response of strongly nonlinear systems under combined harmonic and wide-band noise excitations. The procedure consists of applying the stochastic averaging method to weakly controlled, strongly nonlinear systems under combined harmonic and wide-band noise excitations; establishing the dynamical programming equation for the control problem of minimizing the response, based on the partially averaged Itô stochastic differential equations and the dynamical programming principle; and determining the optimal control from the dynamical programming equation and the control constraint, without solving the dynamical programming equation. The stationary joint probability density and mean amplitude of the optimally controlled averaged system are then obtained by solving the reduced FPK equation associated with the fully averaged Itô stochastic differential equations. A nonlinearly damped Duffing oscillator with hardening stiffness has been taken as an example to illustrate the application of the proposed procedure. The comparison between the theoretical results and those from Monte Carlo simulation shows that the proposed procedure works quite well even when the nonlinearity is very strong. The results show that the response amplitude of the system can be reduced remarkably by the feedback control.
The advantages of the proposed method are obvious. Note that, in the example, the wide-band noise is generated by a first-order linear filter. In principle, one could apply the stochastic dynamical programming method to the extended system (X, Ẋ, ξ₁, ξ₂)ᵀ to study the optimal control problem. However, the corresponding HJB equation is then five- or four-dimensional, and the corresponding reduced FPK equation is four-dimensional, which is very difficult to solve. After stochastic averaging, the original system is represented by a two-dimensional, time-homogeneous diffusion Markov process of amplitude and phase difference with a nondegenerate diffusion matrix. The dynamical programming equation derived from the averaged equations is two- or one-dimensional, and the corresponding reduced FPK equation is two-dimensional, which is easy to solve. Another advantage of the proposed procedure is that it is not necessary to solve the dynamical programming equation in order to obtain the optimal control law. Furthermore, the proposed method can be extended easily to multi-degree-of-freedom (MDOF) systems; this will be our future work.

Appendix
The drift coefficients in (5.8) are as follows:

Figure 4: Sample functions of the displacement and velocity of the system (5.1), u₀ = 0.4. Other parameters are the same as those in Figure 3. Red line: with control; blue line: without control.