HOC Based Blind Identification of Hydroturbine Shaft Volterra System

In order to identify the quadratic Volterra system simplified from the hydroturbine shaft system, a blind identificationmethod based on the third-order cumulants and a reversely recursivemethod are proposed.The input sequence of the system under consideration is an unobservable independent identically distributed (i.i.d.), zero-mean and non-Gaussian stationary signal, and the observed signals are the superposition of the system output signal and Gaussian noise. To calculate the third-order moment of the output signal, a computer loop judgment method is put forward to determine the coefficient. When using optimization method to identify the time domain kernels, we combined the traditional optimization algorithm (direct searchmethod) with genetic algorithm (GA) and constituted the hybrid genetic algorithm (HGA). Finally, according to the prototype observation signal and the time domain kernel parameters obtained from identification, the input signal of the system can be gained recursively. To test the proposed method, three numerical experiments and engineering application have been carried out. The results show that the method is applicable to the blind identification of the hydroturbine shaft system and has strong universality; the input signal obtained by the reversely recursive method can be approximately taken as the random excitation acted on the runner of the hydroturbine shaft system.


Introduction
There are lots of nonlinear phenomena in engineering.Knowing of the coherent characteristics is of great importance to engineering safety.Thus exploring an appropriate method to describe an input-output relation defined by the nonlinear system has been a hot research topic.At present, Volterra series is a commonly used technique in researchers, and achievements are involved with plenty of engineering fields.Maheswaran and Khosa [1,2] use a Volterra series model with wavelet functions to identify the nonlinearity of a natural stream flow and to forecast the water levels in groundwater system, respectively.In forensic investigation, a filter designed with Volterra series can well work when it is necessary to identify the latent fingerprints [3].A nonlinear phenomenon that is produced by a distorted harmonic audible signal is recognized by estimating sound pressure level with Volterra series [4].In addition, the nonlinear structure response of a large container carrier in irregular seaway is calculated with a third-order Volterra model [5].For a multiple input-multiple output (MIMO) system, a topological assemblage scheme is developed to make synthesis of Volterra system and applied to engineering, such as vibration analysis of wind-excited suspended cable [6].
The applications of Volterra series are mainly demonstrated in two aspects, that is, modeling and identification of an input-output system.Two important properties of Volterra series are (1) output of a Volterra system depends linearly on kernel parameters of this system; (2) nonlinearity of a signal can be represented throughout multidimensional operators on product of samples.Therefore, Volterra model is generally used to describe the input-output relationship of a nonlinear system in a long term.Nonlinear systems, such as system with additive white Gaussian noises [7], structure subjected to impact load [8], and turbomachinery [9], can be represented by Volterra series.Systems with multispectral properties are also reported [10,11].Peng et al. make a lot of achievements on the modeling, identification, and prediction with Volterra series [12][13][14].They discussed lots of problems, for instance, the influence on the force transmissibility of MDOF structures with cubic nonlinear damping device [12], a new method to detect the breathing fatigue crack in nonlinear beam [13] and the analytical expression of calculating the bispectrum with higher order cumulants for nonlinear systems subjected to zeromean Gaussian excitation [14].These works highly prove the advantages of Volterra series in describing a nonlinear system and the convenience for derivation of the subsequent formulae.
It is fundamental to use Volterra series method to identify a nonlinear system.Take Duffing oscillator as an example, its "jump" phenomenon can be recognized by the Volterra series method [15], which is expanded to identify a bilinear oscillator [16].To deal with identification problems more effectively, Volterra series is usually combined with an optimization method [17].However, the input signal of a nonlinear system is not always known.Thus the blind identification method without the input signal is developed.Le Caillec [18] blindly identified a second-order Wiener model driven by a Gaussian noise with corrupted output data.Ghandchi Tehrani et al. [19] compared the Volterra series method with the harmonic balance method, which proves the ability of blind identification by the Volterra series method.
Actually, when Volterra series is employed to model and blindly identify a nonlinear system, it needs to often meet the situation in which higher order cumulants are used.Antari et al. [20,21] considered the third-order and fourthorder cumulants to blindly identify a Hammerstein system.However, the expression of higher cumulants sometimes seems to be quite complicated [22].For a hydroturbine main shaft system in a hydropower station, the nonlinear inputoutput relation is very complicated.Especially for the input signals, such as the hydraulic excitation on the runner, its working mechanisms are not completely known so far.On the contrary, the corresponding output signals, such as the throw at turbine guide bearing, are always easy to be obtained.
Thus it is a mathematical problem which needs to be solved frequently in hydraulic engineering to identify the whole shaft system with the observed vibration response only.
This paper proposes a novel blind identification method based on higher order cumulants for a nonlinear system described by a hydroturbine shaft Volterra model.A computer loop judgment method is used to determine the coefficients   when the third-order moment of the input signal is calculated.It greatly simplifies formula and algorithm.Finally, the Volterra kernel parameters of the nonlinear system under consideration are identified by the hybrid genetic algorithm (HGA).

Description of System
The shaft system of a certain hydroturbine generating set can be simplified as a quadratic Volterra model, as shown in Figure 1.The input sequence of the system is {()}, representing the hydraulic excitation on the runner; the output sequence {()} denotes the dynamic displacements at turbine guide bearing; the observed signal is {()}, and Gaussian noise is {()}.
The corresponding nonlinear system is described as where  stands for the discrete time, ℎ(, ) stands for the time domain kernels of the quadratic nonlinear system, and  represents the system memory length ( + 1), and ,  are the time delay.The excitation sequence {()} of the system is an unobservable, independent identically distributed (i.i.d.), zero-mean and non-Gaussian stationary signal.It is assumed that the following equations hold: where  , stands for the  th -order moment at the origin of the input signal.It is proved that, when {()} is a zero-mean sequence, the following properties [22] hold: Letting  , ( 1 , . . .,  −1 ) and  , ( 1 , . . .,  −1 ) be cumulants of the output sequence and the observed sequence, respectively, a relationship between the higher order cumulants (HOC) and time domain kernels ℎ(, ) of the quadratic nonlinear system is established by calculation.In the present study, we try to establish a relationship between the thirdorder cumulants of the observed signal and time domain kernels ℎ(, ).Because the third-order or higher order cumulants of the Gaussian noise are zero, we obtain from (2) the following: Therefore, the third-order cumulants of the output sequence {()} need to only be calculated.For a stationary random sequence {()}, the third-order cumulants are defined as where  1 ,  2 (), and  3 ( 1 ,  2 ) denote the first-order moment, second-order moment, and third-order moment of the sequence {()}, respectively.The specific expression of  1 is stated as It implies that when ,  are not equal, the expectation value is zero; only the diagonal elements ℎ(, ) are left for the time domain kernels ℎ(, ). 2 () is as follows: where   is the coefficient, which consists of the secondorder moment or the fourth-order moment of the input signal {()}, and the value depends on the different time delay of the sequence {()}.Among the four subscripts − 1 , − 2 , − 1 , and − 2 , if the value of one subscript (time) is different from the rest of the three, the expectation is zero, and   = 0.The values of   are yielded as In order to calculate the three second-order cumulants in ( 6), we substitute , respectively, by  1 ,  2 , and  2 −  1 in ( 8).The expression of the last term  3 ( 1 ,  2 ) in ( 6) is written as in which coefficient   is composed of one or several certain order moments of the input signal sequence {()}.According to the different time delay of the sequence {()}, the different values are obtained as The judgment of the first line  6 is simple, and the judgments of the rest  4  2 ,  2 3 , and  3 2 are relatively complicated since there exist a variety of different combinations, and all combinations need to compile a programming code to loop for judgment, which is called as a method on the computer loop judgment (details in the Appendix).With this method, it is possible to calculate the second-order moment and thirdorder moment of the output sequence {()} and to have a great superiority in computing.

Hybrid Genetic Algorithm
3.1.Consistent Estimate.For a known observed signal {()}, we need to have an expectation operation.Because the number of the observed signals is limited, it is difficult to precisely calculate the expectation required.So, an approximating method is used to calculate this value by consistent estimate; namely, In order to complete the consistent estimate of the third-order cumulants in (12), we replace  with  1 ,  2 , and  2 −  1 in ( 14), respectively.

Algorithm.
The hybrid genetic algorithm (HGA) is an intelligent optimization algorithm which combines the simple genetic algorithm (SGA) with the traditional optimization methods.
As we know, for a kind of optimization algorithm, the global search ability of the genetic algorithm (GA) is very strong.However, when the search process is close to a local optimal point, its local search ability starts to become poor.That is a shortcoming of the simple genetic algorithm.On the contrary, the traditional direct search method of discrete variables is lack of global search ability, although the local search ability is very strong.Therefore, the authors make a hybrid genetic algorithm by combining the traditional direct search method with the genetic algorithm.The hybrid genetic algorithm (HGA) improves the ability of the global and local searches in calculation.
The basic steps of HGA are sentenced in the process of GA.The local search is carried out in a certain probability  ℎ (for example 5%) to find out a local optimal point as soon as possible.Then it goes back to the genetic operation for the global search to complete the global search.The flow chart of HGA is as shown in Figure 2.

Blind Identification of Time Domain Kernels.
Through the above derivation, we calculate the consistent estimates of the third-order cumulants of the observed signal {()} by using ( 12) and ( 13)- (15).On the other hand, we obtain a relationship between time-domain kernels ℎ(, ) of the nonlinear system and the third-order cumulants by using ( 6) and ( 7)- (11).Thus, letting (6) equal (12), we obtain an equation as Obviously, this is a cubic nonlinear equation with unknown ℎ(, ).By solving this equation, ℎ(, ) is obtained.It is noted that because the output signal of the system is only used in calculation, namely, no input signal {()} is needed, so, this identification method is called blind identification.
In actual application, the third-order cumulants of (+1) 2 have to be computed; that is,  3 ( 1 ,  2 ), 0 ≤  1 ,  2 ≤ .Since it is not easy to directly solve a cubic nonlinear system, the above-mentioned HGA is proposed to solve (16).The objective function (the fitness value of an individual) of the optimization problem is as follows: where h is the unknown column matrix composed of the elements in time domain kernels ℎ(, ).Due to the symmetry properties of the third-order cumulants, namely,  3 ( 1 ,  2 ) =  3 ( 2 ,  1 ), the loop variable  2 in (17) increases from  1 to .
For HGA, when the Gaussian noise is zero-mean i.i.d.signal, the following constraint condition can be used: where  denotes the admissible error of the constraint condition.Comparing (7) with (15), it is seen that the introduction of the constraint (18) can guarantee the small deviation of the diagonal elements ℎ(, ) in kernels ℎ(, ) to be identified.

Numerical Experiment
To examine the proposed method, two quadratic nonlinear Volterra systems, E1 by (19) and E2 by (20), are herein considered as In (19) and (20),  is used to replace time , to denote discrete time.
Because ( 19) is of completely squared terms, the coefficients of each term are the true solutions.Therefore, the true solutions of E1 are ℎ (, ) = [1.00−0.30 −0.85 0.66] .
In (20), the coefficient of a squared term is the true solution; however, half of the coefficient of a cross term is the true solution.
The input signal  0 () is generated by RAND command in MATLAB.Their lengths are 1024 and 8192, and each has 50 groups.Then the generated uniform white noises are scaled as independent identically distributed (i.i.d.) signal with variance 1.Assuming that the memory length of the model is  =  + 1 = 4, the simulation is under the condition of 50 Monte-Carlo runs.At the same time, the different values of the signal-to-noise ratio (SNR) defined by ( 23) are considered [20]: Only the diagonal elements exist in the time domain kernels ℎ(, ) of E1; in other words, the system is a simplified Volterra Model: Hammerstein Model.The identification results are listed in Table 1 for four cases of SNR = 0, 20, 40, ∞.At the same time, to assess the universality of the method, the time domain ℎ(, ) is identified by treating it as a full matrix with element ℎ(, ).The results are listed in Table 2.The time domain kernels ℎ(, ) of E2 are originally full matrix with the elements, and the E2's results are listed in Table 3.
It is seen from Tables 1-3 that the proposed method is suitable to well identify the quadratic nonlinear system under consideration.Table 1 shows that, when SNR = 0 dB, the relative errors of the four values of the time domain kernels ℎ(, ) are basically within 10%, and the standard deviation is between 0.12 and 0.14.With the increase of the signal-to-noise ratio, the relative errors are gradually reduced to less than 2%, and the standard deviation is to 0.06-0.11; the objective function value (sum of the square of the third-order cumulants errors) is also reduced.The identification error becomes large with the small SNR; this is due to the fact that there exists certain error between Gaussian noise signal generated by the numerical method and the real Gaussian distribution, which leads a direct effect of the identification results.
In Table 2, the same sequence to the signal data as Table 1 is used.The identified mean values of the nondiagonal elements are generally under 0.02 (the real value is 0), and the standard deviation is below 0.08.This is very important; when the blind identification method is used to identify a quadratic system, we do not know the length of the memory, or whether the time domain kernels ℎ(, ) have only the diagonal elements.By using different memory length to identify an identical system, we obtained very similar results, which show that this method has strong universality.

Blind Identification of Quadratic Volterra System of Number 4 Hydroturbine Shaft in a Certain Hydropower Station
The number 4 hydroturbine generating set shaft system in a certain hydropower station, as shown in Figure 1(a), can be simplified as a second-order Volterra system (shown in Figure 1  there is a precondition here: the input must be zero-mean, i.i.d., stationary signals, and their statistics are known.
According to the 50 groups of the zero-mean, i.i.d.stationary signals generated by MATLAB, the statistics of the input signals are chosen as follows: The memory length of the system is taken as  2 =  2 + 1 = 6, so the time domain kernels parameters (optimization variables) to be identified are The system becomes an optimization problem with 21 design variables.When using HGA, we set the population as 50, crossover probability 50%, mutation probability 5%, and local search probability 5%.The lower and upper bounds of the variables are [−100, 100], and the increment of the discrete variables is 0.01.The interval of each variable is narrowed down after some test run, and the increment of the discrete variables is ultimately decreased to 0.00001.The number of the evolution generations is one million.
After identification, the time domain kernels parameters ℎ(, ) of the Volterra system for this shaft system are obtained, as shown in Figure 4.The frequency domain kernels parameters (Generalized Frequency Response Function, GFRF) can be obtained by two-dimensional Fourier Transform, as shown in Figure 5.
So far, by use of the blind identification method proposed in this paper, the blind identification of the system was successfully carried out, based on the observed throw signals at the turbine guide bearing.

Reversely Recursive Solution to
Hydraulic Excitation 6.1.Reversely Recursive Algorithm.The hydraulic excitation acted on the hydraulic turbine runner is very important in the design and calculation of shaft system in a hydroelectric generating set but difficult to be observed.A reversely recursive algorithm is proposed in the paper, which can be used to approximately find out the input signal of the system based on the following two sets of data: (1) the observed output signal  0 () of the system; (2) the system time domain kernels ℎ(, ) obtained through blind identification.
Expanding the second-order discretized time domain Volterra system (1), the specific expression of () can be obtained.When  = 0, from the following expression: We can get where  0 (0) is the observed output signal of the system.(0) thus obtained has two roots.When  = 1,  (1) = ℎ (0, 0)  ( 1)  (1) + ℎ (0, 1)  ( 1 Because (0) has two roots, substituting, respectively, the two roots into (31), we can obtain the four roots of (1).The question now is which roots should (0) and  (1) take?The answer is obvious.First, calculate (0) and (1) through ( 26) and ( 28 Because ( − 1) has two values, we can obtain the four roots of ().In order to determine ( − 1) and (), we still use optimization method.The objective function for optimization is Finding the minimum value  min , the corresponding ( − 1) and () can be taken as the input signal.Thus through recursive calculation, we can find out (0)∼(); here  + 1 is the sampling length.The above method is the reversely recursive algorithm.

Numerical Experiment E2.
For convenience, we still use the experimental data E2 in Section 4. According to the above method, the calculated input data () can be obtained, as shown in Figure 6, where () is the calculated result and 0() is the original input signal.
To illustrate the correctness and feasibility of the calculation results, substituting in (1) the calculated results () and the original input signal 0(), respectively, we can find out the output signal () and the original output signal 0() and their differences: and draw in Figure 7.At the same time, the relative error of output signal energy is calculated by using the following formula: From Figures 6 and 7 we can see that, by the reversely recursive method proposed in this paper, we can approximately find out the input signal of the system based on the system output signal.The calculated input signal () is slightly 100 200 300 400 500 600 700 800 900 1000   different from the original input signal 0(), which is mainly caused by the quadratic nonlinearity of the system.However, the errors () between the system output signal () calculated through the input signal () and the original output signal 0() are smaller (Figure 7).Only a few signal values have errors, and most output signal values are the same as the original value.The relative error of the output signal energy by (37) is only 3.96%.Therefore, the input data () thus calculated can be approximately used as the system input signal.

Reversely Recursive Solution to Hydraulic Excitation on
Turbine.The hydraulic excitation (input signal) () on the turbine runner and its distribution can be calculated recursively using the proposed method in Section 6.1, as shown in Figure 8.
Since there is no real input signals that can be compared, we cannot tell the difference between the input signal () obtained with recursive method and the real input signals.However, we can compare the calculated output signal () with the real output signal  0 ().Figure 9 shows the calculated output signal () and the absolute errors with observed output signal  0 ().From the figure we can see that the error is not very big.Besides, the relative error of the output signal energy was calculated, whose value is 8.95%.This shows that the method is feasible; that is to say, the input signal () is acceptable and can be approximately taken as the hydraulic excitation on the turbine runner of a hydroelectric generating set shaft system.

Conclusions
A blind identification method based on the third-order cumulants is proposed for a quadratic Volterra nonlinear system.Through three numerical experiments and engineering application, the following conclusions are drawn.(1) Two different types of the memory length are used to identify the same sets of the signal data sequence of hydroturbine main shaft, and the error between the two groups of the time domain kernels ℎ(, ) is small.It shows that the proposed method is applicable to the blind identification of the hydroturbine main shaft system and has strong universality.
(2) The proposed computer loop judgment method to determine the coefficient   for calculating the thirdorder moment of the output signal greatly simplifies the formulae derivation and compiling of the programming code and is suitable for calculating the fourth-or higher order cumulants.
(3) With the gradual increase of the signal-to-noise ratio, the relative error of the identified time domain kernels decreases, the standard deviation is reduced, and the objective function value is also reduced.
(4) According to the observed signal of the shaft system of a certain hydroturbine generating set, the time domain kernels of the Volterra system can be blindly identified by the method proposed in this paper.Furthermore, according to these time domain kernels and the observed signal, the input signal of the system can be recursively calculated.The results show that the computed input signal can be approximately taken as the stochastic excitation acted on the turbine runner, and the error is not big (relative error of the output signal energy is acceptable).The proposed method is of enlightening significance for the form of hydraulic action.

Figure 2 :
Figure 2: Flow chart of hybrid GA.
Hence the true values of E2 are ℎ (, ) (b)).The observed data of the lateral throw  2 () at the turbine guide bearing in the experiment is shown in Figure3(after filtration).These observed signals are taken as the output signals (), and this engineering problem is identified according to the above proposed method.Note that, for the problem in which the input signals are unknown,

Figure 3 :
Figure 3: Transient waveshape of the lateral throw at turbine guide bearing (filtered).

Figure 4 :
Figure 4: The second-order time domain kernels of the shaft system in overspeed test.

Figure 5 :
Figure 5: The frequency domain kernels of the shaft system in overspeed test.

Figure 8 :
Figure 8: The input signals calculated by recursive method and their histogram.

Figure 9 :
Figure 9: The output signals () calculated by () and their errors.