An Improved Global Harmony Search Algorithm for the Identification of Nonlinear Discrete-Time Systems Based on Volterra Filter Modeling

This paper describes an improved global harmony search (IGHS) algorithm for identifying the nonlinear discrete-time systems based on second-order Volterra model. The IGHS is an improved version of the novel global harmony search (NGHS) algorithm, and it makes two significant improvements on the NGHS. First, the genetic mutation operation is modified by combining normal distribution andCauchy distribution, which enables the IGHS to fully explore and exploit the solution space. Second, an oppositionbased learning (OBL) is introduced and modified to improve the quality of harmony vectors. The IGHS algorithm is implemented on two numerical examples, and they are nonlinear discrete-time rational system and the real heat exchanger, respectively. The results of the IGHS are compared with those of the other three methods, and it has been verified to be more effective than the other three methods on solving the above two problems with different input signals and system memory sizes.


Introduction
The Volterra model is a kind of nonlinear filter model, which is usually employed to track and identify plenty of complex nonlinear systems.In order to enhance the quality of the system identification, it is very crucial issue to select optimum model coefficient called the kernel.Therefore, the Volterra model is essentially an extension of linear filter model to nonlinear case.During the past decade, there have been many research works on the Volterra model.Campello et al. [1] tackled the problem of expanding Volterra models using Laguerre functions.The global optimal solution is obtained when each multidimensional kernel of the model is decomposed into a number of independent orthonormal bases.Furthermore, the solution obtained is able to minimize the upper bound of the squared norm of the error resulting from the practical truncation of the Laguerre series expansion into a finite number of functions.Masugi and Takuma [2] described a Volterra system-based nonlinear study of videopacket transmission over IP networks.Based on the Volterra system, the authors performed a time-series analysis of measured data for network response evaluation.The novel method can reproduce the time-series responses observed in video-packet transmission over the Internet, characterizing nonlinear dynamic behaviors such that the obtained results gave an appropriate depiction of network conditions at different times.Gruber et al. [3] presented a nonlinear model predictive control (NMPC) method based on a secondorder Volterra series model for greenhouse temperature control using natural ventilation.These models, denoting the simple and logical extension of convolution models, are capable of describing the nonlinear dynamic feature of the ventilation and other environmental conditions on the greenhouse temperature.Many applications of Volterra series modeling were executed in the frequency-domain based on the Generalized Frequency Response Functions (GFRF) [4,5].In the light of the wide applications of Volterra series, Li and Billings [6] presented an approach to estimate the GFRF in a piecewise manner for duffing type oscillators for the underrepresented weakly nonlinear region.Additionally, 2 Mathematical Problems in Engineering they devised a new method to obtain the energy contributions of each order kernel.The new nonparametric method can not only construct the amplitude-invariant GFRF over a certain excitation range, but also avoid building large sets of timedomain models.In addition, the Volterra filter model can be found in some other application areas [7][8][9][10][11][12][13][14].
Chang [7] devised an improved particle swarm optimization (IPSO) to implement system identification based on Volterra filter model.In this paper, we develop an improved global harmony search (IGHS) algorithm and try the IGHS as an efficient candidate for system identification based on Volterra filter model.The harmony search (HS) algorithm was firstly proposed by [15].The HS is a simple but efficient algorithm, and its many improved versions have been applied into many problems including reliability problems [16], reactor core fuel management optimization [17], and sizing optimization of truss structures [18].
The paper is organized as follows.In Section 2, the pruned second-order Volterra filter model is simply presented.In Section 3, a novel global harmony search algorithm is introduced.In Section 4, an improved global harmony search algorithm is proposed, and its procedure is fully explained.In Section 5, four harmony search algorithms are used for two examples with different signal inputs and memory sizes.We end this paper with some conclusions in Section 6.

Volterra Filter Model and Its Pruned Second-Order Form
The Volterra filter model is an efficient method for the identification of nonlinear discrete systems, and it has come into researchers' notice in recent decades.The discrete form of Volterra filter model of the th order [7] is given by where  denotes the system memory size.Equation (1) denotes the Volterra filter model with the infinite series.However, this model is hard to compute and master due to its complex and expatiatory formula.In this paper, we only study its simplified and approximate form called the truncated second-order Volterra model [7,19], which is stated as follows: In order to facilitate expression, (2) can also be expressed as the following vector form: Here, the superscript  represents the transpose of a vector and  stands for the Volterra kernel vector given by In addition,  denotes the Volterra input vector given by In the light of (3), the vector lengths of both  and  are the same and are calculated as follows [19]: To achieve the nearest approximation of the actual system output, appropriate kernel vector  should be determined under the input vector .In this paper, an improved global harmony search (IGHS) algorithm is proposed to determine kernel vector .The IGHS is an improved version of novel global harmony search (NGHS) algorithm [20]; thus, both the NGHS and the IGHS will be presented in the following sections.

Novel Global Harmony Search (NGHS) Algorithm
Novel global harmony search (NGHS) algorithm [20] is a variant of harmony search (HS) algorithm [15], and it is superior to the HS for solving unconstrained optimization problems.The NGHS improvises new harmony vectors by combining position updating and mutation.Concretely, the steps of NGHS are explained as follows.
Step 3 (improvise a new harmony).Improvisation is actually the operation of producing a new harmony vector.For the NGHS, its improvisation mainly includes two steps, and they are position updating and genetic mutation, respectively.More specifically, the improvisation can be presented in Table 1.
Here, "best" and "worst" stand for the indexes of the best harmony and the worst harmony in HM, respectively. 1 ,  2 , and  3 denote three uniformly generated random numbers in [0, 1].With respect to position updating, a new harmony vector is improvised near the best harmony vector, which can facilitate the convergence rate of the NGHS.On the other hand, it is worth noticing that genetic mutation is an event of small probability, and it is utilized to avoid the premature convergence of the NGHS.
Step 4 (update harmony memory).Replace the worst harmony vector  worst of HM with the new improvised harmony vector   no matter whether   is better or worse than  worst .
Step 5 (check the stopping criterion).Steps 3 and 4 are repeated until the number of improvisations (NI) is reached.

An Improved Global Harmony Search (IGHS) Algorithm
In this paper, an improved global harmony search (IGHS) algorithm is proposed to implement the system identification based on Volterra filter model.The IGHS is an improved EndIf (12) EndFor version of the NGHS, and it is different from the NGHS in the following two aspects.
(1) The Modification of Genetic Mutation.The genetic mutation of the NGHS is conducted according to uniform distribution.To further explore and exploit the solution space, the uniform distribution is replaced with the other two probability distributions, and they are normal distribution [21][22][23] and Cauchy distribution [22,23], respectively.Therefore, the new genetic mutation are implemented by using the following two equations: Here,  denotes the index of the th component of the improvised harmony vector, and  best  denotes the th component of the best harmony vector in HM. rand   ( best  , 0.1) stands for the normal distribution with mean  best  and standard deviation 0.1, and rand   ( best  , 0.1) stands for the Cauchy distribution with location parameter  best  and scale parameter 0.1.If overflow happens to    generated using either normal distribution or Cauchy distribution, it will be truncated to [  ,   ].On the other hand, if the condition  2 <   (as in Table 1) is satisfied, either normal distribution or Cauchy distribution will be utilized to carry out genetic mutation, and the probability of using any one of the two distributions is equal to 0.5.By adopting normal distribution and Cauchy distribution, the capacity of escaping from the local optimums is enhanced for the IGHS.In the meantime, the solution space can be fully explored and exploited, which is beneficial for improving the quality of the improvised harmony vector   .
In addition to the equation of genetic mutation, we also modify the mutation rate (  ) of the NGHS.The parameter   is used to determine whether or not a variable of   adopts genetic mutation.A large   value is beneficial for searching in a large scope, while a small   value is helpful to search Mathematical Problems in Engineering in a small scope.To well balance the global search and local search of the IGHS, the   value decreases dynamically with increasing generations (NI) as follows: where   and   represent the minimal and maximal mutation rates. (1 ≤  ≤ NI) denotes the current generation number, and the IGHS is stopped when the current generation number  reaches the maximal generation number NI.In addition, NI 0 denotes a fixed generation number, and it is set to 3NI/4 here.
(2) Introduce and Modify an Opposition-Based Learning (OBL) [24].In order to improve the convergence of the IGHS, a method called opposition-based learning (OBL) [24] is firstly introduced.In short, this technique can be stated as follows: Here,   is the opposite solution of   , and    is the th variable of   .Furthermore,    and    represent the minimal and the maximal values in the set { 1   ,  2  , . . .,  HMS  }, and they are expressed as follows: According to (10) where rand() denotes a random number uniformly generated in [0, 1].By using (12), the IGHS can yield numerous possible values in the range of [   ,    ].Compared to the OBL technique, its modified version has wider searching space, which is in favor of fine-tuning of the best harmony vectors.
Based on ( 10)- (11), a new harmony memory HM  can be generated, and it is given by Here,   = (  1 ,   2 , . . .,    ) stands for the th ( = 1, . . ., HMS) harmony vector in HM  , and    denotes its th ( = 1, . . ., ) variable.After performing the modified OBL operation, each   is compared with the corresponding harmony vector   .More precisely, if   is better than   ,   should be replaced with   .It is worth noticing that the modified OBL is an event of small probability.In other words, this technique is an auxiliary step of the IGHS improvisation, and it is mainly used to improve the quality of the improvised harmony vector.
Based on the above detailed illustrations about the two modifications of the NGHS improvisation, the IGHS improvisation can be summarized in Table 2.
By utilizing the IGHS, the identification diagram of nonlinear discrete-time systems based on the pruned secondorder Volterra model is shown in Figure 1.Besides, the nomenclatures appearing in this diagram are explained in Table 3. Table 3: Nomenclatures used in Volterra filter modeling of nonlinear discrete-time system.

𝑦[𝑛]
The output signal of the second-order Volterra filter model (as (2))

𝑑[𝑛]
The output of the unknown nonlinear discrete-time system [] The measurement noise ŷ[] The sum of [] and []  [𝑛] The error signal between ŷ[] and  [𝑛] The goal of the IGHS is to find the optimal kernel vector  for Volterra filter model so that the difference between the estimated output [] and the actual system output ŷ[] is minimized.Thus, it is advisable to find an objective function to satisfy the design requirement.Here we adopt the following objective function [7]: Here, [ 2 []] stands for the mean square error (MSE), and  denotes the sampling number.By using the IGHS to optimize MSE, we can obtain the most appropriate kernel vector for the second-order Volterra model.

Experimental Results and Analysis
To verify the validity of the IGHS on identifying nonlinear system based on second-order Volterra filter model, two examples including the highly nonlinear discrete-time rational system and the real heat exchanger are considered.More specifically, these two examples are explained as follows.

Example 1.
The first example is the highly nonlinear discrete-time rational system [7], and its mathematical model is stated as follows: Here, two types of input signals [7] are considered.The first is defined as Example 1a whose input is a random signal uniformly produced in [−1, 1].The second is defined as Example 1b whose input is a cosine signal [] = 0.8cos((/9)).For both Examples 1a and 1b, the measurement noise [] is always supposed to be a Gaussian noise of (0, 0.001).
In this experiment, the IGHS is used for Example 1a with  = 5.Moreover, the IGHS parameters are set as follows: the maximal mutation rate   = 0.1, the minimal mutation rate   = 0.1, harmony memory size HMS = 5, the number of improvisations NI = 2000, and sampling number  = 100.Moreover,  is actually the total number of output samples (actual system output or estimated output).According to (14), there are  sampling values for actual system output ŷ[] (1 ≤  ≤ ).After implementing the IGHS, we can obtain the comparison between the estimated output [] and the actual output ŷ[] in Figure 2.
From Figure 2, it can be seen that the estimated output [] approximates the actual output ŷ[] well.In addition, the minimal mean square error (MSE) obtained by the IGHS is equal to 5.4594−003, which is a satisfactory result.Besides, we use Example 1a with  = 8 to investigate the performance of the IGHS for second-order Volterra filter model with large memory size.The parameters of the IGHS are the same as those used for Example 1a ( = 5) except NI, and it is set to 5000 in this experiment.By performing the IGHS, we can obtain the comparison between the estimated output [] and the actual output ŷ[] in Figure 3.
It is clear from Figure 3 that a satisfactory approximation can be attained by utilizing the IGHS.Additionally, the minimal MSE yielded by the IGHS is equal to 3.7074 − 003 for Example 1a with  = 8, which provides better modeling capacity.
In addition to random signal, another testing input signal [] = 0.8cos((/9)) is used to investigate the performance of second-order Volterra filter model using the IGHS.For Example 1b with  = 5 and  = 8, the IGHS parameters are the same as those for Example 1a, and Figures 4 and  5 display the comparisons of results for Example 1b with  = 5 and  = 8.As expected, the difference between the estimated output [] and the actual output ŷ[] is small in each case.Additionally, two satisfying MSEs can be obtained, respectively, for  = 5 and  = 8 by carrying out the IGHS, and they are equal to 1.71809876 − 003 and 1.7119 − 003, respectively.

Example 2.
The second example is the real heat exchanger and its mathematical model [7,25] is given by where [] is the process input denoting the flow rate and is constrained by the range of [0, 1], [] stands for the static nonlinearity, and [] stands for the process output temperature.By combining ( 16) and (17), the simplified mathematical model can be expressed as the following difference equation: Here, two types of input signals [7,25] are considered.The first is defined as Example 2a whose input is randomly generated in the range [0.1, 0.9].The second is defined as Example 2b whose input is a sine signal [] = 0.4sin((/6)) + 0.5.According to [7,25], the measurement noise is excluded from the problem model; thus, we have [] = 0.
In this experiment, the memory size  is set to 8 for second-order Volterra filter model.To fulfill the physical input requirement, the modeling input [] is randomly generated in the range [0.1, 0.9] and the testing input is then set to [] = 0.4sin((/6)) + 0.5.Moreover, the IGHS parameters used for Example 2 are the same as those of Example 1.By implementing the IGHS, the outputs of second-order Volterra model are shown in Figures 6 and 7, respectively, for the modeling input and testing input.Based on the careful observations on experimental results, we can confirm that the satisfactory approximation is attained in each case.

Comparison of the IGHS with the Other Three HSs.
In this paper, four methods consisting of the HS [15], IHS [26], NGHS [20], and IGHS are used to solve the above problems with different input signals and system memory sizes.The parameters of these four methods are set as follows.For the HS, harmony memory considering rate HMCR = 0.95, pitch adjusting rate PAR = 0.3, and bandwidth bw = 0.01.For the IHS, HMCR = 0.95, the minimal pitch adjusting rate PAR min = 0.35, the maximal pitch adjusting rate PAR max = 0.99, the minimal bandwidth bw min = 10 −5 , and the maximal bandwidth bw max = 0.1.For the NGHS, mutation rate   = 5 × 10 −2 .For the IGHS, the minimal mutation rate   = 0.01, the maximal mutation rate   = 0.1, and NI 0 = 3NI/4.Note that all the methods are executed under identical conditions (harmony memory size and the number of improvisations).More specifically, the harmony memory size (HMS) of each method is set to 5, and the NI (Number of improvisations) value of each method is set to 2000 and 5000, respectively, for the system memory sizes  = 5 and  = 8.With respect to second-order Volterra filter model, sampling number is set to  = 100, and the range of each variable of the kernel vector is set as [−1, 1].Matlab 7.0 is used to execute the above procedure under the environment of Intel(R) Core(TM) i5-2410M CPU @ 2.30 GHz.20 independent runs are performed in each case, and the optimization results are presented in Table 4.
Table 4 gives the comparison of the results obtained by the IGHS, against the other three methods including the HS [15], the IHS [26], and the NGHS [20], and the best performance is reported in boldface.The terms "ACT" and "Std" stand for average computation time and standard deviation, respectively.According to the results, we can see that the IGHS performs better most of the time.To be precise, the values of Worst, Mean, and Std obtained by the IGHS are smaller than those obtained by the other three methods for all problems.Moreover, the IGHS can obtain five best objective function values except Example 1b ( = 8).Therefore, the IGHS has exhibited stronger convergence and stability than the other three HSs.For Example 1b ( = 8), the IHS, NGHS, and IGHS can yield the same best objective function value which is equal to 1.7119 − 003, and the remaining results obtained by the IGHS are better than the other three HSs.
Additionally, Mann-Whitney  test [27,28], also known as "Mann-Whitney Wilcoxon test," is used to ensure a statistical significant difference between the IGHS and any of the other three HSs.In other words, the Mann-Whitney  test acts as an efficient nonparametric rank-based test to identify a difference between populations.Moreover, the test statistic  is expressed as follows:  20), the sum of  1 and  2 is calculated as follows: Based on the known conditions  1 +  2 = ( + 1)/2 and  =  1 +  2 , we can easily obtain the simplified form of (21) as follows: To make the parameters  1 and  2 easy to understand, one simple and interesting example is provided as follows: a sample of 6 tortoises (sample 1) and 6 hares (sample 2) are collected and made run in a race.The order in which they reach the finishing post (their rank order, from first to last) is as follows: T H H H H H T T T T T H, where T represents a tortoise and H represents a hare.
A direct way: we take each tortoise in turn and count the number of hares it is beaten by (lower rank), getting 0, 5, 5, 5, 5, and 5, which means  1 = 0 + 5 + 5 + 5 + 5 + 5 = 25.In the meantime, we could take each hare in turn and count the number of tortoises it is beaten by.In this situation, we get 1, 1, 1, 1, 1, and 6.So  2 = 6 + 1 + 1 + 1 + 1 + 1 = 11.It is clear that the sum of these two values for  is 36, which is 6 × 6.In order to compare the IGHS with the other three HSs in a statistical way, three groups of Mann-Whitney  tests are executed, and they are ( HS ,  IGHS ), ( IHS ,  IGHS ), and ( NGHS ,  IGHS ), respectively.For each problem, 20 independent experiments are carried out; therefore,  other =  IGHS = 20 and  other +  IGHS =  other ×  IGHS = 400, where the subscript other denotes any of the other three HSs including the HS, IHS, and NGHS.The parameters of the four HSs are exactly the same as those of the aforementioned section, and Table 5 lists the results of Mann-Whitney  tests.
From Table 5, it is evident that the IGHS completely dominates the HS for solving all problems, because the values of  HS are all equal to 400.In other words, the IGHS has never been beaten by the HS.Furthermore, the values of  NGHS are significantly greater than 200, which indicates that the IGHS is also superior to the NGHS.In addition, the values of  IHS are significantly greater than 200 for all problems except Example 1b ( = 5).Thus, the IGHS performs better than the IHS on solving five of the six problems.Regarding Example 1b ( = 5), the  IHS value is equal to 177, which is smaller than but close to 200.Thus, the IHS is a little better than the IGHS for Example 1b ( = 5).
Figure 8 shows the average convergence curves of four methods over 20 runs for six problems.It is clear that the HS has the poorest performance, and its convergence rate is the slowest.In addition, both the IHS and the NGHS converge faster than the HS but slower than the IGHS.Obviously, the IGHS has the fastest convergence rate in each case.For Example 1a ( = 5) and ( = 8), it can achieve lower levels than the other three HSs.With respect to Example 2b ( = 8), the NGHS and the IGHS can converge to comparable levels, which are lower than those of the HS and the IHS.Moreover, the IHS, the NGHS, and the IGHS can reach comparable levels for the remaining cases.Strictly, the IGHS still outperforms the other three HSs for these cases according to Table 4.
In Example 1, the standard deviation of the Gaussian noise is so small ( = 0.001) as it is equivalent to considering a negligible noise, while in Example 2 the simulations are carried out without noise.To investigate the robustness of the proposed algorithm with respect to the additive noise, different standard deviations () of the Gaussian noise are considered when solving a problem of system identification.On the other hand, signal to noise ratio (SNR) is a measure used in science and engineering that compares the level of a desired signal to the level of background noise.In this paper, the amplitudes of signal and noise are measured, and thus the SNR value can be calculated by  SNR dB = 20 log 10 ( signal / noise ), where  represents root mean square (RMS) amplitude.Given a standard deviation for a problem, a corresponding SNR value can be determined.Therefore, the SNR values related to various standard deviations are reported in Table 6.
From Table 6, it is clear that the SNR value decreases as the standard deviation of the Gaussian noise increases.Therefore, a Gaussian noise with larger  value will exert more serious effects on signal than the one with smaller  value, which is harmful to the extraction of signal.In addition, a plot of MSE versus SNR is given in Figure 9.
In Moreover, the MSE values tend to get larger when SNR decreases in most cases.In this experiment, the standard deviation  varies in a wide range [0.001, 0.2], the MSE values remain low levels for most  values, and the large fluctuations of MSE do not happen until  increases to very high levels (such as  = 0.1, 0.2).Therefore, the robustness of IGHS with respect to the additive noise is acceptable to some extent.
Various standard deviations of the Gaussian noise will have different effects on the convergence of HS, IHS, NGHS, and IGHS for six problems.To testify and compare the convergence of the proposed algorithm and the other three HSs, the average MSEs associated with the standard deviation are listed in Table 7.
The best results are marked in bold in Table 7.For any problem with various standard deviations, IGHS can always obtain the smallest MSEs among four algorithms, indicating that IGHS has stronger convergence and stability than the other three HSs.Overall, IGHS has exhibited more desirable robustness to the Gaussian noise than the other three HSs for all six problems.

Conclusions
In the present paper, we propose an improved NGHS algorithm, called IGHS, for identifying nonlinear discrete-time systems based on second-order Volterra model.The prime objective of the IGHS is to attain the most appropriate kernels of Volterra model so that the estimated output can track and characterize the actual output as much as possible.Furthermore, we use two examples with different input signals and system memory sizes to test the performance of the IGHS.Experimental results suggest that the IGHS performs well and is superior to the other three methods in most cases according to four criteria ("Best," "Worst," "Mean," and "Std") associated with the mean square error (MSE).Thus, the IGHS is a strong candidate for the identification of nonlinear system based on second-order Volterra model.

Figure 1 :
Figure 1: Volterra filter modeling of nonlinear discrete-time system using the IGHS algorithm.

Figure 2 :
Figure 2: Comparison of actual system output and Volterra model output for Example 1a ( = 5).

Figure 3 :
Figure 3: Comparison of actual system output and Volterra model output for Example 1a ( = 8).

Figure 9 :
Figure 9: The mean square error (MSE) versus the signal to noise ratio (SNR).
Figures 9(a), 9(b), 9(c), and 9(d), the fluctuations of MSE are minor in the range 20 ≤ SNR dB ≤ 55.In Figure 9(f), the fluctuations of MSE are minor in the range 40 ≤ SNR dB ≤ 80.In Figure 9(e), the fluctuations of MSE are slightly larger than those of the above five examples.
In addition, the problem parameters include the number of problem variables , the lower bound   , and upper bound   of the th ( = 1, 2, . . ., ) problem variable.Furthermore, the number of improvisations (NI) is actually the total number of generations for adjusting the parameters related to the identification problem.In each generation, only one new candidate solution including  parameters is generated, and this solution is accepted if and only if it is better than the worst one of the previous solutions.
The NGHS parameters consist of harmony memory size HMS, the number of improvisations NI, and mutation rate   .

Table 1 :
The computational procedure of the NGHS improvisation.

Table 2 :
The computational procedure of the IGHS improvisation.
(19)e  1 and  2 denote the sizes of sample 1 and sample 2, respectively.1 and  2 denote the sums of the ranks in sample 1 and sample 2, respectively,  1 represents the number of sample 1 observations beaten by sample 2 observations, and  2 represents the number of sample 2 observations beaten by sample 1 observations.By combining(19)and (

Table 5 :
Mann-Whitney  test result obtained using four methods.

Table 6 :
The SNR values (in dB form) corresponding to various standard deviations.

Table 7 :
Effects of various standard deviations on the average MSEs of HS, IHS, NGHS, and IGHS.