Quasi-linear autoregressive with exogenous inputs (quasi-ARX) models have received considerable attention for their usefulness in nonlinear system identification and control. In this paper, identification methods for quasi-ARX type models are reviewed and categorized into three main groups, and a two-step learning approach is proposed as an extension of the parameter-classified methods to identify the quasi-ARX radial basis function network (RBFN) model. Firstly, a clustering method is used to extract statistical properties of the dataset for determining the parameters nonlinear to the model, which are interpreted meaningfully as interpolation parameters of a local linear model. Secondly, support vector regression is used to estimate the parameters linear to the model; meanwhile, an explicit kernel mapping is given by the nonlinear parameter identification procedure, which transforms the model from nonlinear-in-nature to linear-in-parameter. Finally, numerical and real cases are carried out to demonstrate the effectiveness and generalization ability of the proposed method.
Funding: National Natural Science Foundation of China (81320108018, 31570943); Jiangsu Province of China (2015DZXX003).
1. Introduction
Many real-world systems exhibit complex nonlinear characteristics and hence cannot be identified directly by linear methods. In the last two decades, nonlinear models such as neural networks (NNs), radial basis function networks (RBFNs), neuro-fuzzy networks (NFNs), and multiagent networks have received considerable research attention for nonlinear system identification [1–4]. However, from a user’s point of view, the conventional nonlinear black-box models have been criticized mostly for not being user-friendly: (1) they neglect some good properties of the successful linear black-box modeling, such as the linear structure and simplicity [5, 6]; (2) an easy-to-use model should help to interpret properties of the nonlinear dynamics rather than serve merely as a vehicle for adjusting the fit to the data [7]. Therefore, careful modeling is needed to obtain a model structure favorable to certain applications.
To obtain nonlinear models favorable to applications, a quasi-linear autoregressive with exogenous inputs (quasi-ARX) modeling scheme has been proposed, which consists of two parts: a macro-part and a core-part [14]. As shown in Figure 1, the macro-part is a user-friendly interface favorable to specific applications, and the core-part is used to represent the complicated coefficients of the macro-part. To this end, by using Taylor expansion or other mathematical transformation techniques, a class of ARX-like interfaces is constructed as macro-parts, in which useful properties of linear models can be introduced, while their coefficients are represented by nonlinear models such as RBFNs. In this way, a quasi-ARX predictor that is linear in the input variable u(t) can be further designed, where u(t) in the core-part is replaced by an extra variable. Thereafter, a nonlinear controller can be generated directly from the quasi-ARX predictor, similarly to the simple linear control method [15, 16]. In contrast, NN-based control methods require a more complex nonlinear controller design, which often involves two independent NNs: one used as the predictor and the other as the controller [17].
Figure 1: Quasi-ARX modeling. (a) The basic idea of quasi-ARX modeling, where the constructed model consists of a macro-part and a core-part. (b) An example: an ARX-like linear structure works as the macro-part for a specific application, and its coefficients are parameterized by a flexible RBFN model.
Actually, similar block-type models have been extensively studied and named in several forms according to their features, such as the state-dependent parameter models [18–20] and local linear models [10, 21]. Basically, the identification methods can be categorized into three schemes:
(1) Hierarchical identification scheme: the quasi-ARX model structure can be considered as an “ARX submodel + NN” when NNs are utilized in the core-part [15, 16], and a hierarchical method has been proposed to identify the ARX submodel and the NN by a dual-loop scheme. In one loop, the parameters of the ARX submodel are fixed and treated as constants while the NN is trained by a back propagation (BP) algorithm (only a small number of epochs are implemented); in the other loop, the resultant NN is fixed to estimate the parameters of the ARX submodel. The two loops are executed alternately to achieve good approximation ability for nonlinear systems.
(2) Parameter-classified identification scheme: when nonlinear basis function models are embedded in the core-part of the quasi-ARX models, all the parameters can be classified as nonlinear (e.g., the center and width parameters in the embedded RBFNs) or linear (e.g., the linear weights in the embedded RBFNs) to the model. A structured nonlinear parameter optimization method (SNPOM) has been presented in [9] to optimize both the nonlinear and the linear parameters simultaneously for an RBF-type state-dependent parameter model, and further improvements have been given in [19, 22]. On the other hand, by using heuristic prior knowledge, the authors in [14, 23] estimate the nonlinear parameters of a quasi-ARX NFN model, and the least squares algorithm is used to estimate the linear parameters. Similarly, prior knowledge has been used for the nonlinear parameters of a quasi-ARX wavelet network (WN) model, where identification can be explained in an integrated approach [24, 25].
(3) Global identification scheme: in this category, all the parameters in the quasi-ARX models are optimized regardless of the parameter features and model structure. For instance, a hybrid algorithm combining particle swarm optimization (PSO) with diversity learning and a gradient descent method has been proposed in [10] to identify the WN-type quasi-ARX model, which is often used in time series prediction. Moreover, NN [26] and support vector regression (SVR) [13] are applied, respectively, to identify all the quasi-ARX model parameters.
In this paper, specific efforts are made to extend the second identification scheme, which is based on classifying the model parameters. Compared with the other schemes, this one explores the model properties deeply and provides a promising solution to a wide range of basis function embedded quasi-ARX models. SNPOM is known as an efficient optimization method falling into this category, which makes good use of the features of the model parameters and gives impressive performance in time series prediction and nonlinear control. However, this technique is still considered a “nontransparent” approach since it is aimed at data fitting only, and the model parameters are difficult to interpret in terms of a physical explanation of the real world or the nonlinear dynamics of the system [7]. Therefore, it may constrain further development of the model. In contrast, nonlinear parameter estimation based on prior knowledge makes it possible to interpret system properties meaningfully, especially with respect to the quasi-ARX RBFN model, as discussed later in Section 3. Useful prior knowledge can evolve a quasi-ARX model from a “black-box” tool into a “semianalytical” one [27], which makes some parameters interpretable by our intuition, following the application-favorable principle of quasi-ARX modeling. Accordingly, the nonlinear parameters are determined in terms of interpretable prior knowledge, and the linear parameters are adjusted to fit the data. It may contribute to low computational cost and high generalization of the model, as with parallel computation. Nevertheless, the problem is how to generate useful prior knowledge for accurate nonlinear parameter estimation.
In the current study, a two-step approach is proposed to identify the quasi-ARX RBFN model for nonlinear systems. Firstly, a clustering method is applied to generate the data distribution information of the system, whereby the center parameters of the embedded RBFN are determined as cluster centers, and the width parameter of each RBF is set in terms of the distances to other nearby centers. Then, it is straightforward to utilize the linear SVR for linear parameter estimation. The main purpose of this work is to provide an interpretable identification approach for the quasi-ARX models, which can be regarded as complementary to the identification procedures in [6, 9, 13]. Compared with the heuristic prior knowledge used in quasi-ARX NFN model identification, the clustering-based method gives an alternative source of prior knowledge for nonlinear parameter estimation, and the quasi-ARX RBFN model is interpreted as a local linear model with interpolation. Moreover, when linear SVR is applied for linear parameter estimation, identification of the quasi-ARX RBFN model can be treated as an SVR with a novel kernel mapping and associated feature space; the kernel mapping is equivalent to the nonlinear parameter estimation procedure, which transforms the model from nonlinear-in-nature to linear-in-parameter. Unlike the SVR-based method [9], the kernel function proposed in this study takes an explicit mapping, which is effective in coping with potential overfitting for some complex and noisy learning tasks [28]. Finally, since the nonlinear parameters are estimated directly from the prior knowledge, the proposed method can to some extent be considered an algorithmic approach for the initialization of SNPOM.
The remainder of the paper is organized as follows. Section 2 introduces the quasi-ARX RBFN modeling scheme. Section 3 proposes the identification method for the quasi-ARX RBFN model. Section 4 investigates two numerical examples and a real case. Finally, some discussions and conclusions are given in Section 5.
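2. Quasi-ARX RBFN Modeling
Let us consider a single-input-single-output (SISO) nonlinear time-invariant system whose input-output dynamics is described as
$$y(t) = g(\varphi(t)) + e(t), \tag{1}$$
where $\varphi(t) = [y(t-1), \ldots, y(t-n_y), u(t-1), \ldots, u(t-n_u)]^T$; $u(t) \in R$, $y(t) \in R$, and $e(t) \in R$ are the system input, output, and a zero-mean stochastic noise at time $t$, respectively; $n_u$ and $n_y$ are the unknown maximum delays of the input and output, respectively. $\varphi(t) \in R^n$ with $n = n_y + n_u$ is the regression vector composed of the delayed input-output data. $g(\cdot)$ is an unknown (black-box) function describing the dynamics of the system under study, which is assumed to be continuously differentiable and to satisfy $g(0) = 0$.
Performing the Taylor expansion of $g(\varphi(t))$ at $\varphi(t) = 0$, one has
$$y(t) = e(t) + g(0) + g'(0)^T \varphi(t) + \frac{1}{2} \varphi^T(t) g''(0) \varphi(t) + \cdots. \tag{2}$$
Then (1) is reformulated with an ARX-like linear structure:
$$y(t) = \varphi^T(t)\, \theta(\varphi(t)) + e(t), \tag{3}$$
where
$$\theta(\varphi(t)) = g'(0) + \frac{1}{2} g''(0) \varphi(t) + \cdots = [a_{1,t} \cdots a_{n_y,t} \; b_{0,t} \cdots b_{n_u-1,t}]^T. \tag{4}$$
In (4), the coefficients $a_{i,t} = a_i(\varphi(t))$ and $b_{j,t} = b_j(\varphi(t))$ are nonlinear functions of $\varphi(t)$ for $i = 1, 2, \ldots, n_y$ and $j = 0, 1, \ldots, n_u - 1$; thus they can be represented by an RBFN as
$$\theta(\varphi(t)) = \Omega_0 + \sum_{j=1}^{M} \Omega_j N(p_j, \varphi(t)), \tag{5}$$
where $p_j$ includes the center parameter vector $\mu_j$ and the width parameter $\sigma_j$ of the $j$th RBF $N(p_j, \varphi(t))$, $M$ denotes the number of basis functions utilized, and $\Omega_j = [\omega_{1j}, \ldots, \omega_{nj}]^T$ collects the connection weights between the input variables and the associated basis function. According to (3) and (5), a compact representation of the quasi-ARX RBFN model is given as
$$y(t) = \sum_{j=0}^{M} \varphi^T(t)\, \Omega_j N(p_j, \varphi(t)) + e(t), \tag{6}$$
in which the set of RBFs with scaling parameter $\lambda$ (the default value of $\lambda$ is 1) is
$$N(p_j, \varphi(t)) = \begin{cases} \exp\left(-\dfrac{\|\varphi(t) - \mu_j\|^2}{\lambda \sigma_j^2}\right), & j \neq 0, \\ 1, & j = 0. \end{cases} \tag{7}$$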
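As a minimal sketch of how (5)–(7) fit together, the following Python snippet evaluates the noise-free output of a quasi-ARX RBFN model for one regression vector. The function names and all numerical values are illustrative assumptions, not taken from the paper; the rows of `omegas` are assumed to hold Ω0, …, ΩM.

```python
import numpy as np

def rbf_activations(phi, centers, widths, lam=1.0):
    """Return [N_0, N_1, ..., N_M] as in (7); N_0 is fixed to 1."""
    sq_dist = np.sum((centers - phi) ** 2, axis=1)
    return np.concatenate(([1.0], np.exp(-sq_dist / (lam * widths ** 2))))

def quasi_arx_output(phi, omegas, centers, widths, lam=1.0):
    """Noise-free output of (6): y = phi^T theta(phi)."""
    n_act = rbf_activations(phi, centers, widths, lam)   # shape (M + 1,)
    theta = n_act @ omegas                               # (5): state-dependent ARX coefficients
    return float(phi @ theta)

# Toy example with n = 2 regressors and M = 2 basis functions (made-up numbers).
phi = np.array([0.5, -0.2])
centers = np.array([[0.0, 0.0], [1.0, 1.0]])
widths = np.array([1.0, 1.0])
omegas = np.array([[0.3, 0.1],     # Omega_0
                   [0.2, -0.4],    # Omega_1
                   [0.0, 0.5]])    # Omega_2
y = quasi_arx_output(phi, omegas, centers, widths)
```

With no basis functions (M = 0) the same function reduces to an ordinary linear ARX predictor, which reflects the role of Ω0 in (5).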
3. Parameter Estimation of Quasi-ARX RBFN Model
From (6) and (7), it is known that $p_j$ (i.e., $\mu_j$ and $\sigma_j$) for $j = 1, \ldots, M$ are the nonlinear parameters of the model, whereas $\Omega_j$ ($j = 0, \ldots, M$) become linear once all the nonlinear parameters are determined and fixed. In the following, the clustering method and SVR are applied, respectively, to estimate these two types of parameters.
3.1. Nonlinear Parameters Estimation
The choice of the center parameters plays an important role in the performance of RBF-type models [29]. In this paper, these parameters are estimated by means of prior knowledge from the clustering method rather than by minimizing the mean square of the training error. It should be mentioned that using a clustering method to initialize the center parameters is not a new idea in RBF-type models, and sophisticated clustering algorithms have been proposed in [30, 31]. In the present work, the nonlinear parameters are estimated by clustering so that they have meaningful interpretations. From this point of view, (6) is investigated as a local linear model with M submodels $y_j = \varphi^T(t)\Omega_j$ ($j = 1, \ldots, M$), and the jth RBF $N(p_j, \varphi(t))$ is regarded as a time-varying interpolation function for the associated linear submodel to preserve the local property. Figure 2 gives a schematic diagram illustrating the quasi-ARX RBFN model from this local linear viewpoint.
Figure 2: A local linear interpretation of the quasi-ARX RBFN model. A one-dimensional nonlinear system is approximated by three linear submodels, whose operating areas are decided by the associated RBFs; meanwhile, the RBFs also provide interpolations or weights for all the linear submodels depending on the operating point. A linear submodel contributes most when the operating point is near the center of the associated RBF, while it contributes only slightly when the operating point is far from the corresponding RBF.
In this way, the local linear information of the data can be generated by means of a clustering algorithm, where the number of clusters (linear subspaces) is equivalent to the number of RBF neurons, and each cluster center is set as the center parameter of the associated RBF. In order to determine the operating area of each local linear submodel appropriately, the width of each RBF is set to cover the corresponding subspace well. Generally speaking, the width parameters of the RBF neurons can be set according to the distances among the centers. For instance, a proper width parameter σj of a certain RBF can be obtained as the mean distance from its center μj to its two nearest neighboring centers. From (7), one knows that excessively small width parameters may result in local linear operating areas that are insufficient to cover all the data, while an overly wide setting makes all the RBFs overlap and hence weakens the local property of each linear submodel.
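The width rule described above (mean distance from each center to its two nearest neighbors) can be sketched as follows. This is only an illustration under stated assumptions: the cluster centers are taken as already available from some clustering algorithm, and `rbf_widths` is a hypothetical helper name.

```python
import numpy as np

def rbf_widths(centers, n_nearest=2):
    """Width of each RBF: mean distance from its center to the n_nearest closest other centers.

    Requires more than n_nearest centers; otherwise the result contains inf.
    """
    diff = centers[:, None, :] - centers[None, :, :]
    dist = np.sqrt(np.sum(diff ** 2, axis=2))   # pairwise distances, shape (M, M)
    np.fill_diagonal(dist, np.inf)              # exclude the distance of a center to itself
    nearest = np.sort(dist, axis=1)[:, :n_nearest]
    return nearest.mean(axis=1)

# Hypothetical one-dimensional cluster centers.
centers = np.array([[0.0], [1.0], [3.0]])
widths = rbf_widths(centers)
```

For the three centers 0, 1, and 3, the resulting widths are 2.0, 1.5, and 2.5: isolated centers receive wider RBFs, so that their operating areas still overlap with those of their neighbors.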
Remark 1.
Figure 2 only gives a meaningful interpretation of the model parameters. In real applications, since the data distribution is complex and the exact local linear subspaces may not exist, the clustering partition approach is used to provide several rational operating areas, and the scaling parameter λ can be set to adjust the width parameters for good weighting to each associated area.
3.2. Linear Parameters Estimation
After estimating and fixing the nonlinear parameters, (6) can be rewritten in a linear-in-parameter manner as
$$y(t) = \Phi^T(t)\Theta + e(t), \tag{8}$$
where $\Phi(t)$ is an abbreviation of $\Phi(\varphi(t))$ with
$$\Phi(t) = [\varphi^T(t), N_1(t)\varphi^T(t), \ldots, N_M(t)\varphi^T(t)]^T, \tag{9}$$
$$\Theta = [\Omega_0^T, \Omega_1^T, \ldots, \Omega_M^T]^T, \tag{10}$$
in which, since $p_j$ in the $j$th RBF has already been estimated, the $j$th RBF $N(p_j, \varphi(t))$ is written in the shortened form $N_j(t)$ in (9). Therefore, the nonlinear system identification problem is reduced to a linear regression problem with respect to $\Phi(t)$, and all the linear parameters are denoted by $\Theta$.
Remark 2.
As a result of the nonlinear parameter estimation, Φ(t) plays an important role in transforming the quasi-ARX RBFN model from nonlinear-in-nature to linear-in-parameter with respect to Θ. Accordingly, it also maps the original input space of g(·) nonlinearly into a high-dimensional feature space; that is, φ(t)→Φ(t). This explicit mapping will be utilized for an inner-product kernel in the later part.
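The explicit mapping φ(t)→Φ(t) of (9) and the stacking of Θ in (10) can be illustrated with a short sketch. The names and numbers below are assumptions made for the example; the only point is that, once centers and widths are fixed, the model output is an ordinary inner product.

```python
import numpy as np

def feature_map(phi, centers, widths, lam=1.0):
    """Explicit mapping (9): Phi = [phi, N_1 * phi, ..., N_M * phi] stacked into one vector."""
    sq_dist = np.sum((centers - phi) ** 2, axis=1)
    n_act = np.concatenate(([1.0], np.exp(-sq_dist / (lam * widths ** 2))))
    return np.kron(n_act, phi)                  # length (M + 1) * n

# Made-up example with n = 2 and M = 2.
phi = np.array([0.5, -0.2])
centers = np.array([[0.0, 0.0], [1.0, 1.0]])
widths = np.array([1.0, 1.0])
omegas = np.array([[0.3, 0.1], [0.2, -0.4], [0.0, 0.5]])   # rows: Omega_0, Omega_1, Omega_2

Phi = feature_map(phi, centers, widths)
Theta = omegas.reshape(-1)                      # (10): Omega_0, ..., Omega_M stacked
y_linear = float(Phi @ Theta)                   # (8) without the noise term
```

The dimension of the feature space is (M + 1)n, so it grows with the number of clusters rather than with the number of training samples.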
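In the following, the linear parameters are estimated by a linear SVR, following the structural risk minimization principle:
$$\min J \triangleq \frac{1}{2}\Theta^T\Theta + C\sum_{t=1}^{N}(\xi_t + \xi_t^*) \tag{11}$$
subject to
$$y(t) - \Phi^T(t)\Theta \le \epsilon + \xi_t, \qquad -y(t) + \Phi^T(t)\Theta \le \epsilon + \xi_t^*, \tag{12}$$
where $N$ is the number of observations, $\xi_t \ge 0$ and $\xi_t^* \ge 0$ are slack variables, and $C$ is a nonnegative weight determining how heavily prediction errors exceeding the threshold value $\epsilon$ are penalized. The solution can be found as a saddle point of the associated Lagrange function:
$$L(\Theta, \xi_t, \xi_t^*, \alpha_t, \alpha_t^*, \beta_t, \beta_t^*) \triangleq \frac{1}{2}\Theta^T\Theta + C\sum_{t=1}^{N}(\xi_t + \xi_t^*) + \sum_{t=1}^{N}\alpha_t\left(y(t) - \Phi^T(t)\Theta - \epsilon - \xi_t\right) + \sum_{t=1}^{N}\alpha_t^*\left(-y(t) + \Phi^T(t)\Theta - \epsilon - \xi_t^*\right) - \sum_{t=1}^{N}(\beta_t\xi_t + \beta_t^*\xi_t^*), \tag{13}$$
where $\alpha_t$, $\alpha_t^*$, $\beta_t$, and $\beta_t^*$ are nonnegative Lagrange multipliers to be determined later. The saddle point is acquired by minimizing $L$ with respect to $\Theta$, $\xi_t^*$, and $\xi_t$:
$$\frac{\partial L}{\partial \Theta} = 0 \Longrightarrow \Theta = \sum_{t=1}^{N}(\alpha_t - \alpha_t^*)\Phi(t), \tag{14a}$$
$$\frac{\partial L}{\partial \xi_t^*} = 0 \Longrightarrow \beta_t^* = C - \alpha_t^*, \tag{14b}$$
$$\frac{\partial L}{\partial \xi_t} = 0 \Longrightarrow \beta_t = C - \alpha_t. \tag{14c}$$
Thus, one can convert the primal problem (11) into an equivalent dual problem as
$$\max W(\alpha_t, \alpha_t^*) \triangleq -\frac{1}{2}\sum_{t,k=1}^{N}(\alpha_t - \alpha_t^*)(\alpha_k - \alpha_k^*)\Phi^T(t)\Phi(k) + \sum_{t=1}^{N}(\alpha_t - \alpha_t^*)y(t) - \epsilon\sum_{t=1}^{N}(\alpha_t + \alpha_t^*) \tag{15}$$
subject to
$$\sum_{t=1}^{N}(\alpha_t - \alpha_t^*) = 0, \qquad \alpha_t, \alpha_t^* \in [0, C]. \tag{16}$$
The training results $\hat{\alpha}_t$ and $\hat{\alpha}_t^*$ are obtained from (15), and the linear parameter vector $\Theta$ is then obtained from the trained values:
$$\Theta = \sum_{t=1}^{N}(\hat{\alpha}_t - \hat{\alpha}_t^*)\Phi(t). \tag{17}$$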
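The step from the Lagrange function (13) to the dual objective (15) is a standard manipulation that can be spelled out as follows. Collecting the terms of (13) by variable and then substituting (14a)–(14c) gives the following worked derivation, which is a sketch added for readability rather than part of the original text:

```latex
\begin{aligned}
L &= \tfrac{1}{2}\Theta^T\Theta
     - \Theta^T\sum_{t=1}^{N}(\alpha_t-\alpha_t^*)\Phi(t)
     + \sum_{t=1}^{N}(\alpha_t-\alpha_t^*)\,y(t)
     - \epsilon\sum_{t=1}^{N}(\alpha_t+\alpha_t^*) \\
  &\quad + \sum_{t=1}^{N}(C-\alpha_t-\beta_t)\,\xi_t
     + \sum_{t=1}^{N}(C-\alpha_t^*-\beta_t^*)\,\xi_t^* \\
  &= -\tfrac{1}{2}\sum_{t,k=1}^{N}(\alpha_t-\alpha_t^*)(\alpha_k-\alpha_k^*)\,\Phi^T(t)\Phi(k)
     + \sum_{t=1}^{N}(\alpha_t-\alpha_t^*)\,y(t)
     - \epsilon\sum_{t=1}^{N}(\alpha_t+\alpha_t^*).
\end{aligned}
```

In the second equality, the slack terms vanish by (14b) and (14c), and (14a) turns the first two terms into $\tfrac{1}{2}\Theta^T\Theta - \Theta^T\Theta = -\tfrac{1}{2}\Theta^T\Theta$. The box constraint in (16) follows from $\beta_t = C - \alpha_t \ge 0$ and $\beta_t^* = C - \alpha_t^* \ge 0$ together with the nonnegativity of $\alpha_t$ and $\alpha_t^*$.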
In the above way, the contributions of the SVR-based linear parameter estimation method can be summarized as follows.
(1) Robust performance for parameter estimation is introduced because of the structural risk minimization of SVR.
(2) There is no need to calculate the linear parameter $\Theta$ directly. Instead, the problem becomes a dual form of the quadratic optimization, which is represented by $\alpha_t$ and $\alpha_t^*$ and depends on the size of the training data. This is very useful for alleviating the computational cost, especially when the model suffers from the curse of dimensionality.
(3) Identification of the quasi-ARX model is specified as an SVR with the explicit kernel mapping $\Phi(t)$, which has been mentioned in Remark 2. To this end, the quasi-ARX RBFN model is reformulated as
$$y(t) = \Phi^T(t)\sum_{t'=1}^{N}(\hat{\alpha}_{t'} - \hat{\alpha}_{t'}^*)\Phi(t') = \sum_{t'=1}^{N}(\hat{\alpha}_{t'} - \hat{\alpha}_{t'}^*)K(t, t'), \tag{18}$$
where $t'$ is the time index of the training data, and a quasi-linear kernel, which is explained in the following remark, is defined as an inner product of the explicit nonlinear mapping $\Phi(t)$:
$$K(t, t') = \Phi^T(t)\Phi(t') = \varphi^T(t)\varphi(t')\sum_{i=0}^{M}N_i(t)N_i(t'). \tag{19}$$
Remark 3.
The name “quasi-linear kernel” has a twofold meaning. Firstly, it is derived from the quasi-ARX modeling scheme. Secondly, from (19) it is known that when M is as small as zero, the kernel reduces to a linear one, and the nonlinearity of the kernel mapping increases with the value of M. Compared with conventional kernels with implicit kernel mapping, the nonlinear mapping of the quasi-linear kernel is tunable by M, which also reflects the nonlinearity of the quasi-ARX RBFN model in the sense of the number of local linear subspaces utilized. A proper value of M is essentially helpful in coping with potential overfitting, as will be shown in the following simulations.
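The closed form of the quasi-linear kernel in (19) can be sketched in a few lines. The snippet below computes a Gram matrix for a batch of regression vectors; the function names and the random data are illustrative assumptions. Passing an empty set of centers corresponds to M = 0.

```python
import numpy as np

def rbf_matrix(X, centers, widths, lam=1.0):
    """Rows hold [N_0, N_1, ..., N_M] for each regression vector in X (N_0 = 1)."""
    sq_dist = np.sum((X[:, None, :] - centers[None, :, :]) ** 2, axis=2)
    return np.hstack([np.ones((len(X), 1)), np.exp(-sq_dist / (lam * widths ** 2))])

def quasi_linear_kernel(X1, X2, centers, widths, lam=1.0):
    """Gram matrix of (19): K = (phi^T phi') * sum_i N_i N_i'."""
    linear_part = X1 @ X2.T
    rbf_part = rbf_matrix(X1, centers, widths, lam) @ rbf_matrix(X2, centers, widths, lam).T
    return linear_part * rbf_part

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))                 # five made-up regression vectors, n = 3
centers = rng.normal(size=(4, 3))           # M = 4 hypothetical cluster centers
widths = np.full(4, 1.5)

K = quasi_linear_kernel(X, X, centers, widths)
K_linear = quasi_linear_kernel(X, X, np.empty((0, 3)), np.empty(0))   # M = 0 case
```

Here `K_linear` coincides with the ordinary linear kernel `X @ X.T`, in line with Remark 3. A Gram matrix of this kind could be supplied to an SVR solver that accepts precomputed kernels (LibSVM offers such a kernel type), although the paper does not state how its implementation passes the kernel to the solver.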
4. Experimental Studies
In this section, the identification performance of the proposed approach for the quasi-ARX RBFN model is evaluated by three examples. The first one shows the performance of the quasi-ARX RBFN model for time series prediction. Second, a rational system taken from Narendra and Parthasarathy [17] is simulated with a small amount of training data, which is used to demonstrate the generalization of the proposed quasi-linear kernel. Finally, an example modeling a hydraulic robot actuator is carried out for a general comparison.
In the nonlinear parameter estimation procedure, the affinity propagation (AP) clustering algorithm [32] is utilized to partition the input space and automatically determine the number of clusters in terms of the data distribution, where the Euclidean distance is used as the similarity between exemplars. Then, the centers of all clusters are selected as the RBF center parameters in the quasi-ARX model, and the width parameter of each RBF is set to the mean distance from the associated center to its two nearest neighboring centers. For the linear parameter estimation, the LibSVM toolbox [33] is applied, where ν-SVR is used with the default ν setting in Matlab 7.6. Finally, the model performance is evaluated by the root mean square error (RMSE) as
$$\mathrm{RMSE} = \sqrt{\frac{\sum_t \left(y(t) - \hat{y}(t)\right)^2}{K}}, \tag{20}$$
where $\hat{y}(t)$ is the predicted value of the system output $y(t)$ and $K$ is the number of regression vectors.
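For reference, the error measure (20) corresponds to the following small helper; `rmse` is a name chosen here for illustration.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean square error as in (20): sqrt(sum of squared errors / number of samples)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

example = rmse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
```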
4.1. Modeling the Mackey-Glass Time Series
Time series prediction on the chaotic Mackey-Glass differential equation is one of the most famous benchmarks for comparing the learning and generalization abilities of different models. This time series is generated from the following equation:
$$\frac{dx(t)}{dt} = \frac{a\,x(t-\tau)}{1 + x^{10}(t-\tau)} - b\,x(t), \tag{21}$$
where $a = 0.2$, $b = 0.1$, and $\tau = 17$, which are the most often used values in previous research and for which the equation shows chaotic behavior. To make the comparisons with earlier works fair, we predict $x(t+6)$ using the input variables $x(t)$, $x(t-6)$, $x(t-12)$, and $x(t-18)$. Two thousand data points are generated with the initial condition $x(t) \equiv 1.2$ for $t \in [-17, 0]$, based on the fourth-order Runge–Kutta method with time step $\Delta t = 0.1$. Then, one thousand input-output data pairs are selected from $t = 201$ to $t = 1200$, as shown in Figure 3. The first 500 data pairs are used as training data, while the remaining 500 are used to predict $x(t+6)$ by
$$x(t+6) = \varphi^T(t)\Omega_0 + \sum_{j=1}^{\hat{M}} \varphi^T(t)\,\Omega_j N(\hat{\mu}_j, \hat{\sigma}_j, \varphi(t)) \tag{22}$$
with
$$N(\hat{\mu}_j, \hat{\sigma}_j, \varphi(t)) = \exp\left(-\frac{\|\varphi(t) - \hat{\mu}_j\|^2}{\hat{\sigma}_j^2}\right), \tag{23}$$
where $\varphi(t) = [x(t-18), x(t-12), x(t-6), x(t)]^T$.
Figure 3: Time series generated from the Mackey-Glass equation.
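The data generation described above can be sketched as follows. This is a rough illustration, not the authors' script: the delayed term x(t − τ) is assumed to be held constant within each Runge–Kutta step, which the paper does not specify, and the function names are hypothetical.

```python
import numpy as np

def mackey_glass(t_end, a=0.2, b=0.1, tau=17.0, dt=0.1, x0=1.2):
    """Integrate (21) with RK4 and return x on the grid 0, dt, 2*dt, ..., t_end."""
    lag = int(round(tau / dt))                  # delay expressed in steps
    n_steps = int(round(t_end / dt))
    x = np.empty(lag + n_steps + 1)
    x[: lag + 1] = x0                           # constant history on [-tau, 0]
    for k in range(lag, lag + n_steps):
        x_tau = x[k - lag]                      # assumption: delayed term frozen over the step
        f = lambda v: a * x_tau / (1.0 + x_tau ** 10) - b * v
        k1 = f(x[k])
        k2 = f(x[k] + 0.5 * dt * k1)
        k3 = f(x[k] + 0.5 * dt * k2)
        k4 = f(x[k] + dt * k3)
        x[k + 1] = x[k] + dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
    return x[lag:]                              # drop the history, keep t >= 0

def make_pairs(x, dt, t_first, t_last, horizon=6, lags=(18, 12, 6, 0)):
    """Regressors [x(t-18), x(t-12), x(t-6), x(t)] and targets x(t+6) for integer t."""
    per_unit = int(round(1.0 / dt))             # samples per unit of time
    times = range(t_first, t_last + 1)
    X = np.array([[x[(t - l) * per_unit] for l in lags] for t in times])
    y = np.array([x[(t + horizon) * per_unit] for t in times])
    return X, y

series = mackey_glass(t_end=1206)               # long enough to reach x(1200 + 6)
X, y = make_pairs(series, dt=0.1, t_first=201, t_last=1200)
X_train, y_train = X[:500], y[:500]
X_test, y_test = X[500:], y[500:]
```

The resulting `X` has one row per time instant from t = 201 to t = 1200, with the columns ordered as in φ(t) of (22).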
For the prediction of the Mackey-Glass time series with a quasi-ARX RBFN model, 20 clusters are obtained from the AP clustering algorithm, and thus 20 RBF neurons are constructed correspondingly. Thereafter, SVR is used for linear parameter estimation, in which the superparameter C is set to 100. The predicted result is compared with the original time series of the test data in Figure 4, which gives an RMSE of 0.0091.
Figure 4: Prediction result with the quasi-ARX RBFN model.
In Figure 4, the predicted result fits the original data very well; however, it is still not as good as the results from some well-known models/methods listed in Table 1. Since no disturbance is contained in this example, the prediction performance can be easily improved by minimizing the training prediction error. In the comparison list, SNPOM for the RBF-AR model, the hybrid learning method for the local linear wavelet neural network (LLWNN), and the genetic algorithm (GA) for RBFN are all optimization-based identification methods, and it is relatively easy for them to achieve small prediction RMSEs by iterative training. However, these methods are much more time-consuming than the proposed method, which takes only 6 seconds for the quasi-ARX RBFN model. In addition, although the k-means clustering method for RBFN is implemented in a deterministic way and shows a good result, the number of RBF neurons used is as large as 238. In fact, a small prediction RMSE obtained from these methods does not necessarily mean good identification of the models, since overtraining may sometimes happen.
Table 1: Results of different models for Mackey-Glass time series prediction.

| Model | Method | Number of neurons | RMSE |
|---|---|---|---|
| Autoregressive model | Least square | 5 | 0.19 |
| FNT [8] | PIPE | Not provided | 7.1 × 10^{−3} |
| RBF-AR [9] | SNPOM | 25 | 5.8 × 10^{−4} |
| LLWNN [10] | PSO + gradient descent algorithm | 10 | 3.6 × 10^{−3} |
| RBF [11] | k-means clustering | 238 | 1.3 × 10^{−3} |
| RBF [12] | GA | 98 | 1.5 × 10^{−3} |
| Quasi-ARX RBFN model | Proposed | 20 | 9.1 × 10^{−3} |
| Quasi-ARX RBFN model | Proposed + SNPOM | 20 | 2.1 × 10^{−3} |
In the present example, we confirm the effectiveness of the optimization-based methods given above and propose a hybrid approach for identification of the quasi-ARX RBFN model, in which the prediction result from the proposed method is further improved by SNPOM (the function “lsqnonlin” in the Matlab Optimization Toolbox is used [9]). It is seen that the prediction RMSE can be improved to $2.1 \times 10^{-3}$ with only 15 iterations of SNPOM, and the result becomes comparable with the others. However, such optimization is not always effective, especially in model simulations on testing data, such as in the model $x(t) = f(\hat{x}(t-1), \hat{x}(t-2))$, where $\hat{x}(t-1)$ is the predicted value of $x(t-1)$. In the following, a rational system is evaluated by simulated quasi-ARX RBFN models to show the advantages of the proposed method.
4.2. Modeling a Rational System
Accurate identification of nonlinear systems usually requires quite long training sequences which contain a sufficient amount of data from the whole operating region. However, as the amount of data is often limited in practice, it is important to study the identification performance for shorter training sequences with a limited amount of data. The system under study is a nonlinear rational model described as
$$y(t) = f\left(y(t-1), y(t-2), y(t-3), u(t-1), u(t-2)\right) + e(t), \tag{24}$$
where
$$f(x_1, x_2, x_3, x_4, x_5) = \frac{x_1 x_2 x_3 x_5 (x_3 - 1) + x_4}{1 + x_2^2 + x_3^2} \tag{25}$$
and $e(t) \in N(0, 0.01)$ is white noise.
The difficulty of this example lies in the fact that only 100 samples are provided for training, which are generated from a random input sequence of 100 values distributed uniformly in the interval $[-1, 1]$, while 800 testing data samples are generated from the system with the input
$$u(t) = \begin{cases} \sin\left(\dfrac{2\pi t}{250}\right), & \text{if } t \le 500, \\ 0.8\sin\left(\dfrac{2\pi t}{250}\right) + 0.2\sin\left(\dfrac{2\pi t}{25}\right), & \text{otherwise}. \end{cases} \tag{26}$$
The excitation signal $u(t)$ used for training and the system output $y(t)$ are illustrated in Figure 5.
Figure 5: Training data for rational system identification.
In this case, 9 clusters are automatically obtained from the AP clustering algorithm; then the nonlinear parameters $\mu_j$ and $\sigma_j$ ($j = 1, \ldots, 9$) of the quasi-ARX RBFN model are estimated as described in Section 3. SVR is utilized thereafter for linear parameter estimation, where the superparameters are set to different values for testing. Following the training, the simulated model is
$$y(t) = \varphi^T(t)\Omega_0 + \sum_{j=1}^{\hat{M}} \varphi^T(t)\,\Omega_j N(\hat{\mu}_j, \hat{\sigma}_j, \varphi(t)) \tag{27}$$
with
$$N(\hat{\mu}_j, \hat{\sigma}_j, \varphi(t)) = \exp\left(-\frac{\|\varphi(t) - \hat{\mu}_j\|^2}{\hat{\sigma}_j^2}\right), \tag{28}$$
where $\varphi(t) = [\hat{y}(t-1), \hat{y}(t-2), \hat{y}(t-3), u(t-1), u(t-2)]^T$ and $\hat{y}(t-n)$ denotes the simulated output $n$ steps earlier. Figure 6 shows the simulation of the quasi-ARX RBFN model on the testing data, which gives an RMSE of 0.0379 under the superparameter $C = 10$.
Figure 6: Simulated result with the quasi-ARX RBFN model for the rational system.
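The difference between one-step prediction and the simulation used in (27) is that the regression vector is built from the model's own past outputs. A minimal sketch of such a free-run loop is given below; `simulate_model` and the stand-in linear predictor are hypothetical and only illustrate the feedback structure, into which an identified quasi-ARX RBFN predictor could be plugged.

```python
import numpy as np

def simulate_model(predict, u, n_y=3, n_u=2, y_init=None):
    """Free-run simulation: past outputs in phi(t) are the model's own simulated outputs."""
    start = max(n_y, n_u)
    y_hat = np.zeros(len(u))
    if y_init is not None:
        y_hat[:start] = y_init[:start]          # optional measured initial values
    for t in range(start, len(u)):
        past_y = [y_hat[t - i] for i in range(1, n_y + 1)]
        past_u = [u[t - i] for i in range(1, n_u + 1)]
        y_hat[t] = predict(np.array(past_y + past_u))
    return y_hat

# Hypothetical stand-in predictor: a fixed linear ARX model driven by a unit step.
theta = np.array([0.5, -0.1, 0.0, 1.0, 0.2])
u = np.ones(50)
y_hat = simulate_model(lambda phi: float(phi @ theta), u)
```

Because errors are fed back through φ(t), a model that predicts well one step ahead can still drift in simulation, which is why the simulation RMSE is a stricter test of generalization.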
Since identification of the quasi-ARX RBFN model can be regarded as an SVR with quasi-linear kernel, a general comparison is given to show the advantages of the quasi-ARX RBFN model among SVR-based identification methods. Not only the short training sequence but also a long sequence with 1000 pairs of samples, which is generated and used in the same manner as the short one, is applied for comparison. Table 2 presents the comparison results of the proposed method (i.e., SVR with quasi-linear kernel), SVR with linear kernel, SVR with Gaussian kernel, and the quasi-ARX model identified directly by an SVR (QARX SVR), where various choices of the SVR superparameters C and γ (for the Gaussian kernel) are provided. From the simulation results under the short training sequence (100 samples), it is seen that when the design parameters are optimized, SVR with quasi-linear kernel performs much better than the ones with Gaussian kernel and linear kernel, and the quasi-linear kernel is also less sensitive to the SVR superparameter setting. Moreover, although the QARX SVR method utilizes the quasi-ARX model structure, it only provides a simulation RMSE similar to that of SVR with Gaussian kernel. However, these simulation results cannot be used to refute the effectiveness of SVR with Gaussian kernel and the QARX SVR method for nonlinear system identification. In the simulations with the long training sequence (1000 samples), it is found that the QARX SVR method outperforms all the others, and SVR with Gaussian kernel also performs much better than the ones with quasi-linear and linear kernels.
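Table 2: Simulated results of the SVR-based methods for the rational system.

| Method | C | γ (Gaussian) | RMSE, short training sequence | RMSE, long training sequence |
|---|---|---|---|---|
| Proposed | 1 | - | 0.0546 | 0.0287 |
| | 10 | - | 0.0379 | 0.0216 |
| | 100 | - | 0.0423 | 0.0216 |
| SVR + linear kernel | 1 | - | 0.0760 | 0.0710 |
| | 10 | - | 0.0764 | 0.0708 |
| | 100 | - | 0.0763 | 0.0708 |
| SVR + Gaussian kernel | 1 | 0.01 | 0.1465 | 0.0560 |
| | | 0.05 | 0.0790 | 0.0426 |
| | | 0.1 | 0.0808 | 0.0421 |
| | | 0.5 | 0.0895 | 0.0279 |
| | 10 | 0.01 | 0.0782 | 0.0376 |
| | | 0.05 | 0.0722 | 0.0409 |
| | | 0.1 | 0.0866 | 0.0365 |
| | | 0.5 | 0.0699 | 0.0138 |
| | 100 | 0.01 | 0.0722 | 0.0352 |
| | | 0.05 | 0.0859 | 0.0376 |
| | | 0.1 | 0.0931 | 0.0313 |
| | | 0.5 | 0.1229 | 0.0340 |
| QARX SVR [13] | 1 | 0.01 | 0.0698 | 0.0362 |
| | | 0.05 | 0.0791 | 0.0384 |
| | | 0.1 | 0.0857 | 0.0345 |
| | | 0.5 | 0.0749 | 0.0116 |
| | 10 | 0.01 | 0.0783 | 0.0412 |
| | | 0.05 | 0.0918 | 0.0328 |
| | | 0.1 | 0.0922 | 0.0242 |
| | | 0.5 | 0.1483 | 0.0338 |
| | 100 | 0.01 | 0.0872 | 0.0400 |
| | | 0.05 | 0.1071 | 0.0237 |
| | | 0.1 | 0.8186 | 0.0166 |
| | | 0.5 | 0.1516 | 0.0487 |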
On the other hand, to examine the performance variation caused by different training sequences, histograms of the simulated errors of the SVR-based methods are given in Figure 7 for both the short and the long training sequences. They indicate that the SVR with linear kernel is the most robust to the amount of training data, and that the quasi-linear kernel is also robust compared with the Gaussian kernel and the QARX SVR method, for which significant deterioration is found when only a limited number of training samples are used. This result implies that the Gaussian kernel and QARX SVR may overfit, since they carry out an implicit nonlinear mapping, which has strong nonlinear learning ability but gives no indication of how strong the nonlinearity needs to be. In contrast, the reason behind the impressive and robust performance of the quasi-linear kernel is that prior knowledge is utilized in the kernel learning (nonlinear parameter estimation), and a number of parameters are determined in terms of the data distribution, where the complexity (nonlinearity) of the model is tunable according to the number of local linear subspaces clustered. In other words, the quasi-ARX RBFN model performs in a local linear way; hence it can be trained in a multilinear way, which works better than some unknown nonlinear approaches when training samples are insufficient.
Figure 7: Histograms of the simulated errors. The horizontal coordinate in each subfigure denotes the simulated error of the model, whose elements are binned into 10 equally spaced containers. Four models are trained using both the short (a) and the long (b) training sequences, so that the variation in simulated performance can be investigated by comparison.
Moreover, the RBF-AR model with the SNPOM estimation method is applied to this identification problem, where the number of RBF neurons is determined by trial and error and the initial parameter values are given randomly. Considering the randomness of the algorithm, ten runs are carried out, excluding those in which the simulation fails, and the maximum number of iterations in SNPOM is set to 50. Consequently, four RBFs are selected for the RBF-AR model, which gives a mean RMSE of 0.0696 using the short training sequence, compared with 0.0336 when the long training sequence is utilized. Although the parameter setting for this method may not be optimal, we can draw the same conclusion as for the QARX SVR method: the model is overfitted when trained with the short sequence.
4.3. Modeling a Real System
This is an example modeling a hydraulic robot actuator, where the position of the robot arm is controlled by a hydraulic actuator. The oil pressure in the actuator is controlled by the size of the valve opening through which the oil flows into the actuator. What we want to model is the dynamic relationship between the position of the valve u(t) and the oil pressure y(t).
A sample of 1024 pairs of $\{y(t), u(t)\}$ has been observed, as shown in Figure 8. The data is divided into two equal parts: the first 512 samples are used as training data, and the rest are used to test the simulated model. For the purpose of comparison, the regression vector is set as $\varphi(t) = [y(t-1), y(t-2), y(t-3), u(t-1), u(t-2)]^T$. We simulate the quasi-ARX RBFN model on the testing data by
$$y(t) = \varphi^T(t)\Omega_0 + \sum_{j=1}^{\hat{M}} \varphi^T(t)\,\Omega_j N(\hat{\mu}_j, \hat{\sigma}_j, \varphi(t)) \tag{29}$$
with
$$N(\hat{\mu}_j, \hat{\sigma}_j, \varphi(t)) = \exp\left(-\frac{\|\varphi(t) - \hat{\mu}_j\|^2}{\lambda\hat{\sigma}_j^2}\right), \tag{30}$$
where $\varphi(t) = [\hat{y}(t-1), \hat{y}(t-2), \hat{y}(t-3), u(t-1), u(t-2)]^T$ and $\lambda$ is set to 50 heuristically because of the complex dynamics and data distribution in this case, which ensures that the RBFs are wide enough to cover the whole space well. A similar setting of $\lambda$ can also be found in the literature for the same purpose [34, 35].
Figure 8: Measurements of u(t) and y(t).
To determine the nonlinear parameters of the quasi-ARX RBFN model, the AP clustering algorithm is implemented, and 11 clusters are generated automatically. Then, SVR is utilized for the linear parameter estimation. Finally, the identified model is simulated on the testing data, as shown in Figure 9, which gives an RMSE of 0.462. This simulation result is compared with those of the linear ARX model, NN, WN, and SVR-based methods in Table 3. From Table 3, it is seen that the proposed method outperforms the others for the real system. In addition, the RBF-AR model with the SNPOM estimation method fails to be simulated in this case, where the number of RBF neurons is tested from 3 to 6 and the initial values are given randomly.
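Table 3: Comparison results for the real system.

| Model | Superparameter C | Superparameter γ (Gaussian) | RMSE |
|---|---|---|---|
| ARX model | | | 1.016 |
| NN [1] | | | 0.467 |
| WN [6] | | | 0.529 |
| SVR + quasilinear kernel | 1 | | 0.462 |
| SVR + quasilinear kernel | 5 | | 0.487 |
| SVR + quasilinear kernel | 10 | | 0.491 |
| SVR + Gaussian kernel | 1 | 0.05 | 1.060 |
| SVR + Gaussian kernel | 1 | 0.1 | 0.828 |
| SVR + Gaussian kernel | 1 | 0.2 | 0.643 |
| SVR + Gaussian kernel | 1 | 0.5 | 1.122 |
| SVR + Gaussian kernel | 5 | 0.05 | 0.850 |
| SVR + Gaussian kernel | 5 | 0.1 | 0.740 |
| SVR + Gaussian kernel | 5 | 0.2 | 0.562 |
| SVR + Gaussian kernel | 5 | 0.5 | 0.633 |
| SVR + Gaussian kernel | 10 | 0.05 | 0.775 |
| SVR + Gaussian kernel | 10 | 0.1 | 0.665 |
| SVR + Gaussian kernel | 10 | 0.2 | 0.608 |
| SVR + Gaussian kernel | 10 | 0.5 | 1.024 |
| QARX SVR [13] | 1 | 0.05 | 0.737 |
| QARX SVR [13] | 1 | 0.1 | 0.592 |
| QARX SVR [13] | 1 | 0.2 | 0.801 |
| QARX SVR [13] | 1 | 0.5 | 0.711 |
| QARX SVR [13] | 5 | 0.05 | 0.609 |
| QARX SVR [13] | 5 | 0.1 | 0.600 |
| QARX SVR [13] | 5 | 0.2 | 0.715 |
| QARX SVR [13] | 5 | 0.5 | 0.890 |
| QARX SVR [13] | 10 | 0.05 | 0.593 |
| QARX SVR [13] | 10 | 0.1 | 0.632 |
| QARX SVR [13] | 10 | 0.2 | 1.231 |
| QARX SVR [13] | 10 | 0.5 | 1.285 |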
Figure 9: Simulated result with the quasi-ARX RBFN model for the real system.
5. Discussions and Conclusions
The proposed method plays a twofold role in quasi-ARX model identification. First, the clustering method is used to uncover the local linear information of the dataset. Although similar methods have appeared in the parameter estimation of RBFNs, here the nonlinear parameters of the quasi-ARX model are given a meaningful interpretation as the interpolation parameters of a multi-local linear model. In fact, explicit local linearity does not always exist in real problems, whereas clustering can at least provide a rational partition of the multidimensional space. In the future, a more accurate and general space partition algorithm is to be investigated for the identification of quasi-ARX models. Second, SVR is used to estimate the model's linear parameters; meanwhile, a quasi-linear kernel is derived and applied as a composite kernel. The parameter M in the kernel function (19) corresponds to the number of partitioned subspaces, and it is therefore preferably kept small to cope with potential overfitting.
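The link between the explicit mapping and the composite kernel can be illustrated with a short sketch. It assumes the form that follows from the linear-in-parameter structure of Eq. (29): stacking φ(t) with its RBF-weighted copies gives a feature vector whose inner product factors into a linear kernel multiplied by one plus a sum of RBF products. Kernel (19) itself is not reproduced in this section, so the expression in `quasi_linear_kernel`, like all function names here, should be read as an assumption rather than as the paper's exact definition.

```python
import math


def rbf(phi, mu, sigma, lam):
    """Gaussian basis of Eq. (30): exp(-||phi - mu||^2 / (lam * sigma^2))."""
    dist2 = sum((p - m) ** 2 for p, m in zip(phi, mu))
    return math.exp(-dist2 / (lam * sigma ** 2))


def feature_map(phi, mus, sigmas, lam=50.0):
    """Explicit mapping [phi, N_1*phi, ..., N_M*phi] implied by Eq. (29)."""
    mapped = list(phi)
    for mu, sigma in zip(mus, sigmas):
        weight = rbf(phi, mu, sigma, lam)
        mapped.extend(weight * p for p in phi)
    return mapped


def quasi_linear_kernel(phi_a, phi_b, mus, sigmas, lam=50.0):
    """Composite kernel: (1 + sum_j N_j(a) * N_j(b)) * <a, b>."""
    gate = 1.0 + sum(
        rbf(phi_a, mu, sigma, lam) * rbf(phi_b, mu, sigma, lam)
        for mu, sigma in zip(mus, sigmas)
    )
    linear = sum(x * y for x, y in zip(phi_a, phi_b))
    return gate * linear
```

The inner product of two `feature_map` outputs equals `quasi_linear_kernel` on the same inputs, so the kernel never needs the (M + 1)-times longer feature vector. Each additional cluster adds one more block to the implicit feature vector, which is consistent with the remark that a large M increases the risk of overfitting.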
In this paper, a two-step learning approach has been proposed for the identification of the quasi-ARX model. Unlike conventional black-box identification approaches, prior knowledge is introduced, which contributes to the interpretability of quasi-ARX models. The linear parameters of the model are estimated by minimizing the training data error. In the simulations, the quasi-ARX model is expressed in the form of SVR with a quasi-linear kernel, which shows approximation ability comparable to that of optimization-based methods for quasi-ARX models but outperforms them when the training sequence is limited. Finally, the proposed method achieved the best performance among the compared methods on a real system identification problem.
Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this paper.
Acknowledgments
This work was supported by the National Natural Science Foundation of China under Grants 81320108018 and 31570943 and the Six Talent Peaks Project for the High Level Personnel from the Jiangsu Province of China under Grant 2015DZXX003.
References
[1] Sjöberg J., Zhang Q., Ljung L., Benveniste A., Delyon B., Glorennec P.-Y., Hjalmarsson H., Juditsky A. Nonlinear black-box modeling in system identification: a unified overview.
[2] Machon-Gonzalez I., Lopez-Garcia H. Feedforward nonlinear control using neural gas network.
[3] Noël J. P., Kerschen G. Nonlinear system identification in structural dynamics: 10 more years of progress.
[4] Nagamani G., Ramasamy S., Balasubramaniam P. Robust dissipativity and passivity analysis for discrete-time stochastic neural networks with time-varying delay.
[5] Sutrisno I., Jami'in M. A., Hu J., Marhaban M. H. A self-organizing quasi-linear ARX RBFN model for nonlinear dynamical systems identification.
[6] Hu J., Hirasawa K., Kumamaru K. A hybrid quasi-ARMAX modeling and identification scheme for nonlinear systems.
[7] Ljung L.
[8] Chen Y., Yang B., Dong J., Abraham A. Time-series forecasting using flexible neural tree model.
[9] Peng H., Ozaki T., Haggan-Ozaki V., Toyoda Y. A parameter optimization method for radial basis function type models.
[10] Chen Y., Yang B., Dong J. Time-series prediction using a local linear wavelet neural network.
[11] Harpham C., Dawson C. W. The effect of different basis functions on a radial basis function network for time series prediction: a comparative study.
[12] Du H., Zhang N. Time series prediction using evolving radial basis function networks with new encoding scheme.
[13] Toivonen H. T., Tötterman S., Åkesson B. Identification of state-dependent parameter models with support vector regression.
[14] Hu J., Kumamaru K., Hirasawa K. A quasi-ARMAX approach to modelling of nonlinear systems.
[15] Hu J., Hirasawa K. A method for applying neural networks to control of nonlinear systems.
[16] Wang L., Cheng Y., Hu J. Stabilizing switching control for nonlinear system based on quasi-ARX RBFN model.
[17] Narendra K. S., Parthasarathy K. Identification and control of dynamical systems using neural networks.
[18] Young P. C., McKenna P., Bruun J. Identification of nonlinear stochastic systems by state dependent parameter estimation.
[19] Gan M., Peng H., Peng X., Chen X., Inoussa G. A locally linear RBF network-based state-dependent AR model for nonlinear time series modeling.
[20] Janot A., Young P. C., Gautier M. Identification and control of electromechanical systems using state-dependent parameter estimation.
[21] Patra A., Das S., Mishra S. N., Senapati M. R. An adaptive local linear optimized radial basis functional neural network model for financial time series prediction.
[22] Gan M., Peng H., Chen L. A global-local optimization approach to parameter estimation of RBF-type models.
[23] Cheng Y., Wang L., Hu J. Identification of quasi-ARX neurofuzzy model with an SVR and GA approach.
[24] Cheng Y.
[25] Cheng Y., Wang L., Hu J. Quasi-ARX wavelet network for SVR based nonlinear system identification.
[26] Åkesson B. M., Toivonen H. T. State-dependent parameter modelling and identification of stochastic nonlinear sampled-data systems.
[27] Hu B.-G., Qu H.-B., Wang Y., Yang S.-H. A generalized-constraint neural network model: associating partially known relationships for nonlinear regressions.
[28] Schölkopf B., Smola A. J.
[29] Panchapakesan C., Palaniswami M., Ralph D., Manzie C. Effects of moving the centers in an RBF network.
[30] González J., Rojas I., Pomares H., Ortega J., Prieto A. A new clustering technique for function approximation.
[31] Guillén A., Pomares H., Rojas I., González J., Herrera L. J., Rojas F., Valenzuela O. Studying possibility in a clustering algorithm for RBFNN design for function approximation.
[32] Frey B. J., Dueck D. Clustering by passing messages between data points.
[33] Chang C., Lin C. LIBSVM: a library for support vector machines. http://www.csie.ntu.edu.tw/~cjlin/libsvm. doi:10.1145/1961189.1961199.
[34] Oussar Y., Dreyfus G. Initialization by selection for wavelet network training.
[35] Bodyanskiy Y., Vynokurova O. Hybrid adaptive wavelet-neuro-fuzzy system for chaotic time series identification.