Twin Support Vector Machine Method for Identification of Wiener Models

Twin support vector regression is applied to identify nonlinear Wiener system, consisting of a linear dynamic block in series with static nonlinearity.The linear block is expanded in terms of basis functions, such as Laguerre orKautz filters, and the static nonlinear block is determined using twin support vector machine regression. Simulation of a control valve model and pH neutralization process have been presented to show the features of the proposed algorithm over support vector machine based algorithm.


Introduction
The goal of system identification is to find a model, within a selected class of models, that produces the best predictions of a system's output.In general, one forms a cost function that depends on some norm of the prediction errors and finds the model that minimizes this cost function.Since the model is an approximation to the true system, there is a trade-off between the complexity of the structure of the model and the accuracy of its predictions.In many cases, linear models can be used to produce accurate predictions of a system behavior, particularly, if it is restricted to operating within a narrow region.However, if the model is required to cover a broader operating region, then a nonlinear model may be required [1].Block structured models, cascades of static nonlinearities and dynamic linear systems, are often a good trade-off as they can represent some dynamic nonlinear systems very accurately but are nonetheless quite simple.Common nonlinear models of this type are the Wiener and Hammerstein models [2].Many algorithms have been proposed to identify Wiener models [3][4][5][6][7][8][9][10][11].As one might notice from these researches, the extensive knowledge about linear time invariant (LTI) system representations was applied to the dynamic linear blocks.On the other hand, finding an effective representation for the nonlinearity is an active area of research.Traditionally, the nonlinearity is represented by a polynomial because it is simple and easy to estimate.However, the problem with polynomial approximation is that it cannot deal with many common nonlinearities (saturation, threshold, dead zone, etc.).Better approximation of these nonlinearities can be achieved by spline functions.However, spline functions are defined by a series of knot points which must either be chosen a priori or treated as model parameters and included in the (nonconvex) optimization.Neural networks are another tool to approximate nonlinear functions.Their powerful approximation abilities make them attractive.However, the need to specify the neural network topology in terms of the number of nodes and layers and the need to solve nonconvex optimization complicate their implementation.Recently, support vector machines (SVMs) and least squares support vector machines (LS-SVMs) have demonstrated powerful abilities in approximating linear and nonlinear functions [12,13].In contrast with other approximation methods, SVMs do not require a priori structural information.Furthermore, there are well established methods with guaranteed convergence (ordinary least squares, quadratic programming) for fitting LS-SVMs and SVMs [14].One of the common drawbacks of SVM based algorithms is that they are computationally heavy.Recently, a twin support vector machine (TSVM) regression algorithm has been proposed [15].The formulation of TSVM is very close to classical SVM except that it aims to enclose data points between two parallel planes such that each plane is closer to one class and is as far as possible from the other.Such formulation reduces the computation complexity which makes the TSVM one of the common methods in machine learning.Lately, many extensions of TSVM classifier have been proposed to improve its performance in certain aspects.A TSVM classifier has been extended to put the TSVM to multitask learning [16].A generalized framework of TSVM for learning from labeled and unlabeled data was investigated by [17,18].
Recently, Tötterman and Toivonen [19] have developed a new algorithm to identify Wiener models based on SVM regression, where the linear part is described by a basis filter expansion while the nonlinear part is represented by SVM.In this work, TSVM regression is used to formulate an identification algorithm for Wiener models.Simulation examples are presented to show the virtues of the proposed algorithm over Tötterman's algorithm.The outline of this paper is as follows: TSVM theory is reviewed in Section 2. In Section 3, an algorithm for the identification of Wiener models based on twin support vector machine is proposed.Section 4 presents two illustrative examples to test the proposed algorithm.In Section 5, concluding remarks are given.

Twin Support Vector Machines Regression
Twin support vector regression (TSVR) is obtained by solving the following pair of quadratic programming problems (QPPs): where  1 ,  2 > 0,     2 (, ) +  2 , which determines the  2 -insensitive up bound regressor.The end regressor is computed as the mean of these two functions.The geometric interpretation is given in Figure 1.The objective function of (1) or ( 2) is the sum of the squared distances from the shifted functions  =    1 (, ) +  1 +  1 or  =   2 (, ) +  2 −  2 to the training points.Therefore, minimizing it leads to the function  1 () or  2 ().The constraints require the estimated function  1 () or  2 () to be at a distance of at least  1 or  2 from the training points.That is, the training points should be larger than the function  1 () at least  1 , while they should be smaller than the function  2 () at least  2 .The slack variables  and  are introduced to measure the error whenever the distance is closer than  1 or  2 .The second term of the objective function minimizes the sum error variables, thus attempting to overfit the training points.
To find the solution of ( 1) and ( 2), the dual problems are needed to be derived.The optimization problems ( 1) and ( 2) just described are the primal problems for regression.To formulate the corresponding dual problems, the Lagrangian function  is written.Then,  is minimized with respect to the weight  and slack variables ,  and maximized with respect to the Lagrange multipliers  1 and  2 .By carrying out this optimization,  can be written in terms of Lagrange multipliers  1 and  2 .Finally, substituting the value of  and simplifying with the help of Karush-Kuhn-Tucker (KKT) the following dual problem is obtained: where That is, The solution of (5) requires the inverse of   .Sometimes, matrix may be ill-conditioned.This situation may be avoided by adding a regularization term,  > 0, to   .Here, "" is an identity matrix of suitable dimension.Therefore, ( 5) is reformulated as Similarly, problem ( 2) is considered and its dual is obtained as where  2 =  +  2 .Hence, problem (7) leads to Note that, in the duals ( 3) and ( 7), the inversion of matrix    of size ( + 1) × ( + 1) should be computed [15].Once the vectors  1 and  2 are known from ( 6) and ( 8) the two up and down bound functions are obtained.Then, the estimated regressor is constructed as follows: =     (, ) +   . (9)

Identification of Wiener Models
The Wiener cascade, a linear filter followed by a static nonlinearity as shown in Figure 2, is often used to represent certain higher-order nonlinear systems.In this section, the development in [19] is followed, up to the point where the SVM optimization is introduced (where TSVM is used).The output error Wiener model can be described as where [], [], [] ∈ R are the input and output of the linear block and the measured output signals, respectively, for  = 1, . . ., .The innovation [] is assumed to be white.Identifying output error models is computationally heavier than autoregressive exogenous (ARX) models even for linear systems.Several techniques have been suggested to deal with linear output error (OE) models such as instrumental variables (IV), subspace methods, algorithms based on the Steiglitz-McBride, and using orthonormal filter expansions like Laguerre or Kautz functions [20].In this paper the linear part will be represented by Laguerre filters.The truncated discrete Laguerre expansion of the transfer function ( −1 ) is given by for all  < 1, where the function ( −1 ) is assumed to be strictly proper, analytic in || > 1, and continuous in || ⩾ 1.
Remark 1.Notice that the finite impulse response FIR model can be obtained from (11) when  = 0. Substituting (11) into the first equation of ( 10) results in which can be written as where Applying such expansion to the linear part of Wiener model enables us to rewrite the Wiener model (10) as follows: Remark 2. If the nonlinearity was modeled as a polynomial of order then model ( 14) can be seen as bank of parallel linear filters   's followed by multi-input-single-output (MISO) nonlinear block as shown in Figure 3.
Remark 3. Our contribution is to replace the polynomial used in ( 14) by TSVM which gives

Example 1
To show the features of the proposed algorithm, the simulation example presented in [19] is identified.Consider a valve for fluid flow control described by where () is a pneumatic control signal applied to the valve, () is the position of the valve plug, and  0 () is the fluid flow.The fluid flow measurement () is given by where () consists of independent sequence of normally distributed random numbers with variance 0.0025.Two input-output sequences consisting of 1000 data points were generated as follows.A pseudorandom binary signal fluctuating between −1 and +1 with a basic clock period of seven sampling intervals was produced.Then, in each time interval, the input was multiplied by a uniformly distributed random factor between 0 and 0.4.Finally, a bias 0.5 was added on it to result in a signal having an amplitude between 0.1 and 0.9.This procedure was described and used in [19].The training and testing data sets are shown in Figures 4 and 5, respectively.The hyperparameters (, ,  1 ,  2 ,  1 ,  2 , and ) were selected based on cross-validation method where one parameter was varied and the others were kept fixed.The value of the varying parameter that gives the least root mean square error (RMSE) value is chosen as shown in Tables 1  and 2. The cross-validation process resulted in  1 = 0.04,  2 = 0.04,  1 = 100,  2 = 100, and  = 1.The Laguerre filter pole and order were chosen to be  = 0.1 and  = 7, respectively.Figure 6 shows TSVR based algorithm estimates together with measured output for the last 100 samples of the test data.It is clear from the figure that the algorithm produced accurate result.

Example 2
Consider pH neutralization process consisting of a continuous stirred tank reactor (CSTR) where a strong base (NaOH) reacts with the feed stream, a strong acid (HCl).The process is shown in Figure 7.The process input is the flow rate of the strong base and the process output is the pH of the effluent solution.The acid flow rate as well as the volume of the   tank is assumed to be constant.The identification algorithm described in Section 3 is used to estimate a Wiener model of the simulation system presented in [21].The nominal operating conditions of the system are given in Table 3.The model is highly nonlinear due to the implicit output equation, known as the titration curve.The system was excited with band limited white noise, with zero mean and 0.01 variance, around the nominal value of the base flow rate.The output of the system was corrupted with additive Gaussian white noise with zero mean and standard deviation  = 0.001.
The training and testing data sets are shown in Figures 8 and 9, respectively.The TSVM and SVM hyperparameters and the Laguerre filter pole and order were chosen based on cross-validation method.For TSVM, the cross-validation resulted in  1 = 0.08,  2 = 0.08,  1 = 100,  2 = 100, and  = 1 (see Table 5).In Table 4, the Laguerre filter pole and order were chosen to be  = 0.1 and  = 9, respectively.The SVM hyperparameters were chosen to be  = 0.08,  = 1000, and  = 1 as shown in Table 7 and the Laguerre filter pole and order were selected as  = 0.5 and  = 16 (see Table 6).By comparing the least RMSE (0.1248) and CPU time (21.36 sec.)values of the TSVR algorithm (Table 5) with the RMSE (0.1311) and CPU time (72.58 sec.)values of the SVR algorithm (Table 7), it is clear that TSVM algorithm outperforms the SVM algorithm in terms of accuracy and speed of computation.

Conclusion
In this paper a new algorithm for identification of Wiener systems using TSVR has been derived and used to identify Laguerre filter.The main advantage of the proposed algorithm over the method presented in [19] is that the proposed algorithm results in two smaller quadratic programs which is easier to solve than the quadratic program developed in [19].It was shown in Example 2 that TSVM algorithm is computationally lighter than SVM algorithm.

Figure 2 :
Figure 2: Block diagram of Wiener model.The investigator is assumed to have access to the input, (), and the output, (), but not the intermediate signals, () and ŷ(), or the innovation, ().

Figure 3 :
Figure 3: Wiener model with basis function expansion of linear block.

Table 1 :
TSVR: root mean square errors between true and estimated output in Example 1 for test data set using various filter poles and filter orders.The other hyperparameters were fixed as  1 =  2 = 0.08,  1 =  2 = 10, and  = 1.

Table 2 :
TSVR: root mean square errors between true and estimated output in Example 1 for test data set using various values of hyperparameters  1 =  2 ,  1 =  2 , and .The filter pole and filter order were fixed as  = 0.1,  = 7, respectively.

Table 4 :
TSVR: root mean square errors between true and estimated output in Example 2 for test data set using various filter poles and filter orders.The other hyperparameters were fixed as  1 =  2 = 0.08,  1 =  2 = 100, and  = 1.

Table 5 :
TSVR: root mean square errors between true and estimated output in Example 2 for test data set using various values of hyperparameters  1 =  2 ,  1 =  2 , and .The filter pole and filter order were fixed as  = 0.1,  = 9, respectively.

Table 6 :
SVR: root mean square errors between true and estimated output in Example 2 for test data set using various filter poles and filter orders.The other hyperparameters were fixed as  = 0.08,  = 100, and  = 1.

Table 7 :
SVR: root mean square errors between true and estimated output in Example 2 for test data set using various values of hyperparameters , , and .The filter pole and filter order were fixed as  = 0.5,  = 16, respectively.
simulation examples.The algorithm was able to model the nonlinearity, without requiring any a priori assumptions regarding its structure.The linear model was represented by