Flexible Least Squares Algorithm for Switching Models

The self-organizing model and the expectation-maximization method are two traditional identification methods for switching models. They interactively update the parameters and model identities based on offline algorithms. In this paper, we propose a flexible recursive least squares algorithm whose cost function is constructed from two kinds of errors: the estimation errors between two neighboring parameter vectors and the output estimation errors. Such an algorithm has several advantages over the two traditional identification algorithms: it (1) can estimate the parameters of all the sub-models without prior knowledge of the model identities; (2) requires less computational effort; and (3) can update the parameters with newly arrived data. Convergence properties and simulation examples are provided to illustrate the efficiency of the algorithm.


Introduction
The least squares (LS) algorithm is the most widely used method in parameter estimation [1][2][3]. It defines a cost function composed of the errors between the true outputs and the predicted outputs. Then, the estimates can be obtained by setting the derivative of the cost function to zero. The LS algorithm has fast convergence rates, but at the cost of heavy computational effort [4,5]. In addition, the LS algorithm needs to compute the inverse of a matrix; if the matrix has a high order or is ill-conditioned, the LS algorithm is inefficient [6][7][8][9].
To reduce the computational effort and to avoid the matrix inversion, the recursive least squares (RLS) algorithm is a good choice. The basic idea of the RLS algorithm is to update the parameter estimates using the newly arrived data; that is, the cost function of the RLS algorithm is composed of only one set of data rather than all the collected data [10][11][12]. Therefore, the RLS algorithm requires less computational effort and does not need to calculate a matrix inverse. However, the RLS algorithm has slower convergence rates than the LS algorithm [13,14]. To increase the convergence rates, many modified RLS algorithms have been developed, e.g., the multi-innovation RLS algorithm [15,16] and the hierarchical RLS algorithm [17,18].
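For reference, the standard RLS recursion described above can be sketched as follows; the model, parameter values, and variable names are ours, for illustration only:

```python
import numpy as np

def rls_update(theta, P, phi, y):
    """One standard RLS step (forgetting factor omitted):
    update the estimate theta and covariance P with regressor phi and output y."""
    e = y - phi @ theta                       # prediction error
    K = (P @ phi) / (1.0 + phi @ P @ phi)     # gain vector
    theta = theta + K * e
    P = P - np.outer(K, phi @ P)              # rank-one covariance update, no inverse
    return theta, P

rng = np.random.default_rng(0)
theta_true = np.array([0.5, -0.3])            # hypothetical single-model parameters
theta, P = np.zeros(2), 1e3 * np.eye(2)
for _ in range(200):
    phi = rng.standard_normal(2)
    theta, P = rls_update(theta, P, phi, phi @ theta_true)  # noiseless data
```

Note that each step costs only a few vector products; no matrix inversion appears, which is exactly the advantage over batch LS discussed above.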
Although the RLS algorithm and its modified counterparts can identify systems with less computational effort and fast convergence rates, they assume that the considered model is a single model. If the system is described by a switching model, those algorithms are inefficient. Switching models are widely used in engineering practice [19,20]. Such models have several modes with different dynamical properties, and the modes are associated with various operating conditions [21,22]. The difficulty in switching system identification is that the switching times of the operating points (model identities) may be unknown. To identify a switching model, one should first determine the operating points/model identities.
The self-organizing model (SOM) method introduces several cost functions, one for each sub-model, in each iteration, and the smallest cost function is associated with the true model at that sampling instant [23]. The expectation-maximization (EM) algorithm regards the model identities as hidden variables and updates these identities in the E-step; once the identity estimates are obtained, the parameter estimates are computed in the M-step. These two steps run interactively until both kinds of estimates converge to their true values [27,28]. The SOM method must compute several cost functions in each iteration, and the EM algorithm needs to compute the model identities first. In addition, both algorithms are offline algorithms; that is, they require heavy computational effort and cannot update the parameters based on newly arrived data. The flexible least squares (FLS) algorithm, first developed by Kalaba and Tesfatsion [29], is used for time-varying system identification. Its cost function contains two parts: one is the error between the parameters at two neighboring instants, and the other is the error between the true outputs and the predicted outputs [30][31][32]. Due to the first error term, the parameter estimates can track the varying parameters. Inspired by the FLS algorithm, we develop a novel FLS algorithm for switching models.
This algorithm is termed the flexible recursive least squares (FRLS) algorithm. Compared with the SOM, EM, and FLS algorithms, it has the following advantages: (1) the FRLS algorithm is an online algorithm, and thus it can update the parameters with newly arrived data; (2) the FRLS algorithm requires less computational effort; and (3) the FRLS algorithm can estimate the parameters of all the sub-models without prior knowledge of the model identities.
The remainder of the paper is organized as follows. Section 2 explains the switching model and the traditional identification algorithms. Section 3 proposes the offline FLS algorithm and the online FLS algorithm. Section 4 provides several simulation examples. Finally, Section 5 summarizes the paper and gives some future directions.

Problem Statement
Let us first define some notation: I denotes an identity matrix of appropriate size; the superscript T stands for the matrix transpose; the norm of a matrix X is defined as ‖X‖ = (λ_max[XX^T])^{1/2}, where λ_max[XX^T] denotes the maximum eigenvalue of the matrix XX^T; and the norm of a vector z = [z_1, z_2, …, z_n]^T ∈ R^n is defined as ‖z‖ = (Σ_{i=1}^{n} z_i^2)^{1/2}.

Switching Model.
Consider the following switching model:

y_i(t) = φ_i^T(t)ϑ_i + v_i(t), i = 1, 2, …, N, (1)

where y_i(t) is the output of the i-th model; φ_i(t) ∈ R^{m_i} is the information vector of the i-th model, composed of the input and output data before sampling instant t; ϑ_i is the parameter vector of the i-th model; v_i(t) is a Gaussian white noise satisfying v_i(t) ∼ N(0, σ_i^2); and N is the number of sub-models.
At sampling instant t, the identity of the active sub-model is unknown. We aim to estimate the parameter vectors ϑ_i, i = 1, …, N, from the collected data.
Collect L sets of input and output data and define the following cost function:

J(ϑ_1, ϑ_2, …, ϑ_N) = Σ_{t=1}^{L} Σ_{i=1}^{N} w_i(t)[y(t) − φ_i^T(t)ϑ_i]^2,

where w_i(t) is the identity of the i-th sub-model at sampling instant t. For example, if at sampling instant t the true model is the s-th sub-model, then the true values of the identities are w_1(t) = 0, w_2(t) = 0, …, w_{s−1}(t) = 0, w_s(t) = 1, w_{s+1}(t) = 0, …, w_N(t) = 0. To estimate the parameters, the following assumptions are introduced.
Assumption 1. The number of collected data points is larger than the number of unknown parameters, that is, L > Σ_{i=1}^{N} m_i. In addition, if the number of data points belonging to the i-th sub-model is L_i, then Σ_{i=1}^{N} L_i = L and L_i > m_i, i = 1, …, N.
Assumption 2. For the switching model proposed in (1), all the input data are persistently exciting.
Assumption 3. All the sub-models have the same information vector but different parameter vectors; that is, the switching model can be written as y(t) = φ^T(t)ϑ_i + v_i(t), with a common information vector φ(t) ∈ R^m.
Remark 1. Assumptions 1 and 2 ensure that the information matrices of all the sub-models are nonsingular [6]. Assumption 3 can also be satisfied easily [33]. For example, for a switching model with unknown structure, we can use the kernel method to describe the model, and all the sub-models approximated by the kernel method then share the same structure.
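To make the cost function and assumptions above concrete, here is a minimal sketch that evaluates the identity-weighted cost on hypothetical two-mode data; the parameter values and data are ours, for illustration only:

```python
import numpy as np

def switching_cost(Y, Phi, W, Theta):
    """J = sum_t sum_i W[t, i] * (y(t) - phi(t)^T theta_i)^2, with W[t, i] the
    identity of the i-th sub-model at instant t and Theta[i] its parameter vector."""
    J = 0.0
    for t in range(len(Y)):
        for i in range(W.shape[1]):
            e = Y[t] - Phi[t] @ Theta[i]
            J += W[t, i] * e ** 2
    return J

rng = np.random.default_rng(0)
Theta = np.array([[0.5, -0.3], [-0.8, 0.6]])      # hypothetical sub-model parameters
Phi = rng.standard_normal((100, 2))               # shared information vectors (Assumption 3)
Y = np.concatenate([Phi[:50] @ Theta[0], Phi[50:] @ Theta[1]])  # noiseless, switch at t = 51
W = np.zeros((100, 2))
W[:50, 0] = 1.0
W[50:, 1] = 1.0                                   # true 0/1 identities
```

With the true identities and true parameters, the cost is zero; with the identities swapped, it is large — this is the gap the SOM and EM steps below exploit.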

Traditional Identification Algorithms.
Rewrite the cost function of the switching model accordingly. Assume that the parameter estimates and identity estimates obtained in iteration k − 1 are ϑ_i^{k−1} and w_i^{k−1}(t). Both the SOM and EM algorithms estimate the parameters through two steps: (1) estimate the model identities w_i^k(t), i = 1, 2, …, N, t = 1, 2, …, L, based on the parameter estimates ϑ_i^{k−1}; (2) update the parameter estimates ϑ_i^k based on the identity estimates w_i^k(t). The difference between the SOM and EM algorithms lies in the first step. In the SOM algorithm, the model identity estimate w_i^k(t) is 1 or 0: in iteration k at sampling instant t, let s be the index of the sub-model with the smallest squared output error; then w_s^k(t) = 1 and the other identity estimates are 0. In the EM algorithm, by contrast, the identity estimate w_j^k(t) is a soft weight computed from the residuals of all the sub-models.
Remark 2. Both the SOM and EM algorithms are offline algorithms; if the order of the system is large, their computational efforts are heavy. In addition, they cannot update the parameters with newly arrived data [23,34].
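The two identity-update rules just described can be sketched as follows: SOM assigns a hard 0/1 identity to the sub-model with the smallest squared residual, while the EM E-step computes soft Gaussian responsibilities. The data, parameter values, and noise variance are hypothetical:

```python
import numpy as np

def som_identities(Y, Phi, Theta):
    """SOM step: hard 0/1 identities -- pick, at each instant, the sub-model
    with the smallest squared output error."""
    E = (Y[:, None] - Phi @ Theta.T) ** 2     # squared residuals, shape (L, N)
    W = np.zeros_like(E)
    W[np.arange(len(Y)), E.argmin(axis=1)] = 1.0
    return W

def em_identities(Y, Phi, Theta, sigma2):
    """EM E-step: soft identities -- Gaussian responsibilities of each
    sub-model, normalized over the N sub-models."""
    E = (Y[:, None] - Phi @ Theta.T) ** 2
    W = np.exp(-E / (2.0 * sigma2))
    return W / W.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
Theta = np.array([[0.5, -0.3], [-0.8, 0.6]])  # hypothetical parameter vectors
Phi = rng.standard_normal((10, 2))
Y = Phi @ Theta[0]                            # all data from sub-model 1, noiseless
W_hard = som_identities(Y, Phi, Theta)
W_soft = em_identities(Y, Phi, Theta, sigma2=0.1)
```

In a full iteration, either W would then be fed to a weighted least-squares update of the parameters; the hard/soft distinction in this first step is the only structural difference between the two algorithms.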

Flexible Recursive Least Squares Algorithm
The SOM and EM algorithms update the parameters through two steps, and these two steps are related to each other. If one kind of estimate has poor accuracy, the other may also be poor or even divergent. In this section, we use the FLS algorithm for switching models, which can estimate the parameters without prior knowledge of the model identities.

Offline FLS Algorithm. Define the stacked parameter vector ϑ = [ϑ^T(1), ϑ^T(2), …, ϑ^T(L)]^T and the corresponding stacked information matrix.
Then, the switching model can be written in stacked form. Unlike the SOM and EM algorithms, the cost function of the offline FLS (O-FLS) algorithm penalizes two kinds of errors: the output estimation errors and the errors between the parameter estimates at neighboring sampling instants. Using the FLS algorithm to update the parameters yields the O-FLS estimate.
Remark 3. From equation (5), the O-FLS estimate requires the inversion of an (Lm)-order matrix. When this high-order matrix is singular or ill-conditioned, computing its inverse is impossible.
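Since the O-FLS equations are not reproduced above, the following is a sketch of one plausible batch formulation consistent with the description: minimize the μ-weighted output errors plus the neighboring parameter-change errors jointly over ϑ(1), …, ϑ(L). The normal equations form an (Lm)-order block-tridiagonal system, assembled densely here for clarity; the weighting convention, parameter values, and data are our assumptions:

```python
import numpy as np

def ofls_estimate(Y, Phi, mu):
    """O-FLS sketch: minimize  mu * sum_t (y(t) - phi(t)^T theta(t))^2
    + sum_{t>1} ||theta(t) - theta(t-1)||^2  jointly over theta(1..L).
    The stationarity conditions give an (L*m)-order block-tridiagonal system."""
    L, m = Phi.shape
    A = np.zeros((L * m, L * m))
    b = np.zeros(L * m)
    I = np.eye(m)
    for t in range(L):
        s = slice(t * m, (t + 1) * m)
        A[s, s] += mu * np.outer(Phi[t], Phi[t])   # output-error block
        b[s] += mu * Phi[t] * Y[t]
        if t > 0:                                   # neighboring-parameter penalty
            p = slice((t - 1) * m, t * m)
            A[s, s] += I
            A[p, p] += I
            A[s, p] -= I
            A[p, s] -= I
    return np.linalg.solve(A, b).reshape(L, m)

rng = np.random.default_rng(0)
theta1, theta2 = np.array([0.5, -0.3]), np.array([-0.8, 0.6])   # hypothetical
Phi = rng.standard_normal((200, 2))
Y = np.concatenate([Phi[:100] @ theta1, Phi[100:] @ theta2])    # switch at t = 101
est = ofls_estimate(Y, Phi, mu=5.0)
```

The estimate tracks the piecewise-constant parameters, smoothing only near the switch; the price is solving an (Lm)-order system, which motivates the recursive version below in the paper.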

Flexible Recursive Least Squares Algorithm.
To reduce the computational effort and to avoid the high-order matrix inversion, this section proposes an online FLS algorithm, termed the flexible recursive least squares (FRLS) algorithm.
Assume that the parameter vector estimated at sampling instant t − 1 is ϑ(t − 1). Define the following cost function, which combines the neighboring parameter estimation error and the output estimation error. At sampling instant t, all the parameter estimates before t have already been obtained, and thus (15) simplifies to a cost in ϑ(t) alone. Taking the derivative of J(ϑ(t)) with respect to ϑ(t) and setting it to zero yields the normal equation for ϑ(t). Next, we use the recursive method to obtain the relationship between ϑ(t) and ϑ(t − 1).
Equation (17) is then transformed into recursive form, and the FRLS algorithm can be summarized as follows.
Remark 5. Compared with the O-FLS algorithm, the FRLS algorithm performs a low-order (m-order) matrix inversion rather than a high-order (Lm-order) matrix inversion. Therefore, the FRLS algorithm requires less computational effort than the O-FLS algorithm (Algorithm 1). The steps of the FRLS algorithm are listed as follows.
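Since the recursive equations (10)–(12) are not reproduced here, the following is a minimal sketch of one FRLS update under our reading of the cost: at instant t it minimizes ‖ϑ(t) − ϑ(t − 1)‖² + μ[y(t) − φ^T(t)ϑ(t)]². Placing the weight μ on the output-error term is our assumption (it matches the behavior described in Remark 7 below); by the matrix inversion lemma, the m-order inversion collapses to a scalar division:

```python
import numpy as np

def frls_step(theta_prev, phi, y, mu):
    """One FRLS update: minimize ||theta - theta_prev||^2 + mu*(y - phi^T theta)^2.
    The solution is theta = (I + mu*phi*phi^T)^{-1} (theta_prev + mu*phi*y);
    the matrix inversion lemma reduces the m-order inverse to a scalar division."""
    e = y - phi @ theta_prev                  # output prediction error
    return theta_prev + mu * phi * e / (1.0 + mu * phi @ phi)

# sanity check against the direct normal-equation solve (mu = 1, y = 1, theta_prev = 0)
phi = np.array([1.0, 0.0])
direct = np.linalg.solve(np.eye(2) + np.outer(phi, phi), phi)
step = frls_step(np.zeros(2), phi, y=1.0, mu=1.0)
```

Each step costs O(m) operations, which is where the claimed computational saving over the O-FLS batch solve comes from.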
In the FRLS algorithm, there still exists a dense matrix inversion, which leads to heavy computational effort. To further reduce the computational effort, the following lemma is introduced.

Lemma 1.
For the matrices A ∈ R^{n×n}, B ∈ R^{n×r}, and C ∈ R^{r×n}, if the matrices A and (I + CA^{−1}B) are nonsingular, the following equality holds:

(A + BC)^{−1} = A^{−1} − A^{−1}B(I + CA^{−1}B)^{−1}CA^{−1}.

Proof. Multiplying the right-hand side by (A + BC) gives

(A + BC)[A^{−1} − A^{−1}B(I + CA^{−1}B)^{−1}CA^{−1}]
= I + BCA^{−1} − B(I + CA^{−1}B)(I + CA^{−1}B)^{−1}CA^{−1}
= I + BCA^{−1} − BCA^{−1} = I.

Then, the proof is completed. According to Lemma 1, the matrix Q(t) is simplified accordingly.
Remark 6. Based on Lemma 1, in each sampling instant, a dense matrix inversion is transformed into vector multiplications. Therefore, the computational efforts are reduced.
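Lemma 1 is the standard matrix inversion lemma; a quick numerical check of the identity with random matrices (the dimensions and seed are arbitrary):

```python
import numpy as np

# Numerical check of Lemma 1:
# (A + BC)^{-1} = A^{-1} - A^{-1} B (I + C A^{-1} B)^{-1} C A^{-1}
rng = np.random.default_rng(0)
n, r = 5, 2
A = np.eye(n) + 0.1 * rng.standard_normal((n, n))  # nonsingular, close to identity
B = rng.standard_normal((n, r))
C = rng.standard_normal((r, n))

Ainv = np.linalg.inv(A)
lhs = np.linalg.inv(A + B @ C)                     # direct n-order inverse
rhs = Ainv - Ainv @ B @ np.linalg.inv(np.eye(r) + C @ Ainv @ B) @ C @ Ainv
```

The point of the lemma for FRLS is the r = 1 case: the r-order inverse on the right-hand side becomes a plain scalar division, so the per-step update needs no matrix inversion at all.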

Convergence Properties of the Two Kinds of FLS Algorithms
The convergence properties of the O-FLS and FRLS algorithms are given in this section to help researchers follow these two algorithms.

Convergence Property of the O-FLS Algorithm
where V(L) is a Gaussian white noise vector independent of Φ(L), and the above equation can be rewritten accordingly. Since Φ(L) ∈ R^{Lm×L}, the matrix Φ(L)Φ^T(L) is singular, and the matrix Ω cannot be a zero matrix. Therefore, the O-FLS algorithm is a biased algorithm.

Remark 7.
A small μ yields more accurate parameter estimates. However, a small μ may lead to slow convergence between two neighboring sub-models. Therefore, we should assign different values to μ: within a fixed interval, a small μ is better, while near the switching points, a larger one is better.

Convergence Property of the FRLS Algorithm
Theorem 2. For the switching model proposed in (1), the parameter estimations ϑ(t) updated by the FRLS algorithm are expressed by (10)–(12). Then, the sequence ϑ(t) is convergent.

Assume that the data from 1 to L_1, with L_1 ≫ m, belong to model 1. Subtracting the true value ϑ_1 from both sides of the above equation yields the parameter error recursion, whose driving terms vanish as the data length grows. Therefore, the FRLS algorithm is convergent.

Remark 8.
The FRLS algorithm assumes that the identities of the data remain unchanged over a fixed interval. If the identities change continually, the FRLS algorithm is divergent.

Example 1.
Consider a switching model composed of two sub-models, Sub-model 1 and Sub-model 2.

In the simulation, we collect 500 sets of input and output data, where the data from t = 1 to 250 belong to model 1 and those from t = 251 to 500 belong to model 2.
Use the FRLS algorithm for this switching model. The parameter estimations are shown in Figures 1 and 2. The predicted outputs, the true outputs, and their errors are shown in Figure 3. In addition, apply the EM and SOM algorithms to the switching model, where the initial identities for each sub-model are ω_j^0(t) = 1/2, j = 1, 2 and t = 1, 2, …, 500. The estimation errors and elapsed times of the three algorithms are shown in Table 1.
From this simulation, we can draw the following findings: (1) the parameter estimations obtained by the FRLS algorithm asymptotically converge to the true values (see Figures 1 and 2); (2) the predicted outputs obtained by the FRLS algorithm can track the true outputs (see Figure 3); (3) the number of data points in a fixed interval must be larger than the number of unknown parameters; (4) the FRLS, EM, and SOM algorithms are all effective for the switching model, but the FRLS algorithm has the smallest elapsed time, that is, the least computational effort among the three algorithms, as shown in Table 1.
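The sub-model equations of Example 1 are not reproduced above, so the following end-to-end sketch uses hypothetical two-dimensional parameter vectors; it mimics the reported setup (500 samples, switch at t = 251) and checks the segment-wise estimation errors. The FRLS step and the μ convention follow our earlier reconstruction of the update, not the paper's exact listing:

```python
import numpy as np

def frls_step(theta_prev, phi, y, mu):
    """FRLS update (our reconstruction); mu weights the output error."""
    e = y - phi @ theta_prev
    return theta_prev + mu * phi * e / (1.0 + mu * phi @ phi)

rng = np.random.default_rng(0)
theta_true = [np.array([0.5, -0.3]), np.array([-0.8, 0.6])]  # hypothetical sub-models
sigma = 0.02                                                 # noise standard deviation
theta = np.zeros(2)
est = np.zeros((500, 2))
for t in range(500):
    phi = rng.standard_normal(2)               # persistently exciting regressor
    y = phi @ theta_true[t // 250] + sigma * rng.standard_normal()
    theta = frls_step(theta, phi, y, mu=20.0)  # large mu => fast tracking
    est[t] = theta

err_seg1 = np.linalg.norm(est[249] - theta_true[0])  # end of segment 1
err_seg2 = np.linalg.norm(est[499] - theta_true[1])  # end of segment 2
```

Consistent with finding (1), the estimate settles near the active sub-model's parameters within each segment and re-converges after the switch, with no identity information supplied.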

Example 2: A Switching Open Channel System
In this section, we consider an open channel system, which is shown in Figure 4. The radius of the channel is R, the length of the channel is x, u(t) is the discharge at the upstream end, y(t) is the discharge at the downstream end, and the slope is β. To keep the discharge y(t) flowing at a fixed speed, we should control u(t). The channel operates under two different slopes, and these two slopes lead to two different dynamics, which should be described by two sub-models [14]: Sub-model 1 for t = 1, 2, …, 1000 and Sub-model 2 for t = 1001, 1002, …, 2000.

We collect 2000 sets of input-output data using Matlab, where the input sequence {u(t)} is persistently exciting. The data from t = 1 to 1000 belong to model 1, and those from t = 1001 to 2000 belong to model 2.
Furthermore, we use the traditional EM and SOM algorithms for the switching open channel system (ω_j^0(t) = 1/2, j = 1, 2 and t = 1, 2, …, 2000). The parameter estimations and their estimation errors are shown in Figures 6 and 7. The elapsed times of the three algorithms are shown in Table 4.
This example shows that (1) the FRLS, EM, and SOM algorithms are all convergent, as shown in Figures 6 and 7; and (2) the FRLS algorithm has the smallest elapsed time among the three algorithms, as shown in Table 4, that is, it has the least computational effort.

Conclusions
An online FLS algorithm, termed the flexible recursive least squares (FRLS) algorithm, is proposed for switching models in this study. Its cost function is composed of the errors between two neighboring parameter estimations and the errors between the true outputs and the predicted outputs. With the help of the neighboring parameter estimation errors, the switching points of the model can be determined, and the parameters of each sub-model can also be obtained. Compared with the SOM and EM algorithms, the FRLS algorithm can estimate the parameters without prior knowledge of the model identities. In addition, the FRLS algorithm is an online algorithm that requires less computational effort and can update the parameters with newly arrived data.
Although the FRLS algorithm has several advantages over the traditional identification algorithms, several challenging issues remain to be considered in the future. For example, if the sub-models switch continually, how should the FRLS algorithm be applied to the switching model? How should a suitable μ be chosen so that the FRLS algorithm converges quickly to the true values? These topics remain open problems.

Data Availability
All data generated or analyzed during this study are included in this article.

Conflicts of Interest
The authors declare that they have no conflicts of interest.