Multiple Model Identification for a High Purity Distillation Column Process Based on EM Algorithm

Due to the strong nonlinearity and transition dynamics between different operating points of the high purity distillation column process, it is difficult to use a single model for modeling such a process.Therefore, the multiple model based approach is introduced for modeling the high purity distillation column plant under the framework of the expectation maximization (EM) algorithm. In this paper, autoregressive exogenous (ARX) models are adopted to construct the local models of this chemical process at different operating points, and the EM algorithm is used for identification of local models as well as the probability that each local model takes effect. The global model is obtained by aggregating the local models using an exponential weighting function. Finally, the simulation performed on the high purity distillation column demonstrates the effectiveness of the proposed method.


Introduction
The increasing complexity of the process control systems poses a great challenge for process modeling and parameter identification and control.In practice, most of the industrial processes show a strong nonlinearity and a dynamic nature.Therefore, it is of interest for both academic researchers and industrial practitioners to study how to model these systems and to achieve satisfactory or even optimal control performance.Linear modeling technologies have been quite sophisticated after decades of development; however, due to the inborn nonlinearity of some production processes, the performance of single linear model based controllers or optimizers may be compromised or even unsatisfactory [1].To overcome the limitations imposed by the nonlinearity of the process, various strategies have been developed for nonlinear process modeling.Some of the popular modeling approaches have been proposed and applied in industrial processes, such as artificial neural networks [2,3], support vector regression and its extension least squares support vector regression [4], Gaussian process regression [5], and hybrid methods [6].The obvious feature of these methods is that they do not need any prior knowledge about the system.However, these methods do not always capture the process dynamics well and may lose validity when the system transits among a large range of operation.
The multiple model approach is a good compromise which produces complex models or controllers by piecing a number of simpler subsystems together.Some multiple modeling strategies and applications have been investigated because they are able to represent complex nonlinear processes using the linearization technique [7].The terminology of linear parameter varying (LPV) modeling method, which is featured by its linear structure and varying model parameters, was introduced by Shamma and Athans [8,9].The work on identification of multimodel LPV models can be found in much literature.For example, Zhu and Xu proposed an LPV model based on blended linear models and added weights at the input side [10,11].The identification of LPV models in an input-output setting with Box-Jenkins (BJ) model structure was addressed by Zhao et al. and prediction error method was employed for parameter interpolation [12].Huang et al. studied the work of the multiple LPV model identification using several weighting functions with two scheduling variables [13].The multiple model based recursive least square parameter identification was illustrated by Chen and Liu [14].Jin et al. proposed a multiple model LPV approach for nonlinear systems under the framework of 2 Mathematical Problems in Engineering the expectation maximization (EM) algorithm [15].It is an important step on LPV identification in which the model identity is treated as the missing variable in the EM scheme.Based on the EM algorithm, the local models are synthesized with process data to obtain a global model for approximating a nonlinear process.Deng and Huang extended the identification method to nonlinear parameter varying systems in which the local model takes the form of state space equation.The identification problem is solved by using the particle filter based on EM algorithm [16].The efficiency is illustrated on numerical examples and a pilot-scale hybrid water tank.Chen et al. continued to study the uncertain scheduling variable problem in multiple model based nonlinear identification.Both the linear and the nonlinear dynamics of scheduling variables are taken into account [17].
The distillation column is considered as one of the most important unit operations in chemical engineering [18,19].It becomes a benchmark problem in chemical process control for nonlinear modeling technology.The difficulties in modeling and controlling distillation columns lie in their highly nonlinear characteristics.Modeling of distillation columns may be classified into three groups: fundamental modeling, empirical modeling, and hybrid modeling [18].In this paper, a hybrid solution based on the multiple models is adopted, which combines the two main advantages of linear and nonlinear models.The distillation column model is a nonlinear global model but can be represented as a reunion of linear models, one linear model for each operating point (local model or submodel), where each local model works at different operating point and can be identified through the iteration procedure of the EM algorithm.Finally, the global nonlinear model can be obtained by using the exponential function which takes full account of each operating point.
The remainder of this paper is organized as follows: the high purity distillation column used to test the multiple model method is described in Section 2. Section 3 lays out the mathematical formulation of the multiple model approach in which the ARX model is used as the local model.In Section 4, the identification procedure under the EM algorithm is provided.The simulation on the high purity distillation column with multiple inputs is given in Section 5. Finally, conclusions are given in Section 6.

Description of a High Purity Distillation Column
Distillation is simply defined as a process in which a liquid or vapor mixture of two or more substances is separated into its component fractions of desired purity, by the application and removal of heat [20].The process is based on the fact that the vapor of a boiling mixture will be richer in the components that have lower boiling points.In most cases, the distillation is operated at a continuous steady state.New feed is always being added to the distillation column and products are always being removed.A high purity distillation column is strongly nonlinear, and any realistic study should take this into account [21,22].Distillation column has become a favorite subject in the process control engineering field, including the areas of soft-sensor, process modeling, and control.There are a variety of configurations for distillation columns, and each performs specific types of separations [23].
A typical two-product distillation column on a pilot plant scale is shown in Figure 1.Some important notation is summarized in Table 1.It is a nonlinear model of a distillation column with  − 1 theoretical stages including a reboiler plus a total condenser [24].In the model, the following assumptions are made: (1) two components (binary separation); (2) one feed and two products; (3) constant relative volatility; (4) constant molar flows (same vapor flow on all stages); (5) no vapor holdup; (6) total condenser.

Problem Formulation Using Multiple Model Method
The high purity distillation column as mentioned above is a classic chemical production process; the plant may transit among several operating points.A bank of ARX models is employed to approximate the nonlinear dynamic process around each operating point.Consider the following ARX model [15,17]: where {  ∈ R,  = 1, 2, . . ., } is the output.  is the Gaussian noise with zero mean and variance  added to the ARX model.  ∈ R  is the regressor vector, where {  ∈ R  ,  = 1, 2, . . ., } is the input of the nonlinear system,   and   are the orders of the output and input, and  =   +  ⋅   .Many industrial processes are often operated in certain "orderly" ways to meet different production objectives.Such orderly ways are also referred to as operating trajectory consisting of several predesigned operating points which are defined as the scheduling variable,  = Based on the fact that the observed output   at th sampling time instant directly depends on the regressor vector   as well as the data identity of   , (3) can be further written as Here  , is the normalized weight standing for the probability that the th submodel takes effect at time .An exponential function is adopted as where   is the scheduling variable at the sampling time  and   is defined as the validity width of the th local model which is limited between  min (the lower bound for   ,  = 1, 2, . . ., ) and  max (the upper bound for   ,  = 1, 2, . . ., ).
The missing data set is the random variable  = { 1 ,  2 , . . .,   } denoting the identity of each local model, namely,  mis = {}.The values of input variables, output variables, and the scheduling variable are observed or given a priori, so  obs = {,,, 1: }.The parameters to be estimated from process input and output data under the framework of the EM algorithm are Θ = {  ,   , } =1,2,..., .

The ARX Model Identification under the EM Algorithm
4.1.The EM Algorithm Review.The EM algorithm is an efficient iterative procedure for maximum likelihood estimation in the presence of missing data.The main strength of the EM algorithm lies in its ability to handle missing variables in the identification data set [25].After being first introduced by Raghavan et al., it has found extensive applications in various areas for finding maximum likelihood estimates of parameters in probabilistic models [26].In the EM procedure, both the complete data log likelihood ( obs ,  mis | Θ) and the conditional predictive distribution of missing data ( mis |  obs , Θ old ) are needed to be calculated.There are two steps, namely, the expectation step (E-step) and the maximization step (M-step).The EM algorithm proceeds as follows.
(1) Initialization: give an initial value of the model parameter vector.
(2) E-step: the -function is calculated, given the current parameter Θ estimated from the previous iteration, (3) M-step: maximize the -function with respect to Θ to obtain the new iterative parameter estimate.

Start
Data collection: Choose scheduling variable w(k) and set operating points T.
Set the initial values.
Satisfy the convergence condition? No

Yes
Obtain the optimal estimated parameters.
End (4) Iteration: step 2 and step 3 are repeated until the changes in parameters after each iteration are within a specified tolerance level.
The above steps ensure that the log likelihood function of the observed data increases at every step.So the EM algorithm is guaranteed to converge to some maximum of the likelihood function [15].The convergence performance under general conditions is given in Wu [27].The flowchart of the algorithm is shown in Figure 2.

Formulation of the
Using the Bayesian theory, the joint probability density function such as the first term of ( 7  Similarly, the model identity   can be determined from (5), so the second term of ( 7) can be simplified as  The derivation of the third term in ( 7) is based on the chain rule and the assumption that the scheduling variable   is known a priori and independent of the past observations as well as the model parameters.Based on the fact that most of the chemical plants predesign the operating points, namely, scheduling variable, ( | ,  1: , Θ) is derived as Then we can substitute (8), (9), and ( 10) into (7), and the joint probability density of the likelihood of the full data set can be rewritten as where The expectation of the complete data in the procedure of the EM algorithm is From ( 13) we need to compute the following probability density function: where (  |   , Θ  ) is calculated by (  =  |   ,  1: ,   ) is given by ( 5).
The last probability density function to be calculated denotes the conditional probability that the th data point comes from the th local model and can be derived from the Bayesian rule: 4.3.The Summary of the Identification Procedure.Now we give the identification procedure under the frame of the EM algorithm as follows.
(1) Initialization.Set the initial value of parameters to be estimated.
(3) M-step.Based on the three terms of the function (Θ|Θ old ), only the term as (15) depends on the local model parameter   and the process noise variance  2 .Derivative is taken over this term with respect to   and  2 , respectively.So the new parameter estimation can be calculated as follows: As a result, the new   of the th ARX local model and  2 are equal to For the local model validity width   ,  = 1, 2, . . ., , due to the usage of exponential function, it cannot be calculated from an analytical form directly.The optimization problem searching for the optimal value of   can be expressed as max Here a nonlinear numerical optimization is required to obtain the optimal   between the  min and  max .A constrained nonlinear optimization function named "fmincon" in optimization toolbox of "Matlab" is adopted to search for the optimal   at each of the iteration steps [15].
(4) Iteration.Repeat step (2) and step (3), until the converge condition is satisfied.Here the converge condition is defined as the changes of the estimated value between two iterations which are as small as the level 10 −6 .

A Distillation Column Simulation
Nonlinear system identification is much more complex than linear system identification.In this section, a high purity distillation column is used to test the performance of the multiple model approach based on the EM algorithm.
It has been shown that the purer the products get, the more nonlinear the system becomes [20].A distillation column is also a typical example for a MISO system in which there are strong interactions between the variables.In this paper, choosing LV-configuration is motivated by the fact that the LV-configuration is the most commonly used in industrial practice.
Here, the process input variables available for control purpose are the reflux () and the boil-up rate ().The top product composition () variable is the output.The disturbances to a distillation column can come from many sources.Besides the two manipulated variables, they can come from the feed (feed rate and feed composition), from the pressure inside the column and from the cooling water, and so forth.The distillation column is very sensitive to the feed rate disturbance.The feed rate is assumed to be measurable; therefore, its effect can be incorporated in the EM algorithm.In this paper, the feed rate  is treated as the scheduling variable [23].When the column is operated over a relatively wide operating region instead of at a single set point, it reveals a significant nonlinear behaviour.In the simulation, white noise with a variance of 0.05 is added in the simulated output data for unknown process noise.The observed input and output data for the high purity distillation column are shown in Figures 3 and 4. The scheduling variable is designed with three operating points at 0.8 kmol/min, 1.0 kmol/min, and 1.2 kmol/min, respectively, The top product composition which were given in Figure 5.In order to achieve high identification performance the random binary excitation signals with small magnitude are added in the input variables.The feed rate increases by a fixed step size and no additional excitation signal is added.Then applying the multiple model modeling approach and using the EM algorithm to estimate the parameters of each local model, the simulation results of self-validation tests indicate the efficiency of the proposed method.The comparison and the weighting function curve are given in Figures 6 and 7.
To further test the performance of the identified multiple model, the cross-validation is also carried on the case.The data used for cross-validation as shown in Figures 8 and 9 are generated under different feed rates from Figure 5. Two different operating points are selected in the cross-validation simulation as in Figure 10.It can be seen from Figure 11 that the real process data and the model prediction are in good agreement with each other which effectively confirms the efficiency of the proposed algorithm.

Conclusions
A multiple model modeling algorithm based on EM algorithm is proposed and applied for a high purity distillation

Figure 2 :
Figure 2: The flowchart of the EM algorithm for estimating the parameters.

Figure 6 :
Figure 6: The comparison of the distillation column model.

Table 1 :
Notation of the distillation column in reflux and boil-up construction.