This paper considers parameter estimation for linear time-invariant (LTI) systems in an input-output setting with an output error (OE) time-delay model structure. Missing data are commonly encountered in industry because of irregular sampling, sensor failure, data deletion during preprocessing, network transmission faults, and so forth. To identify LTI systems with time-delay from incomplete data, the generalized expectation-maximization (GEM) algorithm is adopted to estimate the model parameters and the time-delay simultaneously. Numerical examples are provided to demonstrate the effectiveness of the proposed method.
Advanced process control theories have developed rapidly over the past several decades to meet the growing demands on closed-loop system performance, such as improved process safety and efficiency of plant operation, consistent product quality, and economic optimization [
Typically, the process data used in process modeling are generated by performing an identification experiment, in which a test signal is designed and used to excite the process. Most conventional parameter estimation methods, such as the prediction error method (PEM), the instrumental variable (IV) method, and the subspace method, assume that the identification data are sampled regularly and recorded properly. However, this is not always true in industrial practice. For example, in the development of an inferential model for the sulfur content of the gas oil product, the sulfur concentration cannot be measured directly, and a time-consuming lab analysis is required. The process variables can be sampled every minute, but the sulfur concentration is available only every twelve hours. Another example is an industrial process with data transmission over a network, where the recorded process data are corrupted by network-induced problems such as transmission delay and packet dropout or loss. Despite this, parameter estimation with irregular data has not been extensively investigated in the literature.
Time-delays are commonly encountered in various engineering systems, such as chemical processes, mechanical systems, network control systems, transmission lines, and economic systems [
The missing-data problem is very common in the process industry. A special case is the irregularly sampled system. Many critical parameters, such as the product concentration, steam quality, and
The work presented in this paper addresses the identification of LTI systems with missing output data in the presence of time-delay. The identification problem is formulated within the framework of the generalized expectation-maximization (GEM) algorithm, so that the time-delay and the missing output data are handled simultaneously. The GEM algorithm consists of an expectation step (E-step) and a maximization step (M-step). In the M-step, the maximization problem is transformed into an equivalent minimization problem, which is solved with a general numerical optimization algorithm.
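The "generalized" part of GEM means that the M-step only has to improve the objective rather than maximize it exactly. A minimal sketch of such a step, written as a cost decrease on the equivalent minimization problem, is given here. The function name `generalized_m_step`, the finite-difference gradient, and the step-halving rule are illustrative assumptions and are not the numerical optimizer used in the paper.

```python
def generalized_m_step(cost, theta, step=1.0, eps=1e-6, shrink=0.5, max_halvings=30):
    """Return parameters whose cost is strictly lower than cost(theta), if found.

    cost  : callable mapping a list of floats to a float (hypothetical objective,
            standing in for the negative Q-function)
    theta : current parameter estimate as a list of floats
    """
    base = cost(theta)

    # Forward-difference approximation of the gradient at theta.
    grad = []
    for i in range(len(theta)):
        bumped = list(theta)
        bumped[i] += eps
        grad.append((cost(bumped) - base) / eps)

    # Halve the step until the candidate improves the cost.
    for _ in range(max_halvings):
        candidate = [t - step * g for t, g in zip(theta, grad)]
        if cost(candidate) < base:
            return candidate
        step *= shrink

    # No improving step found: keep the current estimate unchanged.
    return list(theta)


def example_cost(theta):
    # Hypothetical quadratic cost, used only to exercise the function.
    return (theta[0] - 1.0) ** 2 + 2.0 * (theta[1] + 2.0) ** 2
```

Because a candidate is accepted only when it lowers the cost, repeated calls give a non-increasing cost sequence, which is the property a GEM iteration relies on.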
The rest of this paper is organized as follows. The problem statement is presented in Section
Consider the LTI system described by the following output error (OE) time-delay model:
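In commonly used notation, an OE model with an input delay can be written as follows. The symbols are assumptions made for illustration: $u_t$ is the input, $x_t$ the noise-free output, $y_t$ the measured output, $d$ the time-delay in samples, $q^{-1}$ the backward shift operator, and $e_t$ zero-mean Gaussian white noise with variance $\sigma^2$.

```latex
\begin{aligned}
x_t &= \frac{B(q^{-1})}{A(q^{-1})}\, u_{t-d}, \\
y_t &= x_t + e_t, \qquad e_t \sim \mathcal{N}(0, \sigma^2), \\
A(q^{-1}) &= 1 + a_1 q^{-1} + \dots + a_{n_a} q^{-n_a}, \\
B(q^{-1}) &= b_1 q^{-1} + \dots + b_{n_b} q^{-n_b}.
\end{aligned}
```

In an OE structure the noise enters only at the output, so $x_t$ depends on the input, the coefficients of $A$ and $B$, and the delay $d$, but not on past measured outputs.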
The identification data
The GEM algorithm is a general-purpose iterative optimization algorithm for deriving the maximum likelihood (ML) estimate, and it has attracted considerable attention from researchers because of its flexibility in handling missing data or hidden states [
E-step: given the
M-step: find the
The E-step and M-step alternate until the relative change of the parameter estimate between successive iterations is smaller than a prespecified, arbitrarily small constant or the maximum number of iterations is reached.
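The alternation and its two stopping rules can be sketched as below. The names `run_gem` and `relative_change`, the Euclidean norm, and the default tolerance are assumptions made for illustration. The toy E-step and M-step estimate the mean of a sample with two missing values and stand in for the model-specific steps.

```python
def relative_change(new, old, floor=1e-12):
    """Euclidean norm of (new - old) divided by the norm of old."""
    num = sum((n - o) ** 2 for n, o in zip(new, old)) ** 0.5
    den = max(sum(o ** 2 for o in old) ** 0.5, floor)  # guard against division by zero
    return num / den


def run_gem(e_step, m_step, theta0, tol=1e-8, max_iter=100):
    """Alternate E-step and M-step until the estimate settles or max_iter is hit."""
    theta = list(theta0)
    for iteration in range(1, max_iter + 1):
        stats = e_step(theta)            # expected statistics under current estimate
        new_theta = list(m_step(stats))  # updated estimate
        change = relative_change(new_theta, theta)
        theta = new_theta
        if change < tol:
            break
    return theta, iteration


# Toy problem (hypothetical): three observed values and two missing ones.
observed = [1.0, 2.0, 3.0]
n_missing = 2


def toy_e_step(theta):
    # Expected sum of all samples: missing values are replaced by the current mean.
    return sum(observed) + n_missing * theta[0]


def toy_m_step(expected_sum):
    # Mean of the completed data set.
    return [expected_sum / (len(observed) + n_missing)]


estimate, n_iter = run_gem(toy_e_step, toy_m_step, [0.0])
```

In this toy case the iteration converges to the mean of the observed values, since the missing samples carry no extra information.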
Here, we treat the time-delay
Based on the Bayesian property, the likelihood function of the complete data set can be decomposed into
The term
Since the time-delay
Therefore, the conditional expectation of the log complete data density
The expectation is first taken with respect to the discrete variable
The expectation is then taken with respect to the continuous variable
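Taking the expectation over a discrete delay amounts to weighting each candidate delay by its posterior probability given the observed outputs. The sketch below assumes a first-order OE model, a uniform prior over the candidate delays, Gaussian noise, and missing outputs marked as `None`. The function names and the model order are illustrative assumptions, not the paper's notation.

```python
import math


def noise_free_output(u, a, b, delay):
    """Simulate x_t = -a * x_{t-1} + b * u_{t-delay} with zero initial conditions."""
    x = []
    prev = 0.0
    for t in range(len(u)):
        u_delayed = u[t - delay] if t >= delay else 0.0
        prev = -a * prev + b * u_delayed
        x.append(prev)
    return x


def delay_posterior(u, y, a, b, noise_var, delays):
    """Posterior probability of each candidate delay, using observed outputs only."""
    log_weights = []
    for d in delays:
        x = noise_free_output(u, a, b, d)
        # Sum of squared errors over the time instants where y was recorded.
        sse = sum((yt - xt) ** 2 for yt, xt in zip(y, x) if yt is not None)
        # Gaussian log-likelihood up to a constant that is the same for every delay.
        log_weights.append(-0.5 * sse / noise_var)

    # Subtract the maximum before exponentiating for numerical stability.
    top = max(log_weights)
    weights = [math.exp(lw - top) for lw in log_weights]
    total = sum(weights)
    return [w / total for w in weights]
```

The returned probabilities can then serve as weights when averaging the delay-dependent terms of the objective.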
In order to calculate
Therefore, the
In the M-step of the GEM algorithm, the unknown parameters should be estimated to increase the
Here, we introduce the variable
The time-delay
Consider the following LTI time-delay system described by the OE time-delay model:
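A data set of this kind, with a chosen proportion of missing outputs, could be generated along the following lines. The sketch assumes a first-order OE model and a white Gaussian input. The function name `generate_data` and any parameter values passed to it are hypothetical, chosen for illustration and not taken from the paper.

```python
import random


def generate_data(n, a, b, delay, noise_var, missing_frac, seed=0):
    """Simulate y_t = x_t + e_t with x_t = -a * x_{t-1} + b * u_{t-delay}.

    A fraction `missing_frac` of the outputs is replaced by None at random
    time instants to mimic missing measurements.
    """
    rng = random.Random(seed)
    u = [rng.gauss(0.0, 1.0) for _ in range(n)]  # white Gaussian test input

    y = []
    x_prev = 0.0
    for t in range(n):
        u_delayed = u[t - delay] if t >= delay else 0.0
        x = -a * x_prev + b * u_delayed          # noise-free output
        y.append(x + rng.gauss(0.0, noise_var ** 0.5))
        x_prev = x

    # Remove outputs at randomly selected time instants.
    n_missing = int(round(missing_frac * n))
    for t in rng.sample(range(n), n_missing):
        y[t] = None
    return u, y


# Hypothetical settings for illustration only.
u, y = generate_data(n=200, a=-0.7, b=0.3, delay=3, noise_var=0.01, missing_frac=0.25)
```

Only the output sequence contains gaps; the input is kept complete, matching the setting of missing output data.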
Estimated parameters after 13 iterations.
True value
Proportion of missing output
Full data set  −0.695  0.3026  3  0.0114
−0.694  0.3042  3  0.0116
−0.7012  0.2994  3  0.0121
−0.6979  0.3017  3  0.0124
The input and output data.
The estimated model parameters in each iteration.
The estimated noise variance in each iteration.
The continuous stirred tank reactor (CSTR) is a benchmark example used to test the performance of different modeling and control algorithms, and the first-principles model of the CSTR is described as [
The input and output data.
The self-validation results. The blue line is the real process data and the red line is the simulated output of the estimated model.
The cross-validation results. The blue line is the real process data and the red line is the simulated output of the estimated model.
This paper considers the identification of LTI systems from irregular data sets. Time-delay and missing data are commonly encountered in the process industry, and their presence makes process modeling a challenging task. The identification problem with an incomplete data set in the presence of time-delay is formulated within the framework of the GEM algorithm, in which the model parameters and the time-delay are estimated simultaneously. Numerical examples are presented to demonstrate the efficacy of the proposed method.
The authors declare that there is no conflict of interests regarding the publication of this paper.