A focused time-lagged recurrent neural network (FTLR NN) with a gamma memory filter is designed to learn the subtle, complex dynamics of a typical continuous stirred tank reactor (CSTR) process. A CSTR exhibits complex nonlinear behaviour when the reaction is exothermic. The literature shows that many authors have attempted process control of CSTRs using neuro-fuzzy systems, but an optimal neural network model for identification of the CSTR process is not yet available. Because the CSTR process includes temporal relationships in its input-output mappings, a time-lagged recurrent neural network is used for identification. The standard backpropagation algorithm with a momentum term is used to train the model. Parameters such as the number of processing elements, the number of hidden layers, the training and testing percentages, the learning rule, and the transfer functions in the hidden and output layers are investigated on the basis of performance measures such as MSE, NMSE, and the correlation coefficient on the testing data set. Finally, the effects of different norms are tested along with variation of the gamma memory filter parameter. It is demonstrated that the dynamic NN model has a remarkable system identification capability for the problems considered in this paper. Thus the FTLR NN with gamma memory filter can be used to learn the underlying highly nonlinear dynamics of the system, which is a major contribution of this paper.

In any manufacturing process that involves a chemical change, a chemical reactor is at the heart of the plant. In size and appearance it may often seem to be one of the least impressive items of equipment, but its demands and performance are usually the most important factors in the design of the whole plant. Depending on the mode of operation, reactors are classified as batchwise or continuous. In batchwise mode, reactors are charged at the beginning of the reaction and products are removed at the end of the reaction. In a continuous stirred tank reactor (CSTR), an agitator is deliberately introduced to disperse the reactants thoroughly into the reaction mixture immediately after they enter the tank. Stirred tank reactors are by their nature well suited to liquid-phase reactions and, by virtue of their large volume, provide a long residence time. This, combined with the isothermal nature of the reactor, permits operation at the optimum temperature for a long reaction time. Thus, stirred tank reactors have been employed on a commercial scale mainly for liquid-phase reaction systems at low or medium pressures. A stirred tank reactor exhibits nonlinear behaviour when the reaction is exothermic. Because of this high degree of nonlinearity, performance prediction is difficult and exact mathematical modeling is not possible. However, with the development of neural networks, it is possible to build a learning machine, based on a neural network model, that learns from available experimental data. Thus a system model can be constructed by estimating unknown plant parameters using neural networks (as discussed elsewhere [

Inspired by the structure of the human brain and the way it is supposed to operate, neural networks are parallel computational systems capable of solving a number of complex problems in areas as diverse as pattern recognition, computer vision, robotics, control, and medical diagnosis, to name just a few (as discussed by Haykin [

The present paper carries out neural network based identification and modeling of a typical continuous stirred tank reactor using a focused time-lagged recurrent neural network (FTLR NN) with gamma memory filter. The optimal model is estimated on the basis of performance measures like MSE (Mean Square Error), NMSE (Normalized Mean Square Error),

A CSTR in the Reaction Engineering laboratory of the College of Engineering & Technology, Akola (Maharashtra, India) is used for experimentation.

It includes a universal motor (4000 rpm, 0.6 A, 220–230 V) manufactured by REMI Motors, Bombay. The input/output experimental data were obtained through rigorous experimentation on the CSTR in the laboratory by varying the flow rate of the input reactant from zero to maximum and measuring the corresponding output (concentration of liquid) for each instance. The simulation data comprise 383 samples. The process is in fact multi-input single-output: the output variable is the concentration of liquid, and the input variables are stirring speed, temperature, and flow rate. In this experiment the stirring speed and temperature are held constant at their normal values, so the system can be treated as single-input single-output (SISO). A second, benchmark data set for the CSTR process was obtained from the Internet; it was contributed by Jairo ESPINOSA, ESAT-SISTA KULEUVEN, Kardinaal Mercierlaan 94, B-3001, Heverlee, Belgium. Here the process is a continuous stirred tank reactor in which the reaction is exothermic and the concentration is controlled by regulating the coolant flow. It consists of 7500 samples. (as discussed elsewhere [

Because the process exhibits time relationships in its input-output mappings, the versatile FTLR NN model is used to describe the system behaviour. The weights are the adjustable parameters of the system, and they are determined from a set of examples through a process called training. The exemplars, or the training data as they are usually called, are the sets of inputs and corresponding desired outputs. Once the NN has been trained, the next step is to evaluate it. This is done by a standard statistical method called independent validation, which divides the available data into a training set and a test set. The entire data set is usually randomized first. The training data are then split into two partitions: the first partition is used to update the weights in the network, and the second partition is used to assess (or cross-validate) the training performance. The test data are then used to assess how well the network has generalized. The learning and generalization ability of the estimated NN based model is assessed on the basis of performance measures such as NMSE and the correlation coefficient, and by visual inspection of the regression characteristics for the different outputs of the system under study (as discussed by Narendra and Parthasarathy [
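The partitioning described above can be sketched as follows. This is only a minimal illustration, not the authors' procedure: the function name `split_exemplars`, the 30% test fraction, the 20% cross-validation fraction, and the random seed are all assumptions chosen for the example. The 30% figure happens to reproduce the 268/115 training/testing split reported for the laboratory data.

```python
import numpy as np

def split_exemplars(n_exemplars, test_fraction=0.3, cv_fraction=0.2, seed=0):
    """Return shuffled index arrays (train, cross_validation, test).

    All fractions are illustrative assumptions, not values from the paper.
    """
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_exemplars)       # randomize the exemplars first
    n_test = int(round(n_exemplars * test_fraction))
    test_idx = order[:n_test]                  # held out to assess generalization
    pool = order[n_test:]                      # remaining training data
    n_cv = int(round(len(pool) * cv_fraction))
    cv_idx = pool[:n_cv]                       # used to cross-validate training
    train_idx = pool[n_cv:]                    # used to update the weights
    return train_idx, cv_idx, test_idx

# 383 samples, as in the laboratory data set
train_idx, cv_idx, test_idx = split_exemplars(383)
```

The three index arrays are disjoint, so the test exemplars never influence the weight updates or the cross-validation check.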

The formula for the mean squared error is

The normalized mean squared error is defined by the following formula:

The size of the mean square error (MSE) can be used to determine how well the network output fits the desired output, but it does not necessarily reflect whether the two sets of data move in the same direction. For instance, by simply scaling the network output, we can change the MSE without changing the directionality of the data. The correlation coefficient (
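As a hedged illustration of the three performance measures for a single-output network, the sketch below computes the MSE, an NMSE, and the correlation coefficient with NumPy. The function names are hypothetical, and the NMSE shown here (MSE divided by the variance of the desired output) is an assumed, commonly used definition that may differ in detail from the formula used in the paper.

```python
import numpy as np

def mse(desired, output):
    """Mean squared error between desired and network outputs."""
    desired = np.asarray(desired, dtype=float)
    output = np.asarray(output, dtype=float)
    return float(np.mean((desired - output) ** 2))

def nmse(desired, output):
    """Assumed definition: MSE normalized by the variance of the desired output."""
    desired = np.asarray(desired, dtype=float)
    return mse(desired, output) / float(np.var(desired))

def correlation_coefficient(desired, output):
    """Pearson correlation between desired and network outputs."""
    d = np.asarray(desired, dtype=float) - np.mean(desired)
    y = np.asarray(output, dtype=float) - np.mean(output)
    return float(np.sum(d * y) / np.sqrt(np.sum(d ** 2) * np.sum(y ** 2)))

desired = np.array([0.1, 0.4, 0.35, 0.8])
scaled = 0.5 * desired   # same direction as the desired output, different scale
```

In this example `scaled` has a nonzero MSE against `desired` while its correlation coefficient stays at 1, which is the scaling effect described above.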

Because there is a time structure underlying the data collected through rigorous experimentation, dynamic modeling can be expected to improve performance. Dynamic NNs are topologies designed to explicitly include time relationships in the input-output mappings. Time constitutes an indispensable component of the learning process: it is by incorporating time into its operation that an NN is able to follow statistical variations in nonstationary processes. Time-lagged recurrent networks (TLRNs) are MLPs extended with short-term memory structures. Here, a “static” NN (e.g., MLP) is endowed with dynamic properties (as discussed by Dudul [

The input processing elements (PEs) of an MLP are replaced with a tap delay line, which is followed by the MLP NN. This topology is called the focused time-delay NN (TDNN). The focused topology includes memory kernels only at the input layer, so only the past of the input is remembered. The delay line of the focused TDNN stores the past samples of the input. The tap delay line, combined with the weights that connect the taps to the PEs of the first hidden layer, amounts to a set of linear combiners followed by a static nonlinearity.
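The tap delay line and the first hidden layer can be sketched as below. This is an illustrative example under stated assumptions: the helper names `tap_delay_line` and `first_hidden_layer`, the zero initial conditions, the three taps, the five hidden PEs, and the random weights are not taken from the paper.

```python
import numpy as np

def tap_delay_line(u, n_taps):
    """Row n holds [u(n), u(n-1), ..., u(n - n_taps + 1)], zero-padded at the start."""
    u = np.asarray(u, dtype=float)
    padded = np.concatenate([np.zeros(n_taps - 1), u])
    columns = [padded[n_taps - 1 - k : n_taps - 1 - k + len(u)] for k in range(n_taps)]
    return np.stack(columns, axis=1)

def first_hidden_layer(taps, weights, bias):
    """Linear combiners over the taps, followed by a static tanh nonlinearity."""
    return np.tanh(taps @ weights + bias)

taps = tap_delay_line([1.0, 2.0, 3.0, 4.0], n_taps=3)
rng = np.random.default_rng(0)                       # arbitrary weights for illustration
hidden = first_hidden_layer(taps, rng.normal(size=(3, 5)), np.zeros(5))
```

Each row of `taps` is one exemplar, so the network after the delay line is an ordinary static MLP.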

Typically, a gamma short-term memory mechanism is combined with nonlinear PEs in restricted topologies called focused. The first layer of the focused TDNN is essentially a filtering layer, with as many adaptive filters as there are PEs in the first hidden layer. The outputs of the linear combiners are passed through the nonlinearity of the hidden-layer PEs and are then further processed by the subsequent layers of the MLP. For system identification, the goal is to find the weights that produce a network output that best matches the present output of the system by combining the information of the present and a predefined number of past samples (given by the size of the tap delay line).

The size of the memory layer depends on the number of past samples that are needed to describe the input characteristics in time. This number depends on the characteristics of the input and on the task. The focused TDNN can still be trained with static backpropagation, provided that a desired signal is available at each time step. This is because the tap delay line at the input layer does not have any free parameters, so the only adaptive parameters are in the static feedforward path.

The memory PE receives in general many inputs

These short-term memory structures can be studied by linear adaptive filter theory if

There are basically two types of memory mechanisms: memory by delay and memory by feedback. We seek to find the most general linear delay operator (special case of the Auto Regressive Moving Average model) where the memory traces

The gamma memory PE attenuates the signals at each tap because it is a cascade of leaky integrators with the same time constant. The gamma memory PE is a special case of the generalized feedforward memory PE where

The gamma memory PE has a multiple pole that can be adaptively moved along the real Z-domain axis, that is, the gamma memory can implement only lowpass
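A minimal sketch of a gamma memory is given below, assuming the standard gamma-memory recursion x_k(n) = (1 − μ) x_k(n − 1) + μ x_{k−1}(n − 1) with x_0(n) = u(n). The function name `gamma_memory`, the zero initial state, and the example values of μ are assumptions made for illustration.

```python
import numpy as np

def gamma_memory(u, n_taps, mu):
    """Return an array whose column k is tap k of a gamma memory driven by u.

    Column 0 is the input itself; each further tap is a leaky integrator
    fed by the previous tap, all sharing the same parameter mu.
    """
    u = np.asarray(u, dtype=float)
    x = np.zeros((len(u), n_taps + 1))
    x[:, 0] = u
    for n in range(1, len(u)):
        for k in range(1, n_taps + 1):
            x[n, k] = (1.0 - mu) * x[n - 1, k] + mu * x[n - 1, k - 1]
    return x

# mu = 1 reduces the gamma memory to an ordinary tap delay line
delayed = gamma_memory([1.0, 2.0, 3.0, 4.0], n_taps=2, mu=1.0)

# mu = 0.5: the first tap responds to an impulse with a decaying trace
impulse_response = gamma_memory([1.0, 0.0, 0.0, 0.0], n_taps=1, mu=0.5)[:, 1]
```

With μ = 1 each tap is a pure one-sample delay, while smaller μ trades time resolution for a longer memory, which is why μ is worth tuning for each data set.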

Fully recurrent networks feed the hidden layer back to itself. Partially recurrent networks start with a fully recurrent net and add a feedforward connection that bypasses the recurrency, effectively treating the recurrent part as a state memory. These recurrent networks can have an infinite memory depth and can thus find relationships through time as well as through the instantaneous input space. Most real-world data contain information in their time structure. Recurrent networks are the state of the art in nonlinear time series prediction, system identification, and temporal pattern classification.

Four input layer structures are available. For multidimensional data, a simple Axon is a reasonable first choice; for a one-dimensional time series, one of the memory axons should be used. Two recurrent structures are available. The fully recurrent structure connects the first hidden layer to itself through a recurrent synapse connection. The partially recurrent structure adds a feedforward connection, through a synapse, from the input axon to the layer after the first hidden layer. In this case, the recurrent structure acts as a state for the feedforward structure.
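The two recurrent structures can be sketched as a single forward pass with an optional bypass connection. This is a simplified illustration rather than the simulated networks: the function name `recurrent_forward`, the omission of biases, the linear output layer, and the random weights are all assumptions.

```python
import numpy as np

def recurrent_forward(u, W_in, W_rec, W_out, W_bypass=None):
    """Forward pass over a sequence u of shape (T, n_in).

    Fully recurrent: the hidden layer feeds back to itself through W_rec.
    Partially recurrent: W_bypass adds a feedforward path from the input
    to the layer after the hidden layer (here, the output layer).
    """
    u = np.asarray(u, dtype=float)
    h = np.zeros(W_rec.shape[0])          # hidden state starts at zero
    outputs = []
    for u_n in u:
        h = np.tanh(W_in @ u_n + W_rec @ h)
        y = W_out @ h
        if W_bypass is not None:
            y = y + W_bypass @ u_n        # feedforward connection bypassing the recurrency
        outputs.append(y)
    return np.array(outputs)

rng = np.random.default_rng(0)            # arbitrary weights for illustration
u = rng.normal(size=(6, 1))
W_in, W_rec = rng.normal(size=(4, 1)), 0.5 * rng.normal(size=(4, 4))
W_out, W_bypass = rng.normal(size=(1, 4)), rng.normal(size=(1, 1))
y_full = recurrent_forward(u, W_in, W_rec, W_out)
y_partial = recurrent_forward(u, W_in, W_rec, W_out, W_bypass)
```

Setting `W_rec` to zero removes the state, and the pass reduces to a static MLP applied sample by sample.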

An exhaustive and careful experimental study has been carried out to determine the optimal configuration of the different NN models. The number of hidden layers and the number of neurons in each hidden layer are varied and selected on the basis of the performance measures. The training and testing percentages of exemplars are then varied to find the best training-testing split for each NN model. Different supervised learning rules and different transfer functions in the hidden and output layers are investigated in simulation. Finally, the effects of different norms are tested on the model to decide the optimal neural network. After meticulous examination of performance measures like MSE, NMSE, correlation coefficient, and the regression ability of the NN models on the test data set, the optimal parameters are decided for the model as listed in Tables

Parameters of FTLR NN (Lab. data). Number of Exemplars: Training = 268, Testing = 115, Max epochs = 1000, Focused, Gamma, Norm

Sr. no. | Parameters | Hidden layer # 1 | Output layer |
---|---|---|---|
1 | Processing Elements | 42 | 1 |
2 | Transfer function | tanh | tanh |
3 | Learning rule | Momentum | Momentum |
4 | Step size | 0.1 | 0.1 |
5 | Momentum | 0.7 | 0.7 |
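To illustrate how the step size and momentum values in the table enter the weight update, the sketch below applies a momentum update to a toy quadratic error. The function name `momentum_step` and the toy error surface are assumptions for illustration; only the values 0.1 and 0.7 come from the table.

```python
import numpy as np

def momentum_step(weights, gradient, velocity, step_size=0.1, momentum=0.7):
    """One gradient-descent update with a momentum term.

    The new weight change is the scaled negative gradient plus a fraction
    (the momentum) of the previous weight change.
    """
    velocity = momentum * velocity - step_size * gradient
    return weights + velocity, velocity

# Toy error surface E(w) = 0.5 * ||w - target||^2, whose gradient is (w - target)
target = np.array([1.0, -2.0])
weights = np.zeros(2)
velocity = np.zeros(2)
for _ in range(200):
    gradient = weights - target
    weights, velocity = momentum_step(weights, gradient, velocity)
```

In backpropagation training the gradient would come from the network error rather than from this toy surface, but the update rule has the same form.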

Parameters of Fully Recurrent NN (Lab. data). Number of Exemplars: Training = 306, Testing = 77, Max epochs = 1000, memory-axon, Norm

Sr. no. | Parameters | Hidden layer # 1 | Output layer |
---|---|---|---|
1 | Processing Elements | 18 | 1 |
2 | Transfer function | tanh | tanh |
3 | Learning rule | Momentum | Momentum |
4 | Step size | 0.1 | 0.1 |
5 | Momentum | 0.7 | 0.7 |

Parameters of FTLR NN (Benchmark data). Number of Exemplars: Training = 3000, Testing = 4500, Max epochs = 1000, Partially recurrent, Gamma, Norm

Sr. no. | Parameters | Hidden layer # 1 | Output layer |
---|---|---|---|
1 | Processing Elements | 39 | 1 |
2 | Transfer function | tanh | sigmoid |
3 | Learning rule | dbd | dbd |

Parameters of Fully Recurrent NN (Benchmark data). Number of Exemplars: Training = 3000, Testing = 4500, Max epochs = 1000, Partially recurrent, Gamma, Norm

Sr. no. | Parameters | Hidden layer 1 | Hidden layer 2 | Output layer |
---|---|---|---|---|
1 | Neurons | 16 | 27 | 1 |
2 | Transfer function | tanh | tanh | Linear tanh |
3 | Learning rule | dbd | dbd | dbd |

A rigorous experimental study has been undertaken to determine the optimal value of the gamma parameter. For every value tried, the network is run three times with different random weight initializations. In computer simulation, the gamma parameter is varied from 0.0 to 1.8 in steps of 0.1 while all other parameters of the FTLR NN are kept at their nominal default values. The results of variation of gamma parameter are graphed in Figures
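The sweep described above can be sketched as a simple grid search. This is a hedged outline only: `sweep_gamma` and `toy_evaluate` are hypothetical names, and `toy_evaluate` is a made-up stand-in for training the network and measuring its test error. It does not reproduce any result from the paper.

```python
import numpy as np

def sweep_gamma(evaluate, n_runs=3):
    """Evaluate each gamma value n_runs times and return the value with the lowest mean error.

    `evaluate(mu, seed)` is expected to train a network with gamma parameter mu,
    using `seed` for the random weight initialization, and return a test error.
    """
    grid = np.round(np.linspace(0.0, 1.8, 19), 1)     # 0.0, 0.1, ..., 1.8
    mean_errors = np.array([
        np.mean([evaluate(mu, seed) for seed in range(n_runs)]) for mu in grid
    ])
    return grid, mean_errors, float(grid[np.argmin(mean_errors)])

def toy_evaluate(mu, seed):
    # Hypothetical stand-in for "train the network and measure the test error";
    # the location of its minimum is arbitrary.
    return (mu - 1.1) ** 2 + 0.001 * seed

grid, mean_errors, best_mu = sweep_gamma(toy_evaluate)
```

Averaging over several initializations reduces the chance that a single lucky or unlucky run decides the choice of gamma.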

Continuous stirred tank reactor.

Choice of Gamma (CSTR lab. data).

Choice of Gamma (CSTR Benchmark Data).

Figures

Comparison of NN models.

Performance measures on the test data set (r: correlation coefficient).

NN model | Lab MSE | Lab NMSE | Lab r | Benchmark MSE | Benchmark NMSE | Benchmark r |
---|---|---|---|---|---|---|
FTLR NN with Gamma memory | 1.13E−05 | 0.470 | 0.9132 | 1.40E−06 | 0.00777 | 0.9961 |
Fully Recurrent NN | 8.01E−06 | 0.754 | 0.702 | 4.15E−06 | 0.0230 | 0.988 |

Desired output and Network output for testing data set (FTLR NN model-Lab. data).

Desired output and Network output for testing data set (Fully recurrent NN model-Lab. data).

Desired output and Network output for testing data set (FTLR NN model-Benchmark data).

Desired output and Network output for testing data set (Fully recurrent NN model-Benchmark data).

The results show that the FTLR NN model is capable of learning the nonlinear dynamics of the two CSTR processes. FTLR NN models with gamma memory filter follow the desired output of the CSTR processes very closely on the testing instances. With the gamma parameter set to 1.1 for the laboratory data and 0.5 for the benchmark data, the FTLR NN models have an edge over the fully recurrent NN models when the performance measures and visual inspection of the regression characteristics are taken into consideration. It is thus concluded that, for identification of the CSTR process using neural networks, the FTLR NN model with gamma memory filter can be used to learn the underlying highly nonlinear dynamics of the system, which is the major contribution of this paper.