Long-Term Prediction of Biological Wastewater Treatment Process Behavior via Wiener-Laguerre Network Model

1 Environmental Technology Division, School of Industrial Technology, Universiti Sains Malaysia, Penang 11800, Malaysia
2 Faculty of Science, University of Guilan, Rasht, Guilan 41938-33697, Iran
3 Department of Mechanical Engineering, Faculty of Engineering, University of Guilan, Rasht, Guilan 41996-13769, Iran
4 The Academic Centre for Education, Culture and Research (ACECR), Institute for Environmental Research, Rasht, Guilan 41365-3114, Iran


Introduction
The reactive dye-containing effluents from dye manufacturing and application industries can cause serious environmental pollution because of the toxicity and slow degradation of dyes [1]. In addition, dyes in water are highly visible and affect water transparency and aesthetics even at low concentrations. Therefore, the effluents must be treated before being released into the environment.
In recent years, researchers have shown interest in the biological treatment of wastewaters with high dye concentrations [2,3]. Treating these polluted wastewaters requires dye removal processes that are both highly effective and low in cost [4]. The sequencing batch reactor (SBR) is a promising biological system for treating dye-containing wastewaters [5,6]. This system is cost efficient and flexible enough to handle different feed characteristics. Furthermore, it is easier to operate than other biological methods [1]. However, the SBR process, like other biological processes, is highly nonlinear, time varying, and subject to significant disturbances [7]. Modeling the treatment process can improve the understanding, design, operation, and control of the process [8].
The ability of artificial neural networks (ANNs) to perform black-box modeling of nonlinear systems with complicated structure has made them the most popular tool for modeling biological processes [9]. In recent years, recurrent neural networks have been developed based on the common NARX (nonlinear autoregressive with exogenous input) topology in order to characterize the behavior of dynamic systems. In such models, past values of the input and output signals are used as the network inputs [10]. In the literature, different applications of NARX networks have been reported for monitoring [11,12], control [13], optimization [14], and simulation [15,16] of wastewater treatment plants. Despite their good performance, NARX networks still suffer from some drawbacks, such as the need for prior information on the past terms of the input and output variables. Furthermore, the presence of output feedback may cause prediction errors to degrade the model performance [10].
Wiener-type models are a class of block-oriented representations of nonlinear systems in which a dynamic linear part is followed in cascade by a static nonlinear part [17]. In recent years, Wiener-type models have been successfully applied to identify nonlinear dynamics in biological processes [18-20]. Employing standard multilayer perceptron (MLP) networks as the nonlinear mapping function in the Wiener model structure has been proposed as an alternative way to develop nonlinear models [21]. The linear part of Wiener models is often chosen to be a formal autoregressive with exogenous input (ARX) model. In this case, attaining high performance requires a large amount of training data for estimating the model order, time delays, and parameters. A major obstacle is the high cost of preparing training data, because special instruments and data acquisition equipment are required for online system monitoring.
A combination of orthogonal functions as the linear dynamic part and artificial neural networks as the memoryless nonlinear part of Wiener models was proposed to deal with this problem [22,23]. Laguerre functions are the most common orthogonal functions used in the Wiener model structure. They have been widely employed for linear and nonlinear system identification because of their simple structure and their ability to describe systems accurately with a rather small number of tunable parameters [24].
In this study, a Wiener-Laguerre model is employed to describe the nonlinear behavior of an SBR used for dye wastewater treatment. The proposed method has not previously been used for modeling the biological treatment of dye-containing wastewaters. An aerobic bacterium, Sphingomonas paucimobilis, was used to decolorize an effluent containing a reactive textile azo dye, Cibacron yellow FN-2R. S. paucimobilis has previously been shown to be efficient for the degradation of azo and triphenylmethane dyes [25,26].

Materials and Methods
2.1. Microorganism. Sphingomonas paucimobilis was isolated from a closed drainage system located in the Perai industrial area, Perai, Penang, Malaysia, where several heavy industries are located. The microorganism was grown on slant agar at 35 °C for 2 days under static conditions.

Media Composition.
A synthetic dye-containing wastewater was used in this study. The seed culture medium consisted of 1000 mL of distilled water, powdered Cibacron yellow FN-2R as the main carbon source, urea as the nitrogen source, and K₂HPO₄ and KH₂PO₄ as the phosphorus sources and nutrients. The composition of the wastewater gave a C/N/P ratio of approximately 100/10/1 after adding NH₄Cl and KH₂PO₄. The initial pH was adjusted to 7 using NaOH.

Preparation of the Synthetic Dye Wastewater.
Cibacron yellow FN-2R is one of the main reactive dyes used in the textile industry. Reactive Cibacron FN-2R was purchased from Sigma-Aldrich. Dye solutions with concentrations of 250, 500, 1000, and 1500 mg/L were treated in four reactors with successively increasing dye concentration and were analyzed daily for chemical oxygen demand (COD), biochemical oxygen demand (BOD), and mixed liquor volatile suspended solids (MLVSS) according to the standard methods [27].

Experimental Setup.
In this study, four cylindrical Plexiglass reactors with a diameter of 14 cm and a height of 46 cm were used. The working volume and influent flow rate were 1.6 L and 3.0 L/d, respectively. Four air pumps and four mixers were used for continuous aeration and mixing. The impeller speed was adjusted to 300 rpm to maintain the dissolved oxygen (DO) concentration above 2 ppm. After the acclimatization period, different concentrations of the dye and 2 mL of the bacterium (S. paucimobilis) were added to each reactor. The pH of the mixture was adjusted to 7 after stirring at a constant temperature (35 °C) for a certain period.

Input-Output Selection.
In order to develop accurate and reliable models, appropriate data sets should be collected from the variables that govern the behavior of the process. A set of variables was prepared, with the influent COD, MLVSS, and reaction time as the process inputs and the effluent COD and BOD as the process outputs. Two data sets, one with 147 data points (147 days) and one with 90 data points (90 days), were chosen for training and validation of the models, respectively. As a result, the input/output vectors are

$$u(k) = \big[\mathrm{COD}_{\mathrm{inf}}(k),\ \mathrm{MLVSS}(k),\ t(k)\big]^{T}, \qquad y(k) = \big[\mathrm{COD}_{\mathrm{eff}}(k),\ \mathrm{BOD}_{\mathrm{eff}}(k)\big]^{T}. \tag{1}$$

To improve the performance of the modeling process, the recorded data are normalized to the range [0, 1] by dividing each data column by its maximum value.
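The column-wise normalization described above can be illustrated with a short sketch. The function name `normalize_columns` and the three sample records are hypothetical and are not taken from the experimental data; the sketch only assumes that every column has a positive maximum.

```python
def normalize_columns(rows):
    """Divide every entry by the maximum of its column.

    rows: list of equal-length lists (one record per day).
    Returns the scaled rows and the column maxima used for scaling.
    """
    n_cols = len(rows[0])
    col_max = [max(row[c] for row in rows) for c in range(n_cols)]
    if any(m <= 0 for m in col_max):
        raise ValueError("each column needs a positive maximum")
    scaled = [[row[c] / col_max[c] for c in range(n_cols)] for row in rows]
    return scaled, col_max


# Hypothetical records: influent COD (mg/L), MLVSS (mg/L), reaction time (d).
records = [
    [400.0, 1500.0, 1.0],
    [800.0, 2000.0, 2.0],
    [600.0, 2500.0, 4.0],
]
scaled, col_max = normalize_columns(records)
```

Keeping `col_max` allows model outputs to be converted back to physical units by multiplying by the corresponding maximum.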

Model Structure and System Identification
Based on the obtained experimental data, appropriate nonlinear models were developed to predict the concentrations of COD and BOD in the effluent.
A Wiener-type model is a well-known block-oriented representation of nonlinear systems in which a linear dynamic part is cascaded with a static nonlinearity. As shown in Figure 1, the input sequence $u(k)$ is mapped into the intermediate variable $v(k)$ through the transfer function $G(z)$, and the model output $y(k)$ is then estimated through the nonlinear mapping function $f(\cdot)$. Many different options are available for the linear part and the nonlinear mapping function.
Here, the considered Wiener model is a combination of Laguerre basis filters as the linear part and a feed-forward NN for the nonlinear mapping. In this case, considering without loss of generality a SISO (single-input single-output) system, the nonlinear dynamics can be represented as a discrete-time nonlinear mapping on filtered input terms,

$$y(k) = f\big(L_1(z)\,u(k),\ L_2(z)\,u(k),\ \ldots,\ L_n(z)\,u(k)\big), \tag{2}$$

where $L_j(z)$ denotes the Laguerre filters. This structure is an NN realization of the Wiener-Laguerre model [21].
As presented in (1), three inputs are chosen as the dominant variables affecting the effluent COD and BOD. In this case, each process model can be considered as a multi-input single-output (MISO) model, as presented in Figure 2.
3.1. Laguerre Filters. Laguerre functions can be represented as a set of discrete-time transfer functions in the z-domain,

$$L_j(z,\xi) = \frac{\sqrt{1-\xi^{2}}}{z-\xi}\left(\frac{1-\xi z}{z-\xi}\right)^{j-1}, \qquad j = 1, 2, \ldots, \tag{3}$$

where $\xi \in \{\mathbb{R} : |\xi| < 1\}$ is the pole parameter. The dominant pole $\xi$ determines the rate of exponential decay of the Laguerre function responses and can be obtained through optimization or by experiments [28]. The first section of $L_j(z,\xi)$ is a first-order low-pass filter, which is followed by $(j-1)$ identical all-pass filters. The Laguerre functions form an orthonormal basis on $\ell_2[0,\infty)$, which is expressed in the z-domain as

$$\frac{1}{2\pi i}\oint_{|z|=1} L_i(z,\xi)\,L_j(z^{-1},\xi)\,\frac{dz}{z} = \delta_{ij}. \tag{4}$$

For physical systems, a limited number of Laguerre stages can be employed to approximate the system dynamics. In this case, for a given truncation order $n$, the transfer function $G(z)$ of a SISO linear discrete-time system is obtained as

$$G(z) = \frac{Y(z)}{U(z)} = \sum_{j=1}^{n} c_j\,L_j(z,\xi), \tag{5}$$

in which $U(z)$ and $Y(z)$ are the input and output of the system and $c_j$ are the Laguerre coefficients [28]. With $\xi = 0$, $L_j(z,\xi)$ reduces to the regular delay operator $z^{-j}$ and $G(z)$ reduces to the usual FIR (finite impulse response) model. Introducing $l_j(k)$ as the output of the $j$th Laguerre filter,

$$l_j(k) = L_j(z,\xi)\,u(k), \qquad j = 1, 2, \ldots, n. \tag{6}$$

Thus, for a linear system, the output of an $n$th-order Laguerre network is obtained as the linear combination of the Laguerre filter outputs:

$$y(k) = \sum_{j=1}^{n} c_j\,l_j(k). \tag{7}$$

The linear combination of the Laguerre filter outputs can be replaced by a multilayer perceptron (MLP) as the nonlinear mapping function in order to deal with nonlinearities. To increase the performance of a Laguerre network, the pole parameter and the order of the Laguerre filters have to be estimated. Developing a first-order linear model of the process can help to find an appropriate process time constant. For this purpose, the influent and effluent CODs are taken as the model input and output signals, giving

$$G_p(s) = \frac{K_p}{\tau_p s + 1}, \tag{8}$$

in which $K_p$ and $\tau_p$ are the process gain and time constant.
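The Laguerre filter bank described in this subsection can be sketched as a cascade of difference equations: one first-order low-pass section followed by identical all-pass sections. This is a minimal illustration under stated assumptions, not the implementation used in the study. The function name `laguerre_outputs` and the variable names `xi` (pole) and `n` (number of filters) are choices made here, and the filters are indexed from 1.

```python
import math


def laguerre_outputs(u, xi, n):
    """Return n lists; list j-1 holds the output of the j-th Laguerre filter.

    Stage 1 is the low-pass sqrt(1 - xi^2) / (z - xi). Every further stage
    applies the all-pass (1 - xi*z) / (z - xi) to the previous stage output.
    """
    if not -1.0 < xi < 1.0:
        raise ValueError("pole must satisfy |xi| < 1")
    gain = math.sqrt(1.0 - xi * xi)
    outputs = []

    # Low-pass section: y(k) = xi*y(k-1) + gain*u(k-1)
    y_prev, x_prev, stage = 0.0, 0.0, []
    for x in u:
        y = xi * y_prev + gain * x_prev
        stage.append(y)
        y_prev, x_prev = y, x
    outputs.append(stage)

    # All-pass sections: y(k) = xi*y(k-1) - xi*x(k) + x(k-1)
    for _ in range(n - 1):
        y_prev, x_prev, nxt = 0.0, 0.0, []
        for x in outputs[-1]:
            y = xi * y_prev - xi * x + x_prev
            nxt.append(y)
            y_prev, x_prev = y, x
        outputs.append(nxt)
    return outputs


# Impulse responses of nine filters with the pole value reported in the text.
impulse = [1.0] + [0.0] * 399
basis = laguerre_outputs(impulse, xi=0.44, n=9)

# Gram matrix of the impulse responses; orthonormality means it is close to I.
gram = [[sum(a * b for a, b in zip(bi, bj)) for bj in basis] for bi in basis]

# With xi = 0 every filter reduces to a pure delay (the FIR case).
delays = laguerre_outputs([1.0, 0.0, 0.0, 0.0], xi=0.0, n=3)
```

Applying the same function to each normalized input signal and stacking the results gives the filtered signals that feed the nonlinear part of the model.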
The parameters of the linear model can be estimated using the MATLAB System Identification Toolbox (ver. 7.4.2). The best result for the linear system pole was $s = -1/\tau_p = -0.82$, with a model fit to the measured data of about 52.1%. The equivalent discrete pole parameter is obtained from $\xi = \exp(sT_s)$ in the z-domain, where the sampling time $T_s$ is equal to 1. This results in $\xi = 0.44$. The normalized step responses of the Laguerre filters are shown in Figure 3. Note that increasing the order of the Laguerre filters makes the structure of the NN part more complicated and therefore increases the computational effort needed to train the nonlinear part. A major advantage of the Laguerre network is its ability to describe the behavior of nonlinear systems with models of highly truncated order. Choosing a filter order of $n = 9$ is an appropriate trade-off between the complexity and the accuracy of the models. The pole parameter and filter order are therefore set to 0.44 and 9, respectively, for all input variables.
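The conversion from the continuous pole to the discrete pole parameter can be checked numerically. The sketch below only reproduces the arithmetic stated above (pole of -0.82, sampling time of 1); the names `discrete_pole`, `s`, `tau_p`, and `xi` are labels chosen for this illustration.

```python
import math


def discrete_pole(continuous_pole, sampling_time):
    """Map a continuous-time pole s to the z-domain via exp(s * T)."""
    return math.exp(continuous_pole * sampling_time)


s = -0.82            # continuous pole reported in the text, s = -1/tau_p
tau_p = -1.0 / s     # implied time constant, in sampling periods
xi = discrete_pole(s, sampling_time=1.0)

print(f"tau_p = {tau_p:.2f}, xi = {xi:.2f}")  # prints: tau_p = 1.22, xi = 0.44
```

A pole closer to 1 would correspond to a slower process and slower-decaying Laguerre responses.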

Neural Network.
A multilayer feed-forward neural network (FFNN) is used as the nonlinear part of the proposed models. An FFNN is built from a number of cascaded layers whose nodes are connected to the nodes of the neighboring layers by weight coefficients. Neurons are the fundamental units of an NN and are responsible for information processing; each consists of weighting coefficients, an activation function, and a bias. The structure of the FFNN is presented in Figure 4. Note that no bias connection was considered in this work.
Generally, an FFNN may have one or more hidden layers; however, a network with a single hidden layer can approximate many systems with an acceptable degree of accuracy. In many cases, the number of neurons in the hidden layer and the type of activation function are the dominant parameters affecting model performance. The number of neurons in the input and output layers is set by the number of input and output variables of the investigated system, which gives 27 (three input variables, each passed through nine Laguerre filters) and 1, respectively. For a single-hidden-layer NN, it is recommended that the number of hidden neurons ($N_h$) be chosen according to the geometric pyramid rule,

$$N_h = \alpha\sqrt{N_i N_o}, \tag{9}$$

where $N_i$ and $N_o$ are the number of network inputs and outputs, respectively, and $\alpha$ is a multiplication factor. The value of $\alpha$ should be selected in the range $0.5 < \alpha < 2$ depending on the complexity of the system [29]. With $\alpha = 1.8$, this gives $N_h = 1.8\sqrt{27 \times 1} \approx 9.35$, so the number of neurons in the hidden layer is set to 10. The best activation functions are often selected through trial and error. Here, the hyperbolic tangent sigmoid function is used for the hidden layer and a linear transfer function for the output layer.
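The sizing rule and the network structure described above can be sketched as follows. Rounding up to the next whole neuron is an assumption made here because it reproduces the reported value of 10. The function names, the random seed, and the untrained placeholder weights are hypothetical and only serve to show the bias-free tanh/linear structure.

```python
import math
import random


def hidden_neurons(n_inputs, n_outputs, alpha):
    """Geometric pyramid rule, rounded up to a whole neuron (assumed rounding)."""
    return math.ceil(alpha * math.sqrt(n_inputs * n_outputs))


def forward(x, w_hidden, w_out):
    """Bias-free network: tanh hidden layer followed by a linear output."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in w_hidden]
    return sum(w * h for w, h in zip(w_out, hidden))


n_in, n_out = 27, 1                      # layer sizes stated in the text
n_hidden = hidden_neurons(n_in, n_out, alpha=1.8)

rng = random.Random(0)                   # arbitrary seed, placeholder weights
w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
w_out = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]

# Without bias terms, an all-zero input always maps to a zero output.
y_zero = forward([0.0] * n_in, w_hidden, w_out)
```

In the full model, the 27-element input `x` would hold the Laguerre filter outputs of the three normalized process inputs at one time step.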
In order to adjust the parameters of the NN, the Levenberg-Marquardt (LM) algorithm, a second-order derivative-based method, was employed. LM is an accelerated neural network training algorithm that is typically used for adjusting the parameters of moderate-sized MLPs. The model weight parameters (W) are tuned iteratively by

$$W_{k+1} = W_k - \left(J^{T}J + \mu I\right)^{-1} J^{T} e, \tag{10}$$

where $J$ is the Jacobian matrix obtained from the first derivatives of the network errors with respect to the network parameters and $\mu$ is a nonnegative value known as the learning parameter [30]. The objective function for the training process was the mean square error (MSE),

$$\mathrm{MSE} = \frac{1}{N}\sum_{i=1}^{N}\left(y_i^{*} - y_i\right)^{2}, \tag{11}$$

where $y_i^{*}$ and $y_i$ are the target and actual outputs for the $i$th pattern, respectively, and $N$ is the total number of training patterns. The error vector $e$ is calculated as

$$e = \left[\,y_1^{*} - y_1,\ y_2^{*} - y_2,\ \ldots,\ y_N^{*} - y_N\,\right]^{T}. \tag{12}$$

The MATLAB Neural Network Toolbox (ver. 7.0.1) was employed to train the models.
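The Levenberg-Marquardt update described above can be illustrated on a small curve-fitting problem. This is a sketch under stated assumptions and not the MATLAB toolbox routine. The toy model `w[0] * tanh(w[1] * x)`, the synthetic targets, the starting point, and the rule of dividing or multiplying the learning parameter by 10 after an accepted or rejected step are all choices made for this example.

```python
import numpy as np


def lm_fit(residual, jacobian, w0, mu=1e-2, n_iter=50):
    """Iterate W <- W - (J^T J + mu*I)^-1 J^T e with a simple mu schedule."""
    w = np.asarray(w0, dtype=float)
    e = residual(w)
    for _ in range(n_iter):
        J = jacobian(w)
        step = np.linalg.solve(J.T @ J + mu * np.eye(w.size), J.T @ e)
        w_new = w - step
        e_new = residual(w_new)
        if e_new @ e_new < e @ e:
            # Accept the step and lean more on the Gauss-Newton direction.
            w, e, mu = w_new, e_new, mu / 10.0
        else:
            # Reject the step and move toward small gradient-descent steps.
            mu *= 10.0
    return w, float(np.mean(e ** 2))


# Synthetic, noise-free targets for the toy model (hypothetical data).
x = np.linspace(-2.0, 2.0, 21)
target = 1.5 * np.tanh(0.8 * x)


def residual(w):
    """Error vector e = target - model output."""
    return target - w[0] * np.tanh(w[1] * x)


def jacobian(w):
    """Derivatives of the error vector with respect to w[0] and w[1]."""
    t = np.tanh(w[1] * x)
    return np.column_stack((-t, -w[0] * x * (1.0 - t ** 2)))


w_init = np.array([1.0, 1.0])
mse_init = float(np.mean(residual(w_init) ** 2))
w_fit, mse_fit = lm_fit(residual, jacobian, w_init)
```

A small learning parameter makes the step behave like Gauss-Newton, while a large one shortens it toward gradient descent, which is why rejected steps increase it.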

Simulations and Results
The proposed modeling approach was applied to the experimental data. The performance of the models was measured by the coefficient of determination ($R^2$) and the mean absolute error (MAE) between the values predicted by the model and the experimental values, which were calculated by (13) and (14), respectively:

$$R^{2} = 1 - \frac{\sum_{i=1}^{N}\left(y_i^{*} - y_i^{(p)}\right)^{2}}{\sum_{i=1}^{N}\left(y_i^{*} - \bar{y}\right)^{2}}, \tag{13}$$

$$\mathrm{MAE} = \frac{1}{N}\sum_{i=1}^{N}\left|y_i^{*} - y_i^{(p)}\right|, \tag{14}$$

where $\bar{y}$ is the average of $y^{*}$ over the $N$ data, and $y_i^{*}$ and $y_i^{(p)}$ are the $i$th target and predicted responses, respectively. In Figure 5(a), the responses of the developed model for prediction of BOD are presented. In addition, the correlation between the experimental and predicted values is presented in Figure 5(b). Figure 6(a) shows the responses of the developed model for prediction of COD, whereas the correlation between the experimental and predicted values of COD is depicted in Figure 6(b).
The results indicate that the proposed model predicts the selected parameters accurately.
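The performance measures used here can be computed with a few lines of code. The sketch below uses the standard definitions of the coefficient of determination and the mean absolute error, and also returns the largest and smallest absolute errors. The function names and the four sample values are hypothetical and are not results from the study.

```python
def r_squared(targets, predictions):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_t = sum(targets) / len(targets)
    ss_res = sum((t - p) ** 2 for t, p in zip(targets, predictions))
    ss_tot = sum((t - mean_t) ** 2 for t in targets)
    return 1.0 - ss_res / ss_tot


def mean_absolute_error(targets, predictions):
    """Average of the absolute prediction errors."""
    return sum(abs(t - p) for t, p in zip(targets, predictions)) / len(targets)


def abs_error_bounds(targets, predictions):
    """Return (max |e|, min |e|) over all data points."""
    abs_err = [abs(t - p) for t, p in zip(targets, predictions)]
    return max(abs_err), min(abs_err)


# Hypothetical normalized targets and predictions.
targets = [1.0, 2.0, 3.0, 4.0]
predictions = [1.1, 1.9, 3.2, 3.8]

r2 = r_squared(targets, predictions)             # approximately 0.98
mae = mean_absolute_error(targets, predictions)  # approximately 0.15
e_max, e_min = abs_error_bounds(targets, predictions)
```

An $R^2$ close to 1 together with a small MAE indicates that the predictions follow the measured values closely on the data set being evaluated.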
The prediction accuracy of the developed models was also evaluated by comparing the responses of the models with the actual values. For this purpose, the error functions were calculated, where the error is defined as the difference between the values predicted by the models and the experimental data. Here, the upper bound error (Max(|e|)), lower bound error (Min(|e|)), MAE, and $R^2$ are calculated and presented in Table 1. The results show a small deviation between the values predicted by the models and the experimental data. The significantly high $R^2$ and low MAE obtained indicate the superior data fitting and prediction capability of the developed model.
BOD and COD are two major parameters for examining the quality of discharged wastewater. Measuring them takes significant time and commitment before proper adjustments can be made in the wastewater treatment process. The proposed model generalizes well in capturing the nonlinear relationships between parameters to estimate the COD and BOD of the treated wastewater.

Conclusion
The present study aimed to explore the potential of the Wiener-Laguerre network model for estimating the BOD and COD of the output stream of an SBR system. A multilayer feed-forward neural network was used as the nonlinear part of the proposed model. The performance of the model was found to be reasonably good, with a high $R^2$ (0.99) and a small deviation between the predicted and experimental values. The proposed model can be used as a flexible alternative to the first-order models commonly used for long-term prediction of BOD and COD parameters.

Figure 3: The step responses of the Laguerre basis filters.

Figure 4: The structure of the feed-forward neural network.

Figure 5: The responses of the developed model (a). Correlation between the actual values and the values predicted by the model for BOD (b).

Figure 6: The responses of the developed model (a). Correlation between the actual values and the values predicted by the model for COD (b).

Table 1: Error functions for the developed model predicting BOD and COD parameters.