Management of Uncertainty by Statistical Process Control and a Genetic Tuned Fuzzy System

In the food industry, bioprocesses such as fermentation are often a crucial part of the manufacturing process and decisive for the final product quality. In general, they are characterized by highly nonlinear dynamics and uncertainties that make them difficult to control with traditional control techniques. In this context, fuzzy logic controllers offer a straightforward way to control processes that are affected by nonlinear behavior and uncertain process knowledge. However, in order to maintain process safety and product quality, it is necessary to specify the controller performance and to tune the controller parameters. In this work, an approach is presented to establish an intelligent control system for oxidoreductive yeast propagation as a representative process affected by the aforementioned uncertainties. The presented approach is based on statistical process control and fuzzy logic feedback control. Because experts may differ considerably in their judgment of the limits within which the control performance is still acceptable, a data-driven design method is used. Based upon a historic data pool, statistical process corridors are derived for the controller inputs, namely, the control error and the change in control error. This approach follows the hypothesis that if the control performance criteria stay within predefined statistical boundaries, the final process state meets the required quality definition. In order to keep the process on its optimal growth trajectory (model-based reference trajectory), a fuzzy logic controller is used that alters the process temperature. Additionally, in order to stay within the process corridors, a genetic algorithm was applied to tune the input and output fuzzy sets of a preliminarily parameterized fuzzy controller. The presented experimental results show that the genetic tuned fuzzy controller is able to keep the process within its allowed limits. The average absolute error to the reference growth trajectory is 5.2 × $10^{6}$ cells/mL. The controller proves robust in keeping the process on the desired growth profile.


Introduction
Generally, uncertainty can be considered as a result of some information deficiency in any problem-solving situation [1]. When dealing with bioprocesses under real conditions, it is rarely possible to completely avoid uncertainty. The reasons for uncertainty are quite diverse. On the one hand, there are large variations in raw material quality, especially in the food and beverage sector. On the other hand, there is the intrinsic nonlinear behavior of the microorganisms used, which in most cases is still not fully understood. Therefore, existing process models are affected by incomplete or fragmentary knowledge about the underlying mechanisms. With respect to process monitoring and control, uncertainty is almost inseparable from any real-time measurement, resulting from a combination of inevitable measurement errors and resolution limits of the applied sensors. At the cognitive level, uncertainty stems from the vagueness and ambiguity inherent in human language and the semantics of assessment [2]. Because in most cases the sources of uncertainty cannot easily be eliminated from a physical point of view, several approaches have been proposed in the literature that handle uncertainties by the use of statistics. A general overview of (multivariate) statistical process control (SPC) and quality control is given in [3][4][5][6][7][8] and, with special focus on food, in [9][10][11]. With respect to online process observation and quality monitoring, the use of online control charts is emphasized [12,13]. Online control charts are a very powerful tool in decision-making. They serve as a human-machine interface and thus allow the operator to evaluate the process in real time. By means of simple statistics, they allow calculating and graphically visualizing whether the current process is running inside or outside its allowed limits. In order to represent the process, key performance indicators and critical quality attributes have to be defined on a univariate or multivariate basis. Several charting techniques exist that ease the process of statistical quality control; on a single variable basis they are comprehensively reviewed in [14]. However, the majority of SPC approaches presented in the literature consider SPC as a pure monitoring system. Although quite interesting work has been done using fuzzy logic approaches to handle the uncertainty related to the construction [15][16][17][18][19][20] or the evaluation of control charts with respect to quality attribute changes [21], there has been little investigation of how to integrate the information delivered by the SPC system into a feedback control system in order to keep the process within its statistical borders. This shortcoming is mentioned as well by Woodall, Montgomery, and Stoumbos [22][23][24][25].
With respect to automatic process control, fuzzy logic has also become a powerful tool in intelligent control of biological systems due to its capability to handle complex nonlinear processes and uncertainty in data [26][27][28][29]. The concept of fuzzy logic was first introduced by Zadeh [30]. It uses the principle of linguistic description by means of IF-THEN rules in order to mimic human reasoning and process assessment. Therefore, it is a good platform for the design of controllers that are subjected to uncertain process behavior.
However, the classic fuzzy controller has several drawbacks. In particular, a major drawback is the lack of a learning capability. Classical fuzzy systems are static, and their practical implementation and optimization are done by trial and error, based on the experience of an expert who knows the process and how it should be controlled. However, with respect to fast controller implementation and finding the optimal parameter configuration of the fuzzy sets in order to reach the required controller performance, the method of trial and error is quite cumbersome and often results in inefficient and suboptimal configurations of the control parameters. The optimal configuration can be "hidden" in the data. Therefore, in this work a genetic algorithm was used in order to provide additional intelligence and the ability to learn to the fuzzy controller. The genetic algorithm optimizes the control performance with a data-driven approach. The overall control strategy, which is represented by the rulebase, uses the cognitive knowledge of an expert.
In this approach, the process control architecture is realized by an automated feedback control system based on fuzzy logic. The fuzzy system is linked to SPC in order to control and monitor the process of yeast propagation. The developed fuzzy controller adjusts the process temperature in order to keep the process within statistical corridors of the controller input variables, which are the control error and the temporal derivative of the control error. Within the framework of SPC, the statistical corridors, respectively, upper and lower control limits of the input variables, are derived from historical data of batches that met the required quality specifications. Shewhart control charts ($\bar{x}$-charts) are used to calculate the ideal trajectory $\bar{x}$, the upper control limit (UCL), and the lower control limit (LCL) of the input variables. With respect to the control quality this means that if the control error stays within the statistical borders, the process and the control meet the required and predefined quality and performance criteria with a probability of 99.73%. The adjustment of the fuzzy controller parameters is done by a genetic algorithm. The heuristic search mechanism of the genetic algorithm is able to find a suitable parameter configuration of the fuzzy sets. The advantage therefore lies in the combination of fuzzy logic and genetic algorithms. The fuzzy system holds the principal expert knowledge of how best to control the process, and the genetic algorithm is used to optimize the expert knowledge by providing learning capability and efficient solution finding in a large search space.

Control Charts and Data Pool.
The standard Shewhart chart consists of a centerline to monitor the process mean and the upper and lower control limits, which are calculated from historic process data. The control limits are usually set at ±3 times the standard deviation from the centerline, which is simply the arithmetic average. For normally distributed data, 99.73% of all observations fall within these limits, so batches that run within them are regarded as being in control and as meeting the specified quality requirements.
The process for which the system was developed is the brewer's yeast propagation process, which is a typical and representative process affected by various sources of uncertainty. In general, yeast propagation is performed as a batch process, whereby the yeast undergoes the different growth phases of a static culture (lag phase, exponential phase, transition or deceleration phase, stationary phase, and degeneration). The individual phase duration and the transition time from one phase to another depend on various factors. For example, the lag phase depends on the physiological state of the inoculum and the specific growth medium [31]. The physiological state in turn depends on storage conditions and the upstream treatment of the yeast used as inoculum [32]. Furthermore, the growth behavior is influenced by the substrate, which is beer wort. Its composition in turn depends on natural variations of the raw materials used. In consequence, the effects of substrate limitations on the metabolic behavior due to unavoidable variations in available carbohydrates, nitrogen, zinc, or vitamins are subject to uncertainty. Additionally, metabolic regulation effects occurring under brewing related conditions have to be taken into account. In this regard, the most important regulation mechanism affecting the different metabolic pathways is the Crabtree effect [33]. The Crabtree effect, which is also known as overflow metabolism, catabolite repression, aerobic fermentation, or oxidoreductive metabolism, leads to the formation of ethanol when a critical glucose concentration in the substrate is exceeded [34][35][36]. In summary, the process of oxidoreductive yeast propagation is affected by numerous sources of uncertainty that in consequence influence the observability and controllability of the process. Hence, in order to observe and control this kind of process, an intelligent online monitoring and process control system is required.
In this work the data of 11 batches were used that met the following performance and quality requirements:
(i) Cell count concentration at end of batch: ≥100 × $10^{6}$ cells/mL.
(ii) Portion of dead cells at end of batch: ≤1%.
For the experimental work, beer wort produced from standard malt extract (Weyermann®, "Bavarian Pilsner") was used as substrate for the propagation of Saccharomyces cerevisiae sp. (strain W34/70). A detailed description of the technical plant configuration, experimental procedure, and analytics is given in [37]. For the performance analysis, calculation of control charts, and the later controller design, a temperature dependent growth model by [38] was implemented. The model is based on known stoichiometric turnover and Michaelis-Menten kinetics of yeast [35,36,39]. In addition, it considers growth limitations like the Crabtree effect that occur by feeding substrate sugar concentrations above 100 g/L [33]. The effect of temperature on yeast growth, respectively, the substrate uptake, is modeled by implementing an additional temperature factor $f_{\mathrm{temp}}$ that is expressed by a square root term that was originally developed to describe the temperature effect on the growth of specific bacteria [40,41]. The specific substrate uptake $q_S$ can be represented by the following equations (1). Applied half saturation constants for limitations or inhibition were $K_S$ = 2.8 mmol/L [36], $K_N$ = 2 mmol/L [42], and $K_{i,\mathrm{eth}}$ = 500 mmol/L [43]. Furthermore, $q_{S,\max}$ = 0.486 mol/mol/h [36] denotes the maximum specific substrate uptake rate, $S$ is the substrate concentration in mmol/L, $N$ is the nitrogen concentration in mmol/L expressed as NH$_3$ equivalents, and $E$ is the ethanol concentration in mmol/L. The lag time is determined by a sigmoid function where $t_{\mathrm{lag}}$ was set to 5.6 h. $T$ is the temperature in K, and the two mathematical regression coefficients were determined to be 0.03296 and 11.98 in this work. $T_{\min}$ = 270.7616 K and $T_{\max}$ = 308.1539 K are temperatures where no further growth is observed.
Figure 1 displays the comparison of yeast cell counts (YCC) in mmol/L between the model outputs and the corresponding experimental runs (that were judged as "good" batches from a qualitative point of view) for different temperature profiles. The YCC of the batches was measured online using a turbidity sensor (optek-Danulat, AF 16). The model has a root mean squared error (RMSE) of 7.4 mmol/L and therefore shows good accuracy in predicting the cell concentration. The error $e_{YCC}$ between model and real trajectory, as well as its temporal derivative $\dot{e}_{YCC}$, is then calculated in order to establish the control charts (2). Due to the varying individual batch length, the batches were uncoupled from time. To achieve this, batch evening was performed by resampling the batches and mapping them to the shortest number of $K$ = 9120 sampling instances. Then, after mean centering and normalization with the standard deviation, the control charts are calculated over all batches for each sampling instance $k = 1, \ldots, K$ (3).
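The batch evening and control chart calculation described above can be sketched as follows. This is a minimal illustration under stated assumptions and not the authors' implementation: the function names `resample_batch` and `shewhart_limits` are hypothetical, linear interpolation is assumed for the resampling, and the mean centering and normalization step is omitted for brevity.

```python
import numpy as np

def resample_batch(signal, n_points):
    """Map one batch trajectory onto n_points equally spaced instances.

    Assumption: linear interpolation on a normalized time axis.
    """
    signal = np.asarray(signal, dtype=float)
    old_grid = np.linspace(0.0, 1.0, signal.size)
    new_grid = np.linspace(0.0, 1.0, n_points)
    return np.interp(new_grid, old_grid, signal)

def shewhart_limits(batches, n_points, n_sigma=3.0):
    """Return (centerline, UCL, LCL) per sampling instance.

    The centerline is the arithmetic mean over all batches; the limits
    are placed n_sigma sample standard deviations above and below it.
    """
    aligned = np.vstack([resample_batch(b, n_points) for b in batches])
    center = aligned.mean(axis=0)
    sigma = aligned.std(axis=0, ddof=1)  # sample standard deviation
    return center, center + n_sigma * sigma, center - n_sigma * sigma
```

In the setting of this work, `batches` would hold the error trajectories of the 11 reference batches and `n_points` would be 9120; the same call would be repeated for the derivative of the error.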

The Fuzzy Controller.
The applied fuzzy temperature controller is a Mamdani type controller [44,45] that consists of the standard components of fuzzification, inference engine with rulebase, and defuzzification. The fuzzification of the input variables $e_{YCC}$ (difference in biomass concentration between the reference process model and the real measurement) and its temporal derivative $\dot{e}_{YCC}$ is done via piecewise linear functions, respectively, trapezoidal fuzzy sets. In this context, the fuzzy variable $e_{YCC}$ is assigned to the linguistic expressions low, matched, and high. Similarly, the fuzzy variable $\dot{e}_{YCC}$ is linked to the verbal terms slower, matched, and faster. The fuzzy output variable comprises three fuzzy sets, namely, neg, zero, and pos. Here, the output is a temperature increment $\Delta T$ that is added to the initial temperature at the start of the process. The inference engine has the task of mapping the input variables to the output variable of the controller by taking into account the logical statements defined in the rulebase. In this case a standard max-min method was applied [46]. The rulebase contains the rules in "IF-THEN" form that determine the basic control strategy in order to follow an optimal growth trajectory delivered by the process model; it relates the linguistic labels of the two inputs to the labels of the output. At first, the fuzzy set parameterization for each variable was done uniformly across the individual universe of discourse. Each fuzzy set is defined by the characteristic points of its piecewise linear membership function. For example, the support (the set of points on the variable domain where the membership function value is greater than zero) and slopes of the trapezoidal fuzzy set matched of $e_{YCC}$ are characterized by the four points −5, −0.01, 0.01, and 5. In general, the membership function $\mu(x)$ of a trapezoidal set over the universe of discourse $U$ is given by [46]

$$\mu(x) = \begin{cases} 0, & x \le a \\ \dfrac{x - a}{m_1 - a}, & a < x < m_1 \\ 1, & m_1 \le x \le m_2 \\ \dfrac{b - x}{b - m_2}, & m_2 < x < b \\ 0, & x \ge b \end{cases} \qquad x \in U,$$

where $a$ denotes the leftmost point of the trapezium, $m_1$ is the left center point, $m_2$ is the right center point, and $b$ represents the rightmost point. Figure 2 gives a schematic representation of the fuzzy temperature controller.
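As a small illustration of the trapezoidal membership function, the following sketch evaluates the set matched with the four characteristic points quoted above. The function name `trapezoid` and the argument names `a`, `m1`, `m2`, `b` are chosen here for illustration only.

```python
def trapezoid(x, a, m1, m2, b):
    """Membership degree of x in a trapezoidal fuzzy set.

    a: leftmost point, m1: left center point,
    m2: right center point, b: rightmost point (a <= m1 <= m2 <= b).
    """
    if m1 <= x <= m2:
        return 1.0                      # plateau
    if a < x < m1:
        return (x - a) / (m1 - a)       # rising slope
    if m2 < x < b:
        return (b - x) / (b - m2)       # falling slope
    return 0.0                          # outside the support

# Set "matched" of the control error with the points given in the text.
matched = (-5.0, -0.01, 0.01, 5.0)
print(trapezoid(0.0, *matched))    # inside the plateau
print(trapezoid(2.505, *matched))  # halfway down the falling slope
```

Setting `m1 == m2` turns the trapezium into a triangle, which is the special case mentioned later for the genetic tuning.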
The defuzzification uses the center of gravity (CoG) method [46] to transform the result back from the linguistic to the numerical domain and to calculate a crisp output value. The crisp output value of $\Delta T$ is then used as an incremental change of the process temperature $T$ at the current point in time $k$. In this update, cph is equal to 360 and denotes the number of cycles per hour, which follows from the chosen sampling time of 10 s (3600 s per hour divided by 10 s per cycle).
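The chain of fuzzification, max-min inference, and CoG defuzzification can be sketched as follows. Everything specific in this sketch is an assumption made for illustration: the rule table `RULES`, all set parameters except the four points of matched, the output universe of [-1, 1], the sign convention (error = measured minus reference), and the update `T + delta_t / cph` are hypothetical and are not taken from the original controller.

```python
import numpy as np

# Each set is (breakpoints, memberships) for np.interp; values outside the
# breakpoints are clamped, which produces shoulder sets at the domain edges.
E_SETS = {
    "low": ([-5.0, -0.01], [1.0, 0.0]),                            # hypothetical
    "matched": ([-5.0, -0.01, 0.01, 5.0], [0.0, 1.0, 1.0, 0.0]),   # from the text
    "high": ([0.01, 5.0], [0.0, 1.0]),                             # hypothetical
}
DE_SETS = {"slower": E_SETS["low"], "matched": E_SETS["matched"],
           "faster": E_SETS["high"]}                               # hypothetical
OUT_SETS = {
    "neg": ([-1.0, 0.0], [1.0, 0.0]),
    "zero": ([-0.5, 0.0, 0.5], [0.0, 1.0, 0.0]),
    "pos": ([0.0, 1.0], [0.0, 1.0]),
}
# Hypothetical rulebase: (label of e, label of de) -> label of delta T.
RULES = {
    ("low", "slower"): "pos", ("low", "matched"): "pos", ("low", "faster"): "zero",
    ("matched", "slower"): "pos", ("matched", "matched"): "zero",
    ("matched", "faster"): "neg",
    ("high", "slower"): "zero", ("high", "matched"): "neg", ("high", "faster"): "neg",
}
Y = np.linspace(-1.0, 1.0, 201)  # discretized output universe


def mu(fuzzy_set, x):
    """Membership degree(s) of x in a piecewise linear fuzzy set."""
    xp, fp = fuzzy_set
    return np.interp(x, xp, fp)


def fuzzy_delta_t(e, de):
    """Max-min inference followed by center of gravity defuzzification."""
    aggregated = np.zeros_like(Y)
    for (label_e, label_de), label_out in RULES.items():
        strength = min(mu(E_SETS[label_e], e), mu(DE_SETS[label_de], de))  # AND = min
        clipped = np.minimum(strength, mu(OUT_SETS[label_out], Y))        # implication
        aggregated = np.maximum(aggregated, clipped)                      # aggregation
    if aggregated.sum() == 0.0:
        return 0.0
    return float((Y * aggregated).sum() / aggregated.sum())


def next_temperature(temperature, delta_t, cph=360):
    """Assumed update: apply the crisp increment scaled by cycles per hour."""
    return temperature + delta_t / cph
```

With these assumed sets, a measured cell count below the reference that is also growing more slowly yields a positive increment, and the mirrored situation yields a negative one.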

Genetic Tuning of Fuzzy Sets.
There is a wide range of bioengineering and food related applications, where fuzzy logic controllers and expert systems have been successfully used [26][27][28][47][48][49][50][51][52][53]. However, they show a deficiency in knowledge acquisition and their parameterization relies to a great extent on empirical and heuristic knowledge. Moreover, in-field tuning and performance adjustment is mostly done by trial and error, which can be very inefficient and time-consuming depending on the complexity of the process to be controlled. The combination of evolutionary optimization methods and fuzzy logic allows incorporating information that is present in the process data in order to automatically adjust the controller parameters and to add a certain degree of intelligence. In this case, genetic algorithms play a significant role as search techniques for handling complex spaces and were successfully applied in many fields such as artificial intelligence, (bio)engineering, and robotics [54][55][56][57]. In this work, a genetic algorithm (GA) was used to tune the input and output membership functions so that the control error stays within its statistical borders.

Figure 3: Illustrative representation and coding scheme of input and output fuzzy set parameters on a chromosome. For example, $A_{1,1}$ is the first parameter of the fuzzy set low of the input variable $e_{YCC}$ and $A_{1,2}$ is the second parameter. Following this, all parameters are coded on a chromosome, where $B$ denotes the second input variable $\dot{e}_{YCC}$ and $C$ represents the output variable.
The genetic algorithm consists of initialization, rank-based selection, crossover, and mutation. In the beginning the settings of the GA are initialized. A population size of 40 individuals was chosen. The selection rate was set to 0.5 and the mutation rate was fixed to 0.02. The maximum number of iterations was set to 120. Instead of binary coding, real coding of fuzzy set parameters on the chromosomes was applied. A method similar to that suggested by [58,59] was chosen to encode the fuzzy parameters. Trapezoidal fuzzy sets were used because this allows the GA to change the set form into triangular sets as well, which are a special form of a trapezium if the two center points are allowed to take equal values. With respect to the coding scheme some restrictions have to be made in order to maintain the order of the linguistic labels. Each trapezoidal shaped membership function or label of a fuzzy variable is parameterized by a 4-tuple of real values. Therefore, an individual of the population (chromosome) is encoded as the concatenation of these 4-tuples, $(A_{1,1}, \ldots, A_{3,4}, B_{1,1}, \ldots, B_{3,4}, C_{1,1}, \ldots, C_{3,4})$. In this assignment, $A$ denotes the first fuzzy variable $e_{YCC}$, $B$ represents $\dot{e}_{YCC}$, and $C$ is the output variable $\Delta T$. Each variable has 3 labels and each label consists of 4 characteristic points (alleles). Thus $p = (1, \ldots, 3)$ and $q = (1, \ldots, 4)$. For the initial population, the first individual was fixed and the set parameters of the original fuzzy controller were encoded on its chromosome. The remaining population was initialized randomly within each variable's domain. However, some constraints with respect to the semantics of the ordering relation and completeness have to be considered [58]. In this context, the ordering of the labels was fixed and for each fuzzy set the sequence of the characteristic points was fixed in order to maintain the order of the linguistic labels. For example, in the case of $A$, low < matched < high holds for the label ordering and $A_{p,1} < A_{p,2} \le A_{p,3} < A_{p,4}$ holds for the sequence of set points. This boundary condition is valid for the mutation operation as well. Figure 3 shows how the fuzzy set parameters are coded on the chromosomes.
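The real-coded chromosome and the ordering constraints can be illustrated with the following sketch. The helper names are hypothetical, and the label ordering low < matched < high is interpreted here as "no characteristic point of a label lies to the left of the corresponding point of the preceding label", which is one possible reading and an assumption of this example.

```python
import numpy as np

N_VARS, N_LABELS, N_POINTS = 3, 3, 4  # 3 variables x 3 labels x 4 points = 36 alleles


def encode(fuzzy_sets):
    """Flatten a (variable, label, point) parameter array into a chromosome."""
    return np.asarray(fuzzy_sets, dtype=float).reshape(-1)


def decode(chromosome):
    """Restore the (variable, label, point) layout from a chromosome."""
    return np.asarray(chromosome, dtype=float).reshape(N_VARS, N_LABELS, N_POINTS)


def is_feasible(chromosome):
    """Check point ordering within each set and label ordering per variable."""
    sets = decode(chromosome)
    p1, p2, p3, p4 = (sets[..., q] for q in range(N_POINTS))
    points_ok = np.all((p1 < p2) & (p2 <= p3) & (p3 < p4))
    # Assumed reading of "low < matched < high": pointwise non-decreasing labels.
    labels_ok = np.all(np.diff(sets, axis=1) >= 0.0)
    return bool(points_ok and labels_ok)
```

A check of this kind would be applied after random initialization and after every mutation, so that infeasible individuals are rejected or redrawn.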
Including a priori knowledge, the set parameters $A_{1,1}$, $A_{1,2}$, $A_{3,3}$, $A_{3,4}$, $B_{1,1}$, $B_{1,2}$, $B_{3,3}$, $B_{3,4}$, $C_{1,1}$, and $C_{3,4}$ were hard coded with their initial values in order to cover the whole universe of discourse. Thus, they are not altered by crossover and mutation. The remaining parameters $\Gamma_{p,q}$ of the trapezoidal fuzzy sets were allowed to take values within bounded intervals of adjustment, where $\Gamma$ represents $A$, $B$, and $C$, respectively. The whole population is simulated using the growth model, and the cost, respectively, fitness of each individual is calculated using the RMSE:

$$\mathrm{RMSE} = \sqrt{\frac{1}{K} \sum_{k=1}^{K} \left( \hat{y}_k - y_k \right)^2}$$

Here, $\hat{y}_k$ denotes the YCC predicted by the model at the sampling point $k$, $y_k$ is the YCC of the reference trajectory at the same point in time, and $K$ is the number of sampling points. Rank-based selection [60] is used in order to choose the best solutions. The best 20 individuals are chosen to form the mating pool and the pairing is done randomly within the pool. The crossover operation is done by calculating the mean of the corresponding alleles of each mating pair, which corresponds to whole arithmetic crossover [61]. In this crossover method two offspring $H_j = (h_1^j, \ldots, h_i^j, \ldots, h_n^j)$, $j = 1, 2$, are computed from two parental chromosomes $P_1 = (p_1^1, \ldots, p_n^1)$ and $P_2 = (p_1^2, \ldots, p_n^2)$ selected to apply the crossover operator:

$$h_i^1 = \lambda p_i^1 + (1 - \lambda) p_i^2, \qquad h_i^2 = \lambda p_i^2 + (1 - \lambda) p_i^1,$$

where $\lambda$ is a constant (uniform arithmetical crossover) and was chosen to be equal to 0.5. According to this the population is filled up again with 20 new offspring. Finally, 2% of the new population is mutated by randomly altering one allele on a chosen chromosome. The mutation is done in compliance with the restrictions of ordering. Figure 4 shows schematically the flow of information during the genetic tuning process.
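The interplay of rank-based selection, arithmetic crossover, mutation, and elitism can be sketched as a compact loop. This is an illustration under assumptions, not the implementation used in this work: `run_ga` and `arithmetic_crossover` are hypothetical names, the cost function is passed in by the caller (in the paper it would be the RMSE of a simulated controller run), the whole population is drawn randomly instead of seeding the first individual with the original controller, and the ordering constraints and hard-coded alleles are left out.

```python
import numpy as np


def arithmetic_crossover(parent_1, parent_2, lam=0.5):
    """Whole arithmetic crossover; lam = 0.5 gives the allele-wise mean."""
    child_1 = lam * parent_1 + (1.0 - lam) * parent_2
    child_2 = lam * parent_2 + (1.0 - lam) * parent_1
    return child_1, child_2


def run_ga(cost_fn, lower, upper, pop_size=40, sel_rate=0.5,
           mut_rate=0.02, n_iter=120, seed=0):
    """Real-coded GA with rank-based truncation selection and elitism."""
    rng = np.random.default_rng(seed)
    pop = rng.uniform(lower, upper, size=(pop_size, lower.size))
    keep = int(sel_rate * pop_size)          # size of the mating pool
    for _ in range(n_iter):
        cost = np.array([cost_fn(ind) for ind in pop])
        parents = pop[np.argsort(cost)[:keep]]   # best individuals, best first
        children = []
        while len(children) < pop_size - keep:
            i, j = rng.choice(keep, size=2, replace=False)  # random pairing
            children.extend(arithmetic_crossover(parents[i], parents[j]))
        pop = np.vstack([parents, np.array(children[:pop_size - keep])])
        # Mutate a fraction of the population; index 0 (the elite) is spared.
        n_mut = max(1, round(mut_rate * pop_size))
        for idx in rng.choice(np.arange(1, pop_size), size=n_mut, replace=False):
            gene = rng.integers(lower.size)
            pop[idx, gene] = rng.uniform(lower[gene], upper[gene])
    cost = np.array([cost_fn(ind) for ind in pop])
    return pop[np.argmin(cost)], float(cost.min())
```

Note that with a weight of 0.5 both offspring of a pair are identical, since each is the allele-wise mean of its parents; diversity then comes from the random pairing and from mutation.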
The control strategy was implemented using a PLC system (Beckhoff, CX9000) with standard I/O terminals for analogue inputs and outputs (4-20 mA). On the PLC a PID temperature controller was programmed. The fuzzy system, the genetic algorithm, and the SPC monitoring system run on a separate PC in a framework similar to a SCADA (Supervisory Control and Data Acquisition) system. The fuzzy system reads from the PLC, recalculates a new set point for temperature, and writes it to the PID controller on the PLC. The communication between the PLC system and the external PC, respectively, the SCADA system, is done via Ethernet (TCP/IP). The software used for the SCADA system is in-house developed C++ based software named Virtual Expert®.

Results and Discussion
After the genetic tuning process the best individual of the simulations (RMSE = 4.03 × $10^{6}$ mmol/L) was chosen for experimental validation. The obtained genetically tuned fuzzy sets are shown in Figure 5. It can be observed that the slopes, shapes, and positions of the different sets have been changed. In particular, the part of the fuzzy sets linked to the linguistic value matched with a membership value equal to one has been enlarged compared to the original, almost triangular sets. This makes the controller behavior more robust against process disturbances and therefore leads to less fluctuation in temperature.
Prior to genetic tuning, the yeast propagation process was run with the nonadapted, uniformly parameterized fuzzy controller described in Section 2.2. As expected, the control performance of this fuzzy controller did not meet the requirements and exceeded the allowed control corridors. As a result, large changes in process temperature exceeding 10 K within 15 hours were recorded. As a consequence, the required performance specifications could not be met. After 2 days of propagation, less than 70 × $10^{6}$ cells/mL and around 5% of dead cells were detected by microscopic plate count [62]. The experiment was then repeated 4 times using the tuned fuzzy controller. The corresponding control charts are shown in Figure 6. As shown, the original controller exceeds the control limits, which is indicated by the arrows. In contrast, the adjusted fuzzy controller is able to keep the process within the statistical borders and therefore meets the performance requirements. Furthermore, a comparison of the controller outputs shows that, in contrast to the nontuned fuzzy controller, the tuned controller alters the process temperature only when necessary. The original controller parameterization leads to a permanent change in temperature resulting in an oscillatory, unstable behavior. By applying the genetic tuning process this behavior could be avoided, leading to a smoother change of the temperature. Using the tuned fuzzy controller, on average, a cell concentration of about 185 × $10^{6}$ cells/mL and less than 1% of dead cells were achieved after two days of cultivation. The RMSE between the reference trajectory and the online measured YCC is 5.2 × $10^{6}$ cells/mL. This shows that there is a good match and that the fuzzy controller is able to lead the process along the desired growth profile. The control charts are projected online; thus the user is permanently informed whether the process is in control or whether any deterioration is occurring. However, it has to be noted that the immediate and specific identification of the cause of undesired process behavior would need some additional process knowledge and experience. Here, the quality of the process is merely linked to the control performance. So, if there was, for example, contamination with another microorganism or an undersupply of oxygen, the process would leave the corridors and one could directly observe this in the control charts, but one would not be able to tell the reason for it without the corresponding experience and process knowledge. Therefore, a multivariate approach in combination with recent fuzzy control chart evaluation methods [21] is currently under investigation in order to link further quality attributes with the corresponding key performance indicators of the process. This would be a further step in combining online (multivariate) statistical process monitoring and direct, intelligent feedback process control techniques.
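The online evaluation described above, which flags departures from the statistical corridor and summarizes tracking quality, can be sketched as follows. The function names `violations` and `rmse` are hypothetical, and the control limits may be scalars or arrays of the same length as the trajectory.

```python
import numpy as np


def violations(trajectory, lcl, ucl):
    """Indices of sampling instances that lie outside the control limits."""
    trajectory = np.asarray(trajectory, dtype=float)
    outside = (trajectory < lcl) | (trajectory > ucl)
    return np.flatnonzero(outside)


def rmse(measured, reference):
    """Root mean squared error between measured and reference trajectories."""
    measured = np.asarray(measured, dtype=float)
    reference = np.asarray(reference, dtype=float)
    return float(np.sqrt(np.mean((measured - reference) ** 2)))


# A batch counts as in control when no instance leaves the corridor.
error_trajectory = [0.0, 4.0, -4.0, 1.0]
print(violations(error_trajectory, lcl=-3.0, ucl=3.0))
```

Such a check only reports that the corridor was left and where; as noted above, it does not identify the cause of the deviation.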

Conclusion
In this work an approach is presented to couple statistical process monitoring with an intelligent feedback control based on fuzzy logic for handling processes affected by uncertainty in food production and fermentative processes in the life sciences. The system is demonstrated on the process of brewer's yeast propagation. For that purpose, the fuzzy controller parameters are adjusted using a genetic tuning algorithm in order to meet the required quality and performance criteria. Subsequent to the simulations, an experimental verification was performed using a 120 L medium-scale propagation system. The obtained results show that the performance of the control system is directly linked to process quality. By staying within the statistical control limits, the required biomass concentration of 100 × $10^{6}$ cells/mL was exceeded, reaching up to 185 × $10^{6}$ cells/mL, whereby the RMSE to the reference growth trajectory is 5.2 × $10^{6}$ cells/mL. However, the remaining future challenge is to specifically identify the cause of a process anomaly without having the corresponding experience or knowledge about the process. Therefore, current investigations strive for a combined approach of multivariate modeling and fuzzy control chart evaluation to link specific quality attributes and the control performance of the process, which would be a step forward in combining online (multivariate) statistical process monitoring and direct, intelligent feedback process control techniques.

Figure 1: Comparison of cell count (YCC) of historical batches and model output. The circles (o) show the YCC model output in mmol/L, the solid line (-) is the experimental trajectory of YCC in mmol/L, and the dashed line (---) denotes the temperature in K. The duration, respectively, batch length, varied between 24 and 46 hours. Initial conditions of yeast cell concentration varied between 4.1 mmol/L and 14.9 mmol/L. The final state of the yeast cell concentration ranged from 146.3 mmol/L to 191.9 mmol/L. The temperature varied from 283 K to 291 K.
Rulebase of the fuzzy temperature controller (IF-THEN rules; first antecedent: $e_{YCC}$ is low, matched, or high).

Figure 2: Schematic representation of the nontuned fuzzy controller structure and the flow of information. The first step shows the fuzzy partition and the fuzzification of the input variables $e_{YCC}$ and $\dot{e}_{YCC}$ into their specific fuzzy sets. The second step diagrammatically shows the inference mechanism and the activation of the corresponding rules in the rulebase. The third step illustrates the set partition of the fuzzy output variable $\Delta T$ and the accumulation and CoG defuzzification of the overall implied fuzzy set into a crisp output.

Figure 4: After initialization each individual of the population passes its parameter vector to the fuzzy controller. The controller is then simulated using the process model described in Section 2.1. Following the principle of elitism, the best solutions of each iteration are kept in order to create the next population.

Figure 5: New set configuration of the fuzzy controller after the genetic tuning process. Slopes, shapes, and positions of the sets were changed. In particular, the sets denoted as matched were generally enlarged.

Figure 6: Online control charts for control error $e_{YCC}$ (a) and its derivative $\dot{e}_{YCC}$ (b). The solid bold lines (-) are the statistical values of the upper control limit, mean, and lower control limit. The dotted (⋅ ⋅ ⋅) curve is the trajectory that belongs to the nontuned fuzzy controller. The arrows mark where it leaves the statistical corridor. The remaining lines (-⋅-) stay clearly within the control limits and show the error trajectories resulting from the genetic tuned fuzzy controller. Graph (c) shows the output of the nontuned fuzzy controller (⋅ ⋅ ⋅) in comparison to the output of the genetic tuned controller (---).