A Methodology for Optimization in Multistage Industrial Processes: A Pilot Study

The paper introduces a methodology for optimization in multistage industrial processes with multiple quality criteria. Two formulations of the optimization problem and four approaches to solving it are considered. The proposed methodologies were first tested on a virtual process described by benchmark functions and then applied to the optimization of a multistage lead refining process.


Introduction
The main aim of our research is the development of an optimization strategy appropriate for tasks based on a multistage and multithread chain structure (i.e., represented by a linear order or, more generally, an acyclic graph), where many intermediate target functions are combined in a nontrivial way to give optimal quality of the final product. The inspiration for our considerations comes from the authors' earlier research on modelling of complex production chains in the metallurgical industry, for example, the oxidizing roasting process of zinc sulphide concentrates [1] or metal forming processes [2,3].
To serve production purposes, such lines are usually composed of many aggregates or intermediate stages, where the output (semiproduct) of one intermediate stage becomes the input of the consecutive production stage and where specific production needs are imposed at every stage. In most cases the production chains have a linear structure (Figure 1) or a tree structure (Figure 2). Each node represents a certain stage of production, and vector $x_i$ represents a semifinished product created at that stage, while vector $p_i$ consists of control parameters. Quality checks or intermediate goals $q_i$ can be applied at all or at chosen production stages.
Optimization of such a cycle is not trivial, as will be shown in this work. Most research papers devoted to optimization in industrial production treat one particular stage of the production and try to develop an optimal control strategy that yields the best possible semiproduct. This strategy, while definitely successful for optimal production at that particular stage, is not sufficient for reaching the best possible final product of the whole production cycle. Technological constraints at one stage can rarely contain all the information on constraints and goals of the consecutive stages of the chain (they are too complex to be defined in a single criterion and too dispersed to be quantified).
In practice, it is usually possible to increase the efficiency of the process as a whole by taking a global look at all stages of production. For example, it may be worthwhile to decrease the quality of a semiproduct (e.g., by decreasing the energy consumption or the amount of necessary chemical reagents at further stages) while maintaining sufficient quality of the final product. Such a possibility is unavailable when we perform sequential optimization, since we do not have sufficient insight into the interactions between particular stages of the whole chain. It is therefore reasonable to investigate and develop an optimization methodology applicable to the chain as a whole, which opens new possibilities for the optimization of industrial processes.

Figure 1: Production chain of linear structure.
The main goal of the present research is to solve problems with optimization objectives built on chains (acyclic graphs) with internal interrelations between separate stages. In the simplest case it will be a linearly ordered sequence of intermediate stages, similar to that presented in Figure 1. Each stage receives an input (supply) or semiproduct $x_i$ and control parameters $p_i$ and returns, as a result, another semiproduct $x_{i+1}$ with quality value $q_i$. Each incoming parameter ($x_i$ and/or $p_i$), as well as the parameters of the outgoing product $x_{i+1}$ and its quality value $q_i$, can be limited by numerous constraints induced by technological restrictions, admissible values, characteristics of the process, and so forth.
Each separate stage of a hypothetical industrial process (e.g., a metallurgical process) can be described in terms of input and control parameters. The distinction between $x$ and $p$ is motivated by the fact that we can control some technological parameters, while we cannot control the parameters of supplies or semiproducts that we receive at the beginning of the stage; for example, we cannot control on-line the chemical or technological parameters of the supply. The parameters of the semiproduct for stage $i+1$ are a result of the process at stage $i$, which can be written as
$$x_{i+1} = F_i(x_i, p_i) \tag{1}$$
with some additional constraints motivated by technological restrictions. Hence, optimization of an $n$-stage linear production chain can be presented as optimization of the following composition of functions, where the function at stage $i+1$ depends on all previous functions:
$$x_{n+1} = F_n(F_{n-1}(\cdots F_2(F_1(x_1, p_1), p_2) \cdots, p_{n-1}), p_n). \tag{2}$$
Let us observe that numerous problems arise immediately. First of all, the entry data $x_i$ at each stage can be very specific, carrying many technological or practical restrictions (constraints of optimization). In practice this means that in many cases we will be able to calculate the value of the final quality function $q_F$ only for some restricted set of parameters $x_i$ or $p_i$. As a consequence, it is hard to speak about a global objective function, since for the majority of parameters it cannot be properly defined. Additionally, the intermediate quality functions $q_i$ can be important for the assessment of the quality of the whole process. These values may not be visible "in sensu stricto" in the final quality function $q_F$; however, they can have an important impact on the whole process and its optimality. Therefore, even in the simplest case of a linear production chain, a list of natural questions arises, and the answers to these questions are of primary importance for the optimization procedure.
(1) How to properly synchronize the parameters of semiproducts $x_i$ at stage $i$ with the constraints of the function $F_{i+1}$?
(2) How to build an optimization strategy to achieve satisfactory effects (intermediate quality criteria) at each stage and at the end of the chain when the overall quality function of the resulting product is calculated?
(3) Is it better to apply sequential treatment of the problem at some stages, or is it better to search for a more global strategy (when possible)?
The main objective of the present research is to answer the above questions or, more precisely, to find an appropriate methodology that can be applied to multiobjective, multithreaded optimization problems.
The paper presents preliminary results of research on the optimization methodology for production chains. The results are limited to the linear structure of the production chain (Figure 1). The quality criteria are applied at the intermediate stages as well as at the very end of production (quality requirements of the final product). Results of the optimization of hypothetical production chains described by benchmark functions are presented first. They are followed by the results of validating the proposed optimization strategy on real industrial data originating from the initial stages of the lead refining production cycle.

Formulation of the Problem
Assume that we have a technological process consisting of $n$ aggregates interconnected in a linear structure (see Figure 1). We assume that each stage, say with number $i$, demands as an input a vector $x_i$ of semiproducts from the previous step (we also have $x_1$, which can be a semiproduct from a previous production cycle, etc.) and a vector $p_i$ of controlling parameters (optimization design variables). We have direct access to $p_i$, which can be freely modified in this step (within the admissible range, of course), while we have only a small influence on $x_i$, which can be modified by a change in the control parameters at previous stages $j < i$. Clearly, in practice, we do not have complete freedom in the choice of values of the control vector, nor can we accept all resources delivered to our stage (component of the industrial process). We are always restricted by production demands, ranges of parameters that ensure proper execution of the technological process, and so forth. To cover all these restrictions we will assume that at each stage we have the following constraints:
$$(x_i, p_i) \in D_i, \tag{3}$$
where the set $D_i$ aggregates all constraints. In most practical applications $D_i$ will simply be a collection of inequalities on the respective coordinates. However, in general, we cannot ensure that there will be no interactions between admissible values of all vectors; for example, the admissible range of some coordinates in $x_i$ can determine the range of some parameters (coordinates in $p_i$), and so forth.
To describe the effect of the production process at stage $i$ we will use two functions $F_i$, $Q_i$ which define
$$x_{i+1} = F_i(x_i, p_i), \qquad q_i = Q_i(x_i, p_i). \tag{4}$$
Here $x_{i+1}$ is a vector of parameters connected with the semiproduct obtained as a result of production at stage $i$, $p_i$ is a vector of design parameters, and $q_i$ determines the quality of that production. For example, it can be the quality of the semiproduct or the quality of the production itself (e.g., usage of electricity and amount of waste). For simplicity, we do not assume that the quality $q_i$ is a vector; however, it is obviously possible to have various measures related to a single stage in the production chain. If that is the case, we assume that $q_i$ is the result of combining multiple quality objectives at that step by one of the methods presented later in Section 3.
As the result of our production chain we get the final product $x_{n+1}$ obtained in the last, $n$th aggregate. The final quality of this product is calculated as
$$q_F = Q_F(x_{n+1}). \tag{5}$$
We use the special symbols $q_F$, $Q_F$ instead of $q_{n+1}$, $Q_{n+1}$ to highlight the special, most important role of the final quality check.
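The formulation above can be illustrated with a minimal Python sketch of a linear chain. It is only an illustration under stated assumptions: the `Stage` container, the `run_chain` function, and the demo stage (transformation $x_{i+1} = p_i$ with a squared-distance quality) are hypothetical names and choices introduced here, not part of the described methodology.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Vector = List[float]


class Stage:
    """One stage of a linear chain (hypothetical container for illustration)."""

    def __init__(
        self,
        transform: Callable[[Vector, Vector], Vector],   # F_i(x_i, p_i) -> x_{i+1}
        quality: Callable[[Vector, Vector], float],      # Q_i(x_i, p_i) -> q_i
        admissible: Callable[[Vector, Vector], bool],    # True iff (x_i, p_i) is admissible
    ) -> None:
        self.transform = transform
        self.quality = quality
        self.admissible = admissible


def run_chain(
    stages: Sequence[Stage], x1: Vector, controls: Sequence[Vector]
) -> Optional[Tuple[Vector, List[float]]]:
    """Propagate x_1 through all stages.

    Returns (x_{n+1}, [q_1, ..., q_n]), or None as soon as one stage
    constraint is violated, in which case the chain output is undefined.
    """
    x = list(x1)
    qualities: List[float] = []
    for stage, p in zip(stages, controls):
        if not stage.admissible(x, p):
            return None
        qualities.append(stage.quality(x, p))
        x = stage.transform(x, p)
    return x, qualities


# Assumed demo stage: x_{i+1} = p_i, quality = squared distance between p_i and x_i,
# controls restricted to the cube [-1, 1]^d.
demo_stage = Stage(
    transform=lambda x, p: list(p),
    quality=lambda x, p: sum((pj - xj) ** 2 for pj, xj in zip(p, x)),
    admissible=lambda x, p: all(abs(pj) <= 1.0 for pj in p),
)
demo_stages = [demo_stage, demo_stage]
```

Returning `None` for an inadmissible pair mirrors the remark that the final quality can be evaluated only on a restricted set of parameters.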

Optimization Techniques
Before we proceed to a more complete description of the research objectives, let us present a few classical methodologies for solving multiobjective problems. Even at the level of a single element of the whole structure (described by an acyclic graph) it may happen that we need to optimize the parameters against a few (sometimes contradictory) criteria. For completeness we present a standard form of the multiobjective optimization task:
$$\text{(C1) find the minimum of the functions } q_1, q_2, \ldots, q_n, q_F \quad \text{(C2) subject to } g_j(x) \le 0,\; h_k(x) = 0,\; a_l \le x_l \le b_l, \tag{6}$$
where $q_1, q_2, \ldots, q_n$ are the intermediate objective functions; $q_F$ is the objective function related to the final product; $x$ are the variables; and $a$, $b$, $g$, $h$ define the constraints. Note that condition (C1) consists of several functions (criteria of utility or quality); hence, it is in general impossible to find a unique optimal solution suitable for all of them. This leads us to the notion of a Pareto-optimal solution, that is, a solution which is admissible and such that any other admissible solution is worse in at least one criterion in (C1); for example, see [4,5].
There are several techniques for dealing with multiobjective tasks. Below we recall a few standard techniques which we are going to use later on, when performing tests on concrete examples.
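As a small illustration of Pareto optimality for minimized criteria, the following Python sketch filters a finite list of candidate quality vectors. The helper names `dominates` and `pareto_front` are hypothetical, and the sketch uses the usual dominance test (no worse in every criterion, strictly better in at least one), which is an assumption about how the notion would be applied in practice.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is no worse than b in every criterion and strictly better in one."""
    no_worse = all(ai <= bi for ai, bi in zip(a, b))
    strictly_better = any(ai < bi for ai, bi in zip(a, b))
    return no_worse and strictly_better


def pareto_front(points: Sequence[Sequence[float]]) -> List[Sequence[float]]:
    """Keep the points that are not dominated by any other point in the list."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Each tuple is a vector of quality values to be minimized.
candidates = [(1.0, 4.0), (2.0, 2.0), (3.0, 3.0), (4.0, 1.0)]
front = pareto_front(candidates)
```

Here `(3.0, 3.0)` is removed because `(2.0, 2.0)` is better in both criteria, while the remaining three points are mutually incomparable.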
Weighted Sum Method. This is one of the simplest methods to apply. We assign a weight (of importance) to each quality criterion and in that way build an aggregated function measuring a kind of average quality of the process. More formally, we consider the objective function $\Phi$ whose values are computed by the formula
$$\Phi = \sum_{i=1}^{n} w_i q_i + w_F q_F. \tag{7}$$
Clearly, we can assume that all the intermediate quality values $q_i$ as well as $q_F$ are to be minimized, so minimizing $\Phi$ has a chance of leading to a proper solution [5–8].
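A minimal Python sketch of the weighted sum aggregation is given below. The function name and argument layout are illustrative assumptions; the weights themselves have to be chosen by the user.

```python
from typing import Sequence


def weighted_sum(
    qualities: Sequence[float],
    final_quality: float,
    weights: Sequence[float],
    final_weight: float,
) -> float:
    """Aggregate intermediate qualities and the final quality into one value."""
    if len(qualities) != len(weights):
        raise ValueError("one weight is required per intermediate quality")
    intermediate = sum(w * q for w, q in zip(weights, qualities))
    return intermediate + final_weight * final_quality


# Two intermediate stages plus the final quality check.
phi = weighted_sum([0.2, 0.4], 0.1, weights=[1.0, 0.5], final_weight=2.0)
```

Setting all intermediate weights to zero leaves only the final quality, while setting the final weight to zero leaves only the intermediate criteria.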
Weighted Metric Method. This is a generalization of the weighted sum method. The main difference is that instead of minimizing the weighted quality values themselves, we minimize their distance to a desired optimal (reference) solution $z^*$:
$$\min\; d(q, z^*), \qquad q = (q_1, \ldots, q_n, q_F). \tag{8}$$
We also have freedom in the choice of the metric, but the most standard one is the metric given by the $l_p$-norm with $1 \le p < \infty$. Strictly speaking, we have to minimize the following formula:
$$\Phi = \Big( \sum_{i} w_i \, |q_i - z_i^*|^p \Big)^{1/p}, \tag{9}$$
where the sum runs over all quality values, including $q_F$. This method is more demanding in practical applications than the weighted sum method. The main difficulty is that we should provide a good candidate for the reference solution $z^*$. When we do not know bounds for the values of the quality functions $q_i$, $q_F$, it may be very hard to estimate proper values of $z^*$.
Another metric, which defines the so-called Tchebycheff method, is the metric given by the $l_\infty$ norm. With this norm (9) changes to
$$\Phi = \max_{i} \; w_i \, |q_i - z_i^*|. \tag{10}$$
Method of $\varepsilon$-Constraints. In this method we replace the weights by specific constraints on all but one quality function (objective of optimization). In other words, we optimize a single quality function $q_k$ under the constraints $a_i < q_i < a_i + \varepsilon$ for all $i \ne k$. In this way we have to optimize the value of only one quality function, keeping all the other functions within properly defined bounds ($\varepsilon$-constraints). This method can be especially useful when we know bounds for "good" quality and can accept solutions in that interval. Obviously, the main difficulty here is a proper choice of the constraints (i.e., the bounding parameters $a_i$ and $\varepsilon$), which in practice, when our knowledge about the problem is limited, can be a challenging task.
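The weighted metric, Tchebycheff, and constraint-based techniques can be sketched in Python as follows. All names (`weighted_metric`, `within_constraints`, `z_ref`, `lower`, `skip`) are hypothetical, and placing the weights inside the power for finite p is one common convention, assumed here rather than taken from the text.

```python
import math
from typing import Sequence


def weighted_metric(
    q: Sequence[float],
    z_ref: Sequence[float],
    weights: Sequence[float],
    p: float = 2.0,
) -> float:
    """Weighted distance of the quality vector q to the reference point z_ref.

    For finite p >= 1 the weighted l_p form is used; p = math.inf selects the
    Tchebycheff form, i.e. the largest weighted deviation.
    """
    deviations = [abs(qi - zi) for qi, zi in zip(q, z_ref)]
    if math.isinf(p):
        return max(w * d for w, d in zip(weights, deviations))
    return sum(w * d ** p for w, d in zip(weights, deviations)) ** (1.0 / p)


def within_constraints(
    q: Sequence[float], lower: Sequence[float], eps: float, skip: int
) -> bool:
    """Check lower[i] < q[i] < lower[i] + eps for every criterion except `skip`.

    `skip` is the index of the single quality function that is optimized directly.
    """
    return all(
        a < qi < a + eps
        for i, (qi, a) in enumerate(zip(q, lower))
        if i != skip
    )
```

With `p=1` and a zero reference point the metric reduces to a weighted sum of absolute quality values, which shows in what sense it generalizes the weighted sum method.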

Optimization Strategies of Solving Multistage Problems
Elaborating an optimization strategy for multistage processes is much more complex than for single-objective or even multiobjective tasks. To define a strategy, it is necessary to (1) formulate the optimization problem, (2) choose the approach to solve the problem, and (3) choose the optimization method.
All these steps are described in the following sections.
An alternative to the global formulation is the local one. In the local formulation of the optimization problem, the objective function is any combination (described in Section 3) of the qualities of the semiproducts. The quality of the final product is not taken into account. In this case, the objective function is described by the equation
$$\Phi = \Phi(q_1, q_2, \ldots, q_n). \tag{12}$$
We assume that by ensuring optimal semiproducts we obtain the optimal final product.

Approach to Solve the Optimization Problem.
The optimization procedure for a problem formulated in one of the described ways can be carried out using different approaches. In this paper we introduce four approaches: (1) simultaneous (SIM), (2) sequential (SEQ), (3) simultaneous with freezing (SIMF), and (4) sequential with credits (SEQC).

The Simultaneous Approach.
In the simultaneous approach (SIM) the optimization procedure searches for the optimal solution for all stages at once, that is, we look for the vector composed of the control vectors at all stages $(p_1, \ldots, p_n)$. In practice, however, it is not that simple. Firstly, problems connected with high dimensionality may arise. Secondly, we have to ensure that each constraint (3) is satisfied for $i = 1, \ldots, n$; otherwise, the function $q_F$ may not be defined. If the set $D_i$ of admissible parameters of stage $i$ is very narrow for each $i$, then finding a set of admissible values of the initial vector may be almost impossible. The optimization procedure runs only once, but the number of decision variables is greater than in the SEQ or SEQC cases.

The Sequential Approach.
In the sequential approach (SEQ), each stage of the chain process (see Figure 1) is optimized separately. For stage $i = 1$ we look for the value $p_1$ that leads to the optimal value of the quality criterion $q_1$ under the additional condition $(x_1, p_1) \in D_1$. Next, the output $x_2$ (the semiproduct of that stage) is transferred to the subsequent stage $i = 2$ as the input signal, and we search for the value $p_2$ such that the quality criterion $q_2$ is optimal under the constraint $(x_2, p_2) \in D_2$.
In this way we should reach the last, $n$th aggregate, successfully generating the final product $x_{n+1}$, and consequently be able to check its quality $q_F$. Optimization at each stage ends when the value of the quality function is lower than the assumed accuracy $\varepsilon$ or when the maximum number of objective function calls $N_{\max}$ is reached. This approach, which definitely should lead to some solution, has one disadvantage: we do not have much control over the final value $q_F$ during the subsequent optimizations of the $i$th stage, despite the fact that $q_F$ strongly depends on the preceding $p_i$, which optimizes the value of $q_i$ (not necessarily the value of $q_F$). On the other hand, its advantage is that dividing the optimization into steps makes the whole process much faster. The presented approach can be used as a starting point for another, more advanced algorithm.
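The sequential loop can be sketched in Python as below. This is a simplified illustration under several assumptions: a plain random search stands in for the actual optimization method, the stage transformation is $x_{i+1} = p_i$, and the quality is the squared distance between $p_i$ and $x_i$; the function names are hypothetical.

```python
import random
from typing import List, Tuple

Vector = List[float]


def random_search(objective, dim: int, max_calls: int, eps: float,
                  rng: random.Random) -> Tuple[Vector, float, int]:
    """Stand-in optimizer (assumption): uniform sampling in the cube [-1, 1]^dim.

    Stops when the best value is not greater than eps or the call budget is spent.
    """
    best_p: Vector = []
    best_q = float("inf")
    calls = 0
    while calls < max_calls and best_q > eps:
        p = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
        q = objective(p)
        calls += 1
        if q < best_q:
            best_p, best_q = p, q
    return best_p, best_q, calls


def sequential(n_stages: int, x1: Vector, max_calls: int, eps: float,
               seed: int = 0) -> Tuple[List[Vector], List[float], int]:
    """Optimize the stages one after another; each stage has its own call budget."""
    rng = random.Random(seed)
    x = list(x1)
    controls: List[Vector] = []
    qualities: List[float] = []
    total_calls = 0
    for _ in range(n_stages):
        # Assumed stage quality: squared distance between control and input.
        def quality(p: Vector, x_in: Vector = x) -> float:
            return sum((pj - xj) ** 2 for pj, xj in zip(p, x_in))

        p, q, calls = random_search(quality, len(x), max_calls, eps, rng)
        controls.append(p)
        qualities.append(q)
        total_calls += calls
        x = p  # assumed transformation: the semiproduct passed on equals p_i
    return controls, qualities, total_calls


controls, qualities, total_calls = sequential(3, [0.0], max_calls=1000, eps=1e-3)
```

Note that each stage only sees the semiproduct it receives; nothing in the loop looks ahead to the final quality, which is exactly the limitation discussed above.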

The Simultaneous Approach with Freezing.
Optimization of all stages at once may cause deterioration of individual quality functions, because the optimization procedure will not stop when one of the quality functions reaches the given accuracy limit $\varepsilon$. To avoid this situation, the control parameters of the stages at which the quality functions are satisfactorily low are fixed and excluded from the set of decision variables. Fixing the control parameters at a certain stage is possible only if the control parameters of all previous stages are already fixed.
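The freezing rule can be expressed as a short Python helper. The function name `update_frozen` and the convention of counting frozen leading stages are assumptions made for illustration; the comparison `<=` follows the success criterion of reaching the accuracy limit.

```python
from typing import Sequence


def update_frozen(qualities: Sequence[float], eps: float, frozen: int) -> int:
    """Return the number of leading stages whose control parameters are fixed.

    A stage can be frozen only if all previous stages are already frozen and
    its own quality value has reached the accuracy limit eps.
    """
    while frozen < len(qualities) and qualities[frozen] <= eps:
        frozen += 1
    return frozen


# Stages 1 and 2 are good enough; stage 4 is too, but stage 3 blocks it.
n_frozen = update_frozen([1e-4, 5e-4, 0.2, 1e-5], eps=1e-3, frozen=0)
```

After each iteration, only the control vectors of stages with index `n_frozen` and higher would remain decision variables.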

The Sequential Approach with Credits.
The sequential approach with credits (SEQC) is a modification of the sequential approach described above. The difference between the SEQ and SEQC approaches lies in the maximum number of objective function calls at each stage. In the SEQ approach that number is the same at each stage and equals $N_{\max}$, so during optimization of the whole process the total number of available objective function calls is equal to $n \cdot N_{\max}$. In the SEQC approach the total number of objective function calls is not divided equally among the stages. During optimization of the first stage the algorithm is allowed to use any part of the total number of function calls. In the optimization of subsequent stages the total number of function calls is reduced by the number of calls already used in the preceding stages. Such a strategy allows us to "borrow" some objective function calls from the next stages and, as a consequence, the number of objective function calls can exceed $N_{\max}$ during optimization of some stages.
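The bookkeeping of credits can be sketched in Python as follows. The function `seqc_caps` and its arguments are hypothetical; it only computes how many calls each stage is allowed to use, given the calls actually consumed by the stages before it.

```python
from typing import List, Sequence


def seqc_caps(n_stages: int, n_max: int, used: Sequence[int]) -> List[int]:
    """Return the call limit available to each stage under the credits scheme.

    The shared budget is n_stages * n_max; every stage may draw on whatever
    is left after the preceding stages.
    """
    remaining = n_stages * n_max
    caps: List[int] = []
    for calls in used:
        if calls > remaining:
            raise ValueError("a stage used more calls than were available")
        caps.append(remaining)
        remaining -= calls
    return caps


# Three stages, 1000 calls each on average; stage 2 borrows from the pool.
caps = seqc_caps(3, 1000, used=[400, 1500, 1100])
```

In this example the second stage uses 1500 calls, more than the per-stage limit of the plain sequential approach, because the first stage stopped early.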

Optimization Method.
To have a chance of finding a proper solution it seems important to use algorithms which allow a search of the entire space. There are numerous possible choices of methods, most of them using heuristics motivated by nature. We decided to employ particle swarm optimization (PSO) in the tests of the developed optimization methodologies. The search space in all performed tests is a multidimensional cube, which simplifies the search for a starting point (we simply pick the values of the coordinates at random from specified intervals). The PSO method is based on mechanisms observed in nature [9–11], namely on the behaviour of individuals in a population. Particles (identified with solutions of the considered problem) traverse the decision space (the area inhabited by the population) following the particle representing the best behaviour so far, while remembering the best position in which they themselves have been so far. Each particle is described by two vectors: the position vector and the velocity vector. In each iteration of the algorithm a new velocity vector is determined, and the particle position changes according to it.
The swarm initialisation consists of giving the particles a random position and velocity. The position should be sampled from the permissible area, and the size of this area should be considered when sampling the velocity. If the velocity is too low, the swarm will not be able to search the entire permissible area, while an excessively high velocity makes the particles "bump" against the limits. The velocity vector changes according to the relationship
$$v_j^{t+1} = w\, v_j^{t} + c_1 r_1 \,(p_j - x_j^{t}) + c_2 r_2 \,(p_g - x_j^{t}), \tag{13}$$
where $x_j^t$ and $v_j^t$ are the position and velocity vectors of the $j$th particle in the $t$th iteration, respectively; $p_g$ denotes the best position found so far by the whole swarm; the vector $p_j$ represents the best solution found so far by the $j$th particle; $w$ is the inertia coefficient; $c_1$ and $c_2$ are acceleration coefficients (also called training coefficients); $r_1$ and $r_2$ are numbers picked at random from the interval $[0, 1]$ with uniform distribution. The new position of the particle is defined by
$$x_j^{t+1} = x_j^{t} + v_j^{t+1}. \tag{14}$$
After all particles have been displaced to their new positions, they are assessed, the swarm leader is chosen, and the vectors $p_g$ and $p_j$ are updated. The values of the coefficients affect the swarm behaviour. The value of the inertia coefficient is usually selected from the interval $[0, 1]$. A higher value favours global searching of the solution space and a lower value favours local searching. Usually, its value is constant throughout the entire optimization process. However, it may also change: at the beginning it assumes a high value, enabling global searching, and it gradually decreases while the swarm approaches the optimum that is sought. The acceleration coefficients are usually equal and selected from the interval $[0, 2]$. When selecting their values, the maximum velocities which the particles should not exceed must be considered. Exceeding the maximum number of iterations or obtaining a satisfactory solution is taken as the criterion of computation completion (stop criterion).
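The PSO scheme described above can be sketched in pure Python as follows. This is a minimal illustration, not the implementation used in the tests: the function signature, the clamping of positions to the cube, the initial velocity scale, and the use of separate random numbers per coordinate are assumptions. The default swarm size and coefficients are the values reported for the virtual process tests.

```python
import random
from typing import Callable, List, Tuple

Vector = List[float]


def pso(objective: Callable[[Vector], float], dim: int, lower: float, upper: float,
        n_particles: int = 40, w: float = 0.8, c1: float = 1.0, c2: float = 1.0,
        max_calls: int = 1000, eps: float = 1e-3,
        seed: int = 0) -> Tuple[Vector, float, int]:
    """Minimize `objective` over the cube [lower, upper]^dim.

    Returns (best position, best value, number of objective calls).
    """
    rng = random.Random(seed)
    span = upper - lower
    pos = [[rng.uniform(lower, upper) for _ in range(dim)] for _ in range(n_particles)]
    # Assumed initial velocity scale: a small fraction of the cube size.
    vel = [[0.1 * rng.uniform(-span, span) for _ in range(dim)]
           for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]               # best position of each particle
    best_val = [objective(p) for p in pos]
    calls = n_particles
    g = min(range(n_particles), key=best_val.__getitem__)
    g_pos, g_val = best_pos[g][:], best_val[g]   # best position of the whole swarm

    while calls + n_particles <= max_calls and g_val > eps:
        for j in range(n_particles):
            for k in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[j][k] = (w * vel[j][k]
                             + c1 * r1 * (best_pos[j][k] - pos[j][k])
                             + c2 * r2 * (g_pos[k] - pos[j][k]))
                # Assumption: positions leaving the cube are clamped to its boundary.
                pos[j][k] = min(upper, max(lower, pos[j][k] + vel[j][k]))
            val = objective(pos[j])
            calls += 1
            if val < best_val[j]:
                best_val[j], best_pos[j] = val, pos[j][:]
        # The swarm leader is updated after all particles have moved.
        g = min(range(n_particles), key=best_val.__getitem__)
        if best_val[g] < g_val:
            g_pos, g_val = best_pos[g][:], best_val[g]
    return g_pos, g_val, calls


def sphere(p: Vector) -> float:
    """Simple unimodal test objective."""
    return sum(v * v for v in p)


solution, value, used_calls = pso(sphere, dim=2, lower=-1.0, upper=1.0, max_calls=2000)
```

The loop stops either when the best value reaches the accuracy `eps` or when another full iteration would exceed the call budget, which corresponds to the two stop criteria mentioned above.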

Optimization of Virtual Multistage Process
To compare the methodologies described in Section 4 we start with simple virtual multistage production chains. The considered chains have a linear structure consisting of $n$ stages, where in our simulations $n$ ranges between 2 and 10. In the first considered chain the quality function (the same for all stages) is the square function (15), while every stage of the second considered chain is characterized by a quality function defined by the Rastrigin function (16). In both formulas all vectors are $d$-dimensional, that is, $x_i, p_i \in [-1, 1]^d$, where in our simulations the number of dimensions $d$ ranges between 1 and 10.
In our virtual process, we use a simple relation for the semiproducts delivered to the next stage by putting $x_{i+1} = p_i$. In other words, the position of the global minimum at $x^* = 0$ is moved to $p_i$ in stage $i + 1$ (see Figure 3), and its position at the initial stage depends on the initial input signal vector $x_1$, which we set for simplicity to $x_1 = [0, 0, \ldots, 0]$.
We do not define the final quality function; therefore, the problem was formulated in the local way only. In the simultaneous approach SIM, the objective function was defined so that optimization ends successfully only when the values of all quality functions are better than the assumed accuracy $\varepsilon$.
The benchmark functions (15) and (16) were specially selected to test the proposed strategies in the optimization of unimodal and multimodal objective functions. Note that the number of local minima of function (16) grows exponentially with $d$, so there are exactly $7^d$ local minima for each function $q_i$.
The particle swarm optimization technique (described in Section 4.3) was applied. The swarm consisted of 40 particles. The inertia coefficient was set to 0.8 and the acceleration coefficients were equal to 1. The number of allowed evaluations of the quality function in each stage was limited to $N_{\max} = 1000$ in the case of the square function and to $N_{\max} = 2000$ in the case of the Rastrigin function, while the required accuracy was set to $\varepsilon = 1/1000$ (for both examined processes). The optimization was considered successful if the values of all quality functions were less than or equal to the assumed accuracy $\varepsilon$. The computation was made for different numbers of stages $n = 2, 5, 7, 10$ and for different dimensions of the test function at each stage $d = 1, 5, 7, 10$ (the same value of $d$ for all stages in the chain). For each combination of the number of stages and the dimension of the test function, the optimization procedure was performed 100 times. Figures 4-7 present statistics for the results of our tests.
The analysis of the obtained results confirms some expectations, but other findings are quite surprising. The optimal solution was much easier to find in the case of the square function, using fewer objective function calls than in the case of the Rastrigin test function. When we increased the number of stages and/or the dimension of the quality function at each stage, the probability of finding the optimal solution decreased. Comparison of the SEQ and SIM approaches shows that for a low number of stages and a low dimension of the quality function the SEQ approach is better than SIM, but when the number of stages and/or the number of dimensions is high, the SIM approach turns out to be better. The proposed modifications of the SEQ and SIM approaches (SEQC and SIMF) improve their performance, although the difference between the performance of SIM and SIMF is not visible.

Optimization of Industrial Lead Refining Process
The proposed methodology was used in the optimization of the industrial lead refining process presented in Figure 8. The considered process consisted of the following five stages: (1) melting and skimming, (2) decoppering through liquation, (3) decoppering by sulphur, (5) refining by oxygen.
Let us briefly present the technology behind the chosen process. Lead bullion of a specific mass and known chemical composition is fed to a refining boiler and heated to a temperature of about 450 °C. After the desired temperature is reached, the lead melts and a layer of skims forms on the surface, which is then removed. Next, as a result of stirring the molten metal, the temperature decreases to approximately 330 °C and copper separates in solid form, which lowers its content in the lead below 0.1 wt% (theoretically 0.06%). At the same time, the concentration of arsenic in the lead decreases slightly. The remaining copper is removed by sulphur. Sulphur is introduced in elemental form and/or in the form of PbS into the funnel formed by the rotation of the agitator. This allows better use of the sulphur, creating copper sulphide CuS, which rises to the surface of the molten metal. The copper content decreases below 20 ppm and the created skims are collected from the surface of the molten metal. After removal of the copper, the lead is pumped to the boiler where oxidising refining occurs. First, the metal is heated to a temperature of 600 °C. Then oxygen is fed to the bath through a lance with vigorous stirring. During this process, the oxygen creates a sequence of tin, arsenic, and antimony oxides, which rise to the surface. The oxides can form complex compounds as a result of cross-reactivity between them. It is important to remove arsenic to a level below 10 ppm. This lowers the tin content below 5 ppm and the antimony content to several hundred ppm; complete removal of antimony is not required. The created dross is removed from the surface of the lead and the metal is directed to the further stages of refining [12,13].
The goal of the optimization procedure was to find control parameters for all stages which ensure an arsenic concentration in the refined lead equal to 5 ppm. For optimization purposes the processed lead bullion was described by a vector $x_i$ of parameters. At each stage, time was chosen as the control parameter, so $p_i$ was a positive real number for each $i$. The changes of the chemical composition of the lead bullion during the refining process, as well as the temperature changes, were modelled using Response Surface Methodology [14] based on industrial data. The quality of the semiproduct after each stage was computed from a selected component of the vector $x_i$. Three different optimization strategies were used to obtain the required concentration of arsenic in the refined lead. In two cases the optimization problem was formulated in the local way, in the third in the global way. The SEQ and SEQC approaches were applied with the local problem formulation, and the SIM approach with the global problem formulation. As in the case of the virtual process, the PSO method was used as the optimization procedure. The number of allowed evaluations of the quality function in each stage was limited to $N_{\max} = 200$, while the required accuracy was set to $\varepsilon = 0.1$. The optimization was performed 100 times to determine the probability of finding the required solution. The comparison of the results obtained using the different optimization methodologies is presented in Figure 9.
The obtained results showed that when the optimization problem is formulated in the local way (SEQ and SEQC approaches), the probability of finding the required solution is much higher. Comparison of the SEQ and SEQC approaches confirms the previous observation that the SEQC approach gives a better chance of finding the required solution at the cost of only slightly more objective function calls than SEQ.
Figure 10 presents the probability densities of the optimization parameters of the lead refining process obtained with the considered optimization strategies SEQ, SEQC, and SIM. Since the probability of success is much higher for the SEQ and SEQC strategies, the probability of obtaining a value close to the optimum is also much higher than for the SIM strategy. This is also confirmed by the box plots of the parameter values obtained as a result of optimization (see Figure 11). As we can see, the values for SIM are more spread out than for the other two strategies.

Summary and Conclusions
The problem of optimization of multistage industrial processes was investigated in the paper. In such a production chain there are many objective functions which assess the quality of semiproducts. Typical multiobjective techniques may not be suitable for solving this kind of problem. In the paper a few strategies were developed and tested using two benchmark functions and the industrial process of lead refining.
The presented results show that the problem of optimization of a multistage process is highly nontrivial and that further investigation is required.
Based on the presented results it is not possible to point out the best approach to a multistage industrial production chain; therefore, different combinations of the presented strategies should be tested next. Most likely, some combination of two different methods should work well, especially when the chain is long and has complex interactions. Other strategies known from the literature (e.g., dynamic programming) can be involved, making a kind of hybrid approach. A possible testing ground for our methodology which fits well into the industrial environment is problems with finite duration (a limited number of executions of the chain before a full reset) and with backward loops (feedback to previous stages; e.g., semiproducts returning to the production cycle). In this case we may expand the whole process, obtaining a huge production tree representing stages at different times (a production stage with an assigned number of executed production cycles). The directed tree of stages induced in this way can be large, even if there are only a few stages in the genuine production chain. This also gives many other possibilities for solutions. For example, one approach is to aggregate all stages within the loop into a larger stage (optimized separately). This is an interesting, although challenging, problem to be investigated as a next step.

Figure 2: Production chain with acyclic graph structure.

Figure 4: The average number of objective function calls for each approach for the square test function (15). Only data from successful optimization runs was taken into account.

Figure 5: The average number of objective function calls for each approach for Rastrigin's test function (16). Only data from successful optimization runs was taken into account.

Figure 7: The probability of finding the global minimum for each approach for Rastrigin's test function (16).

Figure 10: Probability densities (all function calls). Scale for the SEQ and SEQC strategies in blue and for SIM in green.

Figure 11: Box plots of obtained values of parameters for different strategies in optimization of the lead refining process.
4.1. Formulation of Optimization Problem. The multistage optimization problem considered in this work can be formulated in two ways: global and local. The difference between these formulations lies in the aim of the optimization procedure. In the global formulation the objective of the optimization procedure is to optimize the final product only. The qualities of the semiproducts are not taken into account during the optimization procedure. The objective function has the following form:
$$\Phi = q_F = Q_F(x_{n+1}). \tag{11}$$