Energy Saving in Flow-Shop Scheduling Management: An Improved Multiobjective Model Based on Grey Wolf Optimization Algorithm

Energy saving is increasingly important. During production, energy can be saved by improving operational methods and machine infrastructure, but doing so also increases the complexity of flow-shop scheduling. The Grey Wolf Optimization Algorithm, as one of the data mining technologies, is widely applied to various mathematical problems in engineering; however, the algorithm is still immature and has some defects. Therefore, we propose an improved multiobjective model based on the Grey Wolf Optimization Algorithm that incorporates a Kalman filter and a reinforcement learning operator. The Kalman filter is introduced to bring the solution set closer to the Pareto-optimal front, while the reinforcement learning operator improves the convergence speed and solving ability of the algorithm. Tests on six benchmark functions show that the proposed algorithm outperforms the original algorithm and other comparison algorithms in terms of search accuracy and solution set diversity. The improved multiobjective model based on the Grey Wolf Optimization Algorithm proposed in this paper helps solve energy saving problems in flow-shop scheduling and is of great practical value in engineering and management.


Introduction
Many mathematical problems in scientific research and practical engineering essentially belong to the class of multiobjective optimization problems. The analysis of constrained multiobjective optimization algorithms has become a research hotspot in recent years.
Different theories exist in the literature regarding optimization algorithms, such as the Improved Multiobjective Grey Wolf Optimizer (IMOGWO), which hybridizes with a fast nondominated sorting strategy [1]. In fact, owing to the efficiency and simplicity of such methods, significant progress has been made in solving constrained multiobjective optimization problems at home and abroad [2], but there is still much room for improvement in the diversity and convergence of solution sets.
In previous research, some scholars proposed a differential evolution algorithm based on a two-population search mechanism, which randomly deletes one of the two individuals with the smallest Euclidean distance [3]. In this way, boundary solutions may be lost and the diversity of the solution set may suffer. At the same time, when updating the infeasible solution set, individuals with a small degree of constraint violation are preferred, but the objective function values of the individuals selected in this way may be poor, which slows the convergence of the algorithm.
Several lines of evidence suggest that penalty terms have been applied to modify the value of the individual objective function. During evolution, feasible nondominated solutions were retained, and infeasible solutions with a small degree of constraint violation were also retained [4]. This algorithm can likewise lose boundary solutions, which indicates defects in maintaining the diversity of the solution set. When updating the feasible and infeasible solution sets, individuals located in sparse regions are given priority. However, when updating the infeasible solution set, individuals with a large degree of constraint violation are retained, thus reducing the convergence speed of the algorithm [5].
As for the improved elite selection strategy, it can make the solution set more widely distributed by setting preference points and expand the application of constrained multiobjective optimization algorithm to high-dimensional problems by combining with Deb criterion [6], but it still has some drawbacks.
Up to now, plenty of differential evolution algorithms have been proposed that minimize the objective function value for feasible solutions and minimize the degree of constraint violation for infeasible solutions [7]. However, the information interaction between the feasible and infeasible solution sets is insufficient, and the population diversity needs to be improved. On the other hand, some scholars proposed a constrained multiobjective optimization algorithm based on an adaptive ε-truncation strategy, which can improve the diversity of solution sets while maintaining convergence [8]. But the ε parameter is difficult to select and needs to be adjusted for different problems [9]. In brief, it is difficult for most algorithms to balance the key performance indexes of constrained multiobjective optimization, namely diversity and convergence [10-12]. Therefore, we propose an improved Multiobjective Grey Wolf Optimizer based on Kalman filtering and reinforcement learning (KMGWO) in this paper. The main innovation of the algorithm is that a Kalman filter, which facilitates the convergence of the solution set to the Pareto-optimal front, is introduced into the static multiobjective algorithm. It combines the characteristics of the Kalman filter with the robustness, reliability, and high efficiency of the reinforcement learning system in problem solving [13]. During evolution, the algorithm uses an elite population to store feasible nondominated solutions and retains the nondominated solutions generated by historical iterations.
The scheduling problem is to allocate scarce resources to different tasks within a certain period of time [14,15]. As production scale continues to expand, the importance of scheduling and decision-making for enterprise management and production is increasingly prominent [16].
The scheduling problem is an interdisciplinary field of research involving operations research, computer science, control theory, industrial engineering, and many other disciplines [17]. A good scheduling scheme can greatly improve the production level of an enterprise, making rational use of resources and enhancing its market competitiveness [18-21].
From a different perspective, combining data mining technology and mathematical logic, we establish an improved multiobjective operational model based on the Grey Wolf Optimization Algorithm that takes energy-saving problems in engineering into account. The results show that the algorithm can successfully solve the Pareto front problem in flow-shop scheduling and is of great practical value in engineering and management.

Multiobjective Grey Wolf Optimizer
2.1. Multiobjective Problem. As briefly mentioned in the introduction, multiobjective optimization refers to the optimization of a problem with more than one objective function [22,23]. Without loss of generality, it can be formulated as a maximization problem as follows:

Maximize: F(x) = {f_1(x), f_2(x), ..., f_o(x)},

subject to:

g_i(x) ≥ 0, i = 1, 2, ..., m,
h_i(x) = 0, i = 1, 2, ..., p,
L_i ≤ x_i ≤ U_i, i = 1, 2, ..., n,

where n is the number of variables, o is the number of objective functions, m is the number of inequality constraints, p is the number of equality constraints, g_i is the ith inequality constraint, h_i indicates the ith equality constraint, and [L_i, U_i] are the boundaries of the ith variable [24-26].
In single-objective optimization, solutions can be compared easily owing to the unary objective function: for a maximization problem, solution X is better than Y if and only if f(X) > f(Y). However, solutions in a multiobjective space cannot be compared by relational operators, because the comparison is multicriterion. In this case, a solution is better than (dominates) another solution if and only if it shows a better or equal value on all of the objectives and a strictly better value in at least one of them [27]. The concepts for comparing two solutions in multiobjective problems were first proposed by Khamis et al. [28] and then extended by Khamis et al. [29]. Without loss of generality, the mathematical definition of Pareto dominance for a maximization problem is as follows [30].
Definition 1 (Pareto dominance). Solution x dominates solution y (denoted x ≻ y) if and only if f_i(x) ≥ f_i(y) for all i ∈ {1, 2, ..., o} and f_j(x) > f_j(y) for at least one j ∈ {1, 2, ..., o}.
The definition of Pareto optimality is as follows.

Definition 2 (Pareto optimality).
A solution x ∈ X is called Pareto-optimal if there exists no solution y ∈ X such that y dominates x. A set including all the nondominated solutions of a problem is called the Pareto-optimal set, and it is defined as follows.

Mathematical Problems in Engineering
Definition 3 (Pareto-optimal set). The set of all Pareto-optimal solutions is called the Pareto set:

P_s = {x ∈ X | x is Pareto-optimal}.

A set containing the corresponding objective values of the solutions in the Pareto-optimal set is called the Pareto-optimal front [31]. The definition of the Pareto-optimal front is as follows.

Definition 4 (Pareto-optimal front). The set containing the values of the objective functions for the Pareto set is

P_f = {F(x) | x ∈ P_s}.
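The dominance relation and Pareto set above can be expressed compactly in code. The following Python sketch (function names are illustrative, not from the paper) checks Pareto dominance for a maximization problem and extracts the indices of the nondominated objective vectors:

```python
def dominates(fx, fy):
    """Return True if objective vector fx Pareto-dominates fy (maximization):
    fx is no worse on every objective and strictly better on at least one."""
    at_least_one_better = False
    for a, b in zip(fx, fy):
        if a < b:          # worse on any objective -> no dominance
            return False
        if a > b:
            at_least_one_better = True
    return at_least_one_better

def pareto_set(points):
    """Indices of the nondominated objective vectors among `points`."""
    return [i for i, p in enumerate(points)
            if not any(dominates(q, p) for j, q in enumerate(points) if j != i)]
```

For example, among the objective vectors (1, 1), (2, 2), and (0, 3), the vector (2, 2) dominates (1, 1), while (2, 2) and (0, 3) are mutually nondominated and together form the Pareto set.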

2.2. MOGWO. The MOGWO algorithm was proposed by Mirjalili et al. [32]. The social leadership and hunting technique of grey wolves were the main inspiration of this algorithm. To mathematically model the social hierarchy of wolves when designing MOGWO, the fittest solution is considered the alpha (α) wolf; the second and third best solutions are named the beta (β) and delta (δ) wolves, respectively, and the remaining candidate solutions are assumed to be omega (ω) wolves. In the GWO algorithm, the hunting (optimization) is guided by α, β, and δ, and the ω wolves follow these three wolves in the search for the global optimum [33-35]. The encircling behavior is modeled as

D = |C · X_p(t) − X(t)|,
X(t + 1) = X_p(t) − A · D,

where t indicates the current iteration, A and C are coefficient vectors, X_p is the position vector of the prey, and X indicates the position vector of a grey wolf [36]. The vectors A and C are calculated as follows:

A = 2a · r_1 − a,
C = 2 · r_2,

where the elements of a decrease linearly from 2 to 0 over the course of the iterations and r_1, r_2 are random vectors in [0, 1]. The position-updating mechanism of search agents and the effect of A are indicated in Figure 1 [37]. In this figure, we can see that the three top wolves (namely, the fittest solutions) guide the directions of the other wolves (namely, the candidate solutions). The following formulas are run constantly for each search agent during optimization in order to simulate the hunting and find promising regions of the search space:

D_α = |C_1 · X_α − X|, D_β = |C_2 · X_β − X|, D_δ = |C_3 · X_δ − X|,
X_1 = X_α − A_1 · D_α, X_2 = X_β − A_2 · D_β, X_3 = X_δ − A_3 · D_δ,
X(t + 1) = (X_1 + X_2 + X_3)/3,

where C is a random value generated in [0, 2]. In MOGWO, the nondominated solutions are stored in a grid-based archive. When the archive is full, a hypercube is selected by roulette wheel according to the probability

P_i = c / N_i,

where c is a constant number greater than one and N_i is the number of obtained Pareto-optimal solutions in the ith segment [38].
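The leader-guided position update above can be sketched as a single population step (a minimal NumPy sketch following the standard GWO equations; the population layout and function name are our assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)

def gwo_step(wolves, alpha, beta, delta, a):
    """One GWO position update: every wolf moves to the mean of the three
    leader-guided positions X1, X2, X3 (a sketch; `wolves` is an (n, d) array)."""
    new = np.empty_like(wolves)
    for i, x in enumerate(wolves):
        guided = []
        for leader in (alpha, beta, delta):
            r1, r2 = rng.random(x.shape), rng.random(x.shape)
            A = 2 * a * r1 - a           # A = 2a*r1 - a
            C = 2 * r2                   # C = 2*r2
            D = np.abs(C * leader - x)   # D = |C*X_leader - X|
            guided.append(leader - A * D)
        new[i] = np.mean(guided, axis=0)  # X(t+1) = (X1 + X2 + X3)/3
    return new
```

Note that when a has decayed to 0 (the end of the run), A = 0 and every wolf collapses onto the mean of the three leaders, which is the exploitation phase of the algorithm.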

2.3. Defect in MOGWO. The traditional Multiobjective Grey Wolf Optimizer is a multiobjective optimization algorithm inspired by the group predation of grey wolves, and it uses a fixed external archive to store nondominated solutions. When solving static multiobjective problems, the plain MOGWO lacks a good promotion strategy, so the solution set does not approach the Pareto-optimal front closely and its diversity is not high [39]. To solve these problems, an improved algorithm, KMGWO, is proposed in the next section.

3.1. Kalman Filter. In 1960, R. E. Kalman published a paper describing a method that can process a time series of measurements and predict unknown variables more precisely than an estimate based on a single measurement alone. This is referred to as the Kalman filter. The Kalman filter maintains a state vector, which describes the system state, along with its uncertainties. The equations of the Kalman filter fall into two groups, time update and measurement update equations, which are performed recursively so that the Kalman filter can make predictions. Here, the Kalman filter is used to predict future generations directly in the decision space, and the two major steps are described below [40].

Measurement Update. The measurement update equations are responsible for incorporating a new measurement into the a priori estimate to obtain an improved a posteriori estimate [41]. The individual solutions just before the change occurs are taken as the actual measurements of the previous predictions. This information is used to update the Kalman filter prediction model [42].

Time Update. The time update equations are responsible for projecting the current state and error covariance estimates forward to obtain the a priori estimates for the next step. New solutions are predicted based on the corrected Kalman filter associated with each individual in the decision space [43]. These are a priori estimates of the future.
Pareto-optimal solutions will then be used to update the reference points and subproblems. The specific equations for the two steps are presented in the following [44]:
Time update step:

x^-_t = A x_{t−1} + B u_{t−1},
P^-_t = A P_{t−1} A^T + Q.

Measurement update step:

K_t = P^-_t H^T (H P^-_t H^T + R)^{−1},
x_t = x^-_t + K_t (z_t − H x^-_t),
P_t = (I − K_t H) P^-_t,

where x is the state vector to be estimated by the Kalman filter, A denotes the state transition model, u is the optional control input to the state x, B is the control input matrix, and P is the error covariance estimate [45]. z denotes the measurement of the state vector, H is the observation matrix, and the process and measurement noise covariance matrices are Q and R, respectively. K is the Kalman filter gain.
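The two steps can be written directly from the equations above. The following Python sketch performs one predict/correct cycle of a linear Kalman filter (variable names follow the equations; the function itself is illustrative, not the paper's implementation):

```python
import numpy as np

def kalman_step(x, P, z, A, B, u, H, Q, R):
    """One predict/correct cycle of a linear Kalman filter (sketch)."""
    # Time update (predict)
    x_prior = A @ x + B @ u               # x^-_t = A x_{t-1} + B u_{t-1}
    P_prior = A @ P @ A.T + Q             # P^-_t = A P_{t-1} A^T + Q
    # Measurement update (correct)
    S = H @ P_prior @ H.T + R             # innovation covariance
    K = P_prior @ H.T @ np.linalg.inv(S)  # Kalman gain
    x_post = x_prior + K @ (z - H @ x_prior)
    P_post = (np.eye(len(x)) - K @ H) @ P_prior
    return x_post, P_post
```

In the scalar case with a prior variance of 1 and measurement noise of 1, the filter splits the difference between prediction and measurement, and the posterior variance is halved, which matches the intuition from the normal-distribution example below.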
Here is an example so that the Kalman filter can be understood more intuitively.
As shown in Figure 2, the state vector of an object at period t obeys the magenta normal distribution, according to which the position of the object at period t + 1 can be predicted; the predicted value follows the blue normal distribution. This blue normal distribution is wider, because a layer of noise is added at each recursion, so the uncertainty grows. In order to avoid the deviation caused by pure estimation, we make a measurement of the position of the object at period t + 1, and the measurement results are subject to the red normal distribution.
Through the five equations of the Kalman filter, the real state vector of the object at time t + 1 can be obtained. This state vector follows the green normal distribution, which means that the prediction of the Kalman filter can be carried out iteratively.
The Kalman filter has been widely applied in dynamic multiobjective algorithms to ensure that the algorithm can converge to the Pareto-optimal front in time when the problem changes. At present, the Kalman filter is one of the most effective methods for making the population converge to the Pareto-optimal set and the solution set converge to the Pareto-optimal front. Therefore, this paper reversely applies it to the static multiobjective algorithm to help the static multiobjective algorithm converge to the Pareto-optimal front faster.
In order to overcome the defects of MOGWO described above, this paper proposes a multiobjective Grey Wolf Algorithm based on a Kalman filter transformation, hereinafter referred to as KMGWO.
After each iteration, the newly generated grey wolf population and the previous-generation population are combined by the Kalman filter, with update probability Pu, to generate a new grey wolf population, which promotes the convergence of the solution set to the Pareto-optimal front.
This strategy is called the Kalman filter operator: with probability Pu, an individual of the new generation is replaced by the Kalman filter prediction built from the current and previous generations.
3.2. Reinforcement Learning. In the field of big data and machine learning, data mining and learning technology can be divided into supervised learning, unsupervised learning, and reinforcement learning. Reinforcement learning grew out of animal learning and the adaptive control theory of parameter perturbation, and it refers to the mapping from environmental states to actions. It is a machine learning method that can adapt to and interact with the environment. Unlike supervised learning, which advises the agent what action to take through positive and negative examples, reinforcement learning finds the optimal behavioral strategy by trial and error.
As shown in Figure 3, the basic principle of reinforcement learning is a trial-and-evaluation process: the agent chooses an action to apply to the environment; the environment accepts the action, its state changes, and it produces a reinforcement signal at the same time. The agent then selects the next action according to the reinforcement signal and the current state of the environment, with the selection principle of increasing the probability of positive reinforcement. The selected action affects not only the immediate reinforcement value but also the state of the environment at the next moment and the final reinforcement value.
In the MOGWO improved with the Kalman filter operator, we found that Pu (the update probability) cannot be fixed. The optimal update probability differs across problems, and for the same problem it also differs at different periods of the iteration, so we use a reinforcement learning method here to determine the update probability dynamically. In this paper, this improvement is collectively called the reinforcement learning operator, as shown in Figure 4. The reinforcement learning method used here is based on a snap-drift neural network, which switches between snap mode and drift mode. In this operator, the agent (MOGWO) accepts the state (snap or drift) and the reward value (Pm) at time t and then takes an action (increasing or decreasing Pu) to convert to a new state. Se is the number of nondominated solutions generated in this iteration; Pm is the conversion probability, determined by the proportion of updated individuals in the population in this iteration: snap mode is used when the proportion is less than 50%, and drift mode when it is greater than or equal to 50%. ω is the step size by which Pu changes each time.
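The snap-drift adjustment of Pu can be sketched as follows. The paper does not state the direction of each adjustment or the bounds on Pu, so those details are illustrative assumptions here:

```python
def adapt_pu(pu, update_ratio, omega=0.05, lo=0.05, hi=0.95):
    """Snap-drift style adaptation of the update probability Pu (sketch).
    `update_ratio` is the proportion of updated individuals this iteration;
    drift mode (ratio >= 0.5) nudges Pu up by omega, snap mode nudges it down.
    The direction of each nudge and the [lo, hi] clamp are assumptions."""
    if update_ratio >= 0.5:        # drift mode
        pu = min(hi, pu + omega)
    else:                          # snap mode
        pu = max(lo, pu - omega)
    return pu
```

The clamp keeps Pu a valid probability and prevents the operator from switching the Kalman filter fully on or fully off.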

Simulation Experiments
To test the performance of KMGWO, simulation experiments comparing KMGWO with MOGWO, MOPSO, NSGA2, MOEA/D, and PESA2 are carried out, and the benchmark functions and correlation indexes are analyzed in this section.

4.1. Experimental Environment. The operating environment of the simulation experiment is as follows: the machine is a Dawning 5000A supercomputer with two Xeon X5620 CPUs (4 cores each), 24 GB of memory, and a 300 GB SAS hard disk, running the RHEL 5.6 operating system. The programming tool is MATLAB 2012a (for Linux).

4.2. Benchmark Function. In this paper, six benchmark functions are selected to evaluate the performance of the algorithm. This group of benchmark functions is widely used in testing multiobjective optimization algorithms. The function names, dimensions, ranges, and expressions are shown in Table 1. These six benchmark functions can be divided into two categories: Kursawe, Schaffer, ZDT1, and ZDT6 are two-objective test functions used to investigate the search ability of the algorithm in low dimensions; Viennet2 and Viennet3 are three-objective test functions, which add more Pareto points and increase the difficulty of the search, further probing the overall performance of the algorithm. These test problems are considered among the most challenging in the literature, providing multiobjective search spaces with different kinds of Pareto-optimal fronts: convex, nonconvex, discontinuous, and multimodal.
Algorithm 1 shows the KMGWO flow framework. Its archive update runs the grid mechanism to omit one of the current archive solutions when the archive is full and then adds the new solution. For leader selection, X_α = SelectLeader(archive); alpha is temporarily excluded from the archive to avoid selecting the same leader; X_β = SelectLeader(archive); beta is likewise excluded; then alpha and beta are added back to the archive.

4.3. Contrast Indicators and Algorithm Parameters. For the performance metric, we use the Inverted Generational Distance (IGD) to measure convergence and Spacing (SP) to quantify the uniformity of the coverage. The mathematical formulation of IGD is similar to that of the Generational Distance (GD). The modified measure is as follows:

IGD = √(Σ_{i=1}^{n} d_i²) / n,

where n is the number of true Pareto-optimal solutions and d_i indicates the Euclidean distance between the ith true Pareto-optimal solution and the closest obtained Pareto-optimal solution in the reference set. The Euclidean distance between obtained solutions and the reference set is computed differently here: in IGD, the distance is calculated for every true solution with respect to its nearest obtained Pareto-optimal solution in the objective space. The mathematical formulation of the SP measure is as follows:

SP = √( (1/(n−1)) Σ_{i=1}^{n} (d̄ − d_i)² ),

where d_i is the distance from the ith obtained solution to its nearest neighbor in the solution set and d̄ is the mean of all d_i.
In the simulation experiment, the population size of each algorithm is 200, the archive size is 200, and the number of iterations is 500. Each algorithm was run independently 30 times, and the minimum, maximum, average, and variance were taken as the results. The remaining parameters are shown in Table 2.
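The two metrics can be sketched as follows (a common formulation of IGD and of Schott's Spacing; the paper's exact normalization may differ):

```python
import numpy as np

def igd(true_front, obtained):
    """Inverted Generational Distance: for each true Pareto point, the distance
    to the nearest obtained point, aggregated GD-style. Smaller is better."""
    true_front = np.asarray(true_front, float)
    obtained = np.asarray(obtained, float)
    d = np.array([np.min(np.linalg.norm(obtained - p, axis=1)) for p in true_front])
    return np.sqrt(np.sum(d ** 2)) / len(true_front)

def spacing(obtained):
    """Schott's Spacing: spread of nearest-neighbor distances in the obtained
    set; 0 means a perfectly uniform solution set."""
    obtained = np.asarray(obtained, float)
    n = len(obtained)
    d = np.empty(n)
    for i in range(n):
        diffs = np.delete(obtained, i, axis=0) - obtained[i]
        d[i] = np.min(np.sum(np.abs(diffs), axis=1))  # L1 nearest-neighbor distance
    return np.sqrt(np.sum((d.mean() - d) ** 2) / (n - 1))
```

An obtained set that coincides with the true front has IGD = 0, and an evenly spaced set has SP = 0, which is why smaller values are better for both indicators.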

4.4. Comparative Analysis of Experimental Results. The Pareto diagrams in Figures 5-10 show that the Pareto nondominated solutions generated by KMGWO are basically consistent with the real Pareto front and that the solution set distribution is relatively uniform. The simulation results are then analyzed by data comparison. In the Kursawe function test, the indicators of KMGWO and MOEA/D are the best, around 0.12, slightly larger than PESA2; the worst is the NSGA2 algorithm. This shows that, in addition to KMGWO, MOEA/D is also suitable for solving low-dimensional multiobjective problems. The test results of the Schaffer function show that KMGWO has the best effect; MOGWO and MOEA/D are slightly worse than KMGWO, and the standard deviation of MOGWO is smaller. On the two three-objective benchmark functions, Viennet2 and Viennet3, PESA2 and KMGWO are the two best-performing algorithms, respectively; in Viennet2, KMGWO is only slightly inferior to PESA2. Taken together, KMGWO is very suitable for solving three-objective optimization problems, which may be associated with its stronger search ability; testing the algorithm on problems with more objectives is a promising next research direction. On these two test functions, KMGWO's standard deviation is also very good, which shows that the stability of the algorithm on this kind of test function is excellent. In this paper, two functions of the ZDT suite proposed by Deb are selected: ZDT1 and ZDT6. The test results of the ZDT1 function show that the best algorithm is MOGWO, MOPSO is slightly inferior, and KMGWO is excellent and ranks third. The test results of the ZDT6 function

show that KMGWO is the best algorithm, with excellent stability. In general, KMGWO's ability to approach the real Pareto front is very strong, especially when the number of objectives is high, so the algorithm should find wide use in production.
Concerning the IGD metric, the merit is clear: KMGWO significantly dominates the comparison algorithms on almost all the problems. As shown in Table 3, KMGWO is the best-performing method in our comparison experiment.
This strongly demonstrates that the reinforcement learning operator can effectively improve the overall performance of the algorithm. The reason for this superiority of KMGWO lies in the reinforcement learning operator: the optimal update probability differs at different periods of the iteration, the population is divided into many subpopulations, and each solution has its own neighbors. Table 4 gives the statistics of SP. The SP value represents the degree of uniformity between Pareto solutions: the smaller this value is, the more homogeneous the Pareto solutions obtained by the algorithm are, and the smaller the distance differences between them. Kursawe's function test results show that the SP value of NSGA2 is the smallest of the six algorithms, 1.4, while MOGWO, MOEA/D, and PESA2 perform poorly, suggesting that NSGA2 generates the most homogeneous solution set on simple two-objective test functions. It is worth noting that the effect of KMGWO is better than that of MOGWO; the Kalman filtering and reinforcement learning operators contribute to this improvement. In this test function, the performance of KMGWO is the best.

4.5. Energy Consumption considering Flow-Shop Scheduling. The low-carbon scheduling problem in the flow shop studied in this section can be described as follows: n jobs need to go through m stages in the same flow direction, and each job has only one operation in each stage. The preparation time of the machine is considered, and it is related to the ordering of the two adjacent jobs; therefore, the preparation time is sequence-dependent, and the machine startup is linked to the processing time of the jobs.
At different stages, the machine has different speed gears for production, which can be adjusted. From the point of view of energy consumption, the machine has four different states: processing (the machine is machining a job), starting (the machine is preparing for a new job), standby (the machine is idle), and off (the machine is turned off). Under normal circumstances, when the machine works at a higher speed, the processing time is shortened, but the corresponding energy consumption increases. Therefore, this problem aims to minimize both the completion time and the energy consumption index. Owing to these characteristics, this problem is much more complicated than the traditional flow-shop scheduling problem. The other settings of the problem are as follows.
Each job is processed continuously in the workshop; in other words, a process cannot be interrupted. Machines are allowed idle time and have unlimited buffers between stages. The machine boots when its first job starts processing, and it is shut down when all the jobs are finished. The machine speed cannot be adjusted in the course of processing a job.
In order to present the mathematical model of the problem, we first define the following mathematical symbols according to the above description. x_{i,j,k,v}: decision variable, equal to 1 when job i is processed at stage j on machine k at speed v, and 0 otherwise. z_{i,i′,j}: decision variable, equal to 1 when job i is processed immediately before job i′ at stage j, and 0 otherwise. T_j^on: auxiliary decision variable, indicating the startup time of the stage-j machine, determined by b_{i,j} and x_{i,j,k,v}. T_j^off: auxiliary decision variable, indicating the halt time of the stage-j machine, determined by b_{i,j} and x_{i,j,k,v}.
Based on these mathematical symbols, the mixed-integer programming model of the flow-shop low-carbon scheduling problem is presented as follows. Formula (29) stands for the minimization of the completion time, and formula (30) represents the minimization of the energy consumption. Formula (31) represents the total energy consumption when the machine is in the processing state, formula (32) the total energy consumption in the starting state, and formula (33) the total energy consumption in the standby state. Formula (34) means that each job traverses all stages and that, at a specific stage, each job is assigned to one machine and processed at one speed level. Formula (35) means that interruption is not allowed during processing. Formula (36) ensures that a job's operation can only start after its operation in the previous stage is completely processed. Formulas (37) and (38) guarantee the machine capacity limit, which means a job can be processed only after the previous job is completed. Formula (39) means that the machine starts processing immediately after setup is complete. Formula (40) represents the calculation of an auxiliary variable, which is equal to the minimum value between the start time and the setup time of the jobs assigned to the corresponding machine. Formula (41) represents the calculation of an auxiliary variable, which is equal to the maximum end time of the jobs assigned to the corresponding machine. Formulas (42)-(44), respectively, define the feasible ranges of the decision variables.
This section gives a simple example with three jobs and three stages, each job having three different processing speeds. Table 5 shows the processing time and corresponding power consumption of the machine at each stage; that is, an element in the table represents (p_{i,j}, pp_{i,j,v}). Table 6 shows the sequence-dependent start-up times; an element in the table represents s_{i′,i,j}. Set sp_j = 3 and ip_j = 1.
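A simplified evaluation of a candidate schedule under this model can be sketched in code. The Python sketch below covers only the two objectives (completion time and processing energy) for a permutation flow shop with one machine per stage, omitting setup, startup, and standby energy; all names are illustrative assumptions, not the paper's implementation:

```python
def evaluate(seq, p, pp, v):
    """Makespan and processing energy of job sequence `seq` (a sketch).
    p[i][j]  : base processing time of job i at stage j
    pp[i][j] : power drawn by job i at stage j at its chosen speed
    v[i][j]  : speed factor; actual time = p/v, energy = power * actual time"""
    n, m = len(p), len(p[0])
    finish = [[0.0] * m for _ in range(n)]  # finish[i][j]: completion time
    energy = 0.0
    for pos, i in enumerate(seq):
        for j in range(m):
            t = p[i][j] / v[i][j]
            # machine j is free after the previous job in the sequence
            ready_machine = finish[seq[pos - 1]][j] if pos > 0 else 0.0
            # job i is ready after its own previous stage
            ready_job = finish[i][j - 1] if j > 0 else 0.0
            finish[i][j] = max(ready_machine, ready_job) + t
            energy += pp[i][j] * t
    return finish[seq[-1]][m - 1], energy
```

A multiobjective optimizer such as KMGWO would call an evaluation of this kind for every candidate sequence and speed assignment, and the speed trade-off appears directly: raising v shortens t but, with higher power at higher speed, raises the energy term.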

4.6. Experimental Result. To solve the above flow-shop scheduling problem, this paper sets the parameters of KMGWO as follows: population size 20, archive size 20, and 100 iterations. Figure 11 shows the Pareto front obtained by the KMGWO algorithm. It can be clearly observed that this problem is not complicated for KMGWO, and the Pareto-optimal solution set is easily obtained. Because there was not enough prior knowledge, we chose the plan calculated at the Pareto point with the longest completion time, 113, and energy consumption of 1059. Table 7 shows the plan calculated at that Pareto point, including the start time, warm-up time, working time, and end time of each job in each process. From these data we drew a Gantt chart, shown in Figure 12; it is the Gantt chart of the flow-shop scheduling plan that balances time and energy.
Table 5: Processing time and corresponding power.

Table 6: Sequence-dependent start-up time.

Conclusions
In this paper, an improved multiobjective operational model based on the Grey Wolf Optimization Algorithm with Kalman filtering and reinforcement learning (KMGWO) is proposed, combining data mining technology and mathematical logic. With the Kalman filter, the algorithm pushes the solution set toward the real Pareto front. The reinforcement learning operator is applied to enhance the utilization of the dominant individuals of the group, and adaptive parameters are used instead of human intervention. The results on six benchmark functions show that the algorithm performs better than the comparison algorithms in approximating the real Pareto-optimal solution set and keeping the solution set uniform. Considering the energy saving of the flow-shop scheduling solution, KMGWO performs excellently and is accordingly suitable for solving practical optimization problems. This operational model has strong advantages in the field of mathematical optimization and can be applied to machine learning, engineering optimization design, and other important areas, thus enhancing the performance of energy saving in production management.

Data Availability
The data used to support the findings of this study are included within the article.