Clustering and Genetic Algorithm Based Hybrid Flowshop Scheduling with Multiple Operations

This research is motivated by a flowshop scheduling problem of our collaborative manufacturing company for aeronautic products. The heat-treatment stage (HTS) and precision forging stage (PFS) of the case are selected as a two-stage hybrid flowshop system. In HTS, there are four parallel machines and each machine can process a batch of jobs simultaneously. In PFS, there are two machines. Each machine can install any of four modules for processing workpieces of different sizes. The problem is characterized by many constraints, such as batching operation, blocking environment, and the setup time and working time limitations of the modules. To deal with these special characteristics, a clustering and genetic algorithm is used to find a good solution for the two-stage hybrid flowshop problem. The clustering is used to group the jobs according to the processing ranges of the different modules of PFS. The genetic algorithm is used to search for the optimal sequence of the grouped jobs for HTS and PFS. Finally, a case study is used to demonstrate the efficiency and effectiveness of the designed genetic algorithm.


Introduction
This research is motivated by a flowshop scheduling problem of our collaborative manufacturing company for aeronautic products. The production line of this company is composed of multiple stages, each having parallel machines. According to our investigation, the precision forging stage is the bottleneck because it accumulates a large WIP (work in progress) stock at different times. Therefore, we focus mainly on the scheduling method of this stage and its preceding stage (heat treatment).
The heat-treatment stage (HTS) combined with the precision forging stage (PFS) forms a two-stage hybrid flowshop system. Figure 1 describes the details of these two stages. As seen in Figure 1, the HTS has four parallel machines and the PFS has two parallel machines. Each HTS machine can process five workpieces simultaneously, while each PFS machine can process one workpiece at a time. In PFS, four modules are used to process workpieces of different sizes, and the setup time and working time of each module affect the efficiency of PFS. As a result, in the PFS of the collaborative company, the working time of most modules often deviates from the nominal working time. In addition, many WIP items are blocked before PFS because workpieces of different sizes can be processed simultaneously in each machine of HTS.
In contrast to general hybrid flowshop scheduling, some characteristics that emerge in this industrial case add to the complexity of the problem. They are summarized as follows.
(1) In HTS, there are four parallel machines. Each machine can process a batch of jobs simultaneously. The method of loading jobs into these machines is unusual. It follows the principle that a new job can be loaded when one job is released. For each machine, processing can start only after all the jobs of the batch have been loaded. This means that the start time of the batch is the loading time of the last loaded job of the machine. In this case, the capacity of each machine is five.
(2) In the PFS of this case, there are two machines. Each machine can install any of the four modules for processing workpieces of different sizes, and the setup time and working time of each module should be considered. Each module has a processing range and can be reused. Workpieces of different sizes can be handled as long as they fall within the processing range of the module. However, a module should not be used continuously beyond its working time at any one time. Therefore, it is preferable to use each module for close to its working time at each use and to avoid setting up modules on the two machines simultaneously.
(3) In a general flowshop problem, buffers can be used to store jobs waiting for the next process. However, in this case, the jobs released from HTS should be transported to PFS immediately because the workpieces would otherwise be affected by a large temperature difference. If PFS is not ready for a new job, the job must be blocked inside the HTS machine although it is finished.
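The loading rule in (1) and the blocking rule in (3) can be illustrated with a minimal sketch. The function names and all time values below are hypothetical; only the capacity of five and the two rules themselves are taken from the description above.

```python
HTS_CAPACITY = 5  # capacity of each HTS machine, as stated in the text


def batch_start_time(loading_times):
    """Start time of a batch: the loading time of the last loaded job."""
    if not 0 < len(loading_times) <= HTS_CAPACITY:
        raise ValueError("a batch must hold between 1 and 5 jobs")
    return max(loading_times)


def blocking_time(finish_time, pfs_ready_time):
    """Time a finished job stays blocked inside the HTS machine
    until PFS is ready to receive it (zero if PFS is already free)."""
    return max(0, pfs_ready_time - finish_time)


# Hypothetical loading times of the five jobs of one batch.
loads = [0, 3, 7, 12, 15]
start = batch_start_time(loads)
finish = start + 40  # hypothetical processing time of 40 time units
print(start, blocking_time(finish, pfs_ready_time=60))
```

With these made-up values, the batch cannot start before the fifth job is loaded, and a job finishing before PFS becomes free accumulates blocking time inside the HTS machine.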
To deal with the above special characteristics, this paper designs a clustering and genetic algorithm to find a good solution for this two-stage hybrid flowshop problem. The clustering is used to group the jobs according to the processing ranges of the different modules of PFS. The genetic algorithm is used to search for the optimal sequence of the grouped jobs for HTS and PFS. The research is intended to improve the productivity of manufacturing companies that face this kind of two-stage hybrid flowshop problem.
The rest of this paper is organized as follows. Section 2 reviews the relevant literature. The mathematical model of the two-stage hybrid flowshop problem is established in Section 3. Section 4 presents the clustering algorithm for the jobs and the GA-based scheduling method. A case study that demonstrates the efficiency and effectiveness of the designed genetic algorithm is presented in Section 5. Finally, conclusions and recommendations for future work are summarized in Section 6.

Literature Review
Early research focused mainly on mathematical methodologies and solution algorithms. Rajendran and Chaudhuri [1] propose a multistage parallel-processor flowshop problem with minimum flowtime. Based on a hybrid three-stage flowshop problem, Riane et al. [2] propose efficient heuristics to minimize makespan, as well as branch and bound crossed with GA to solve hybrid flowshops. Linn and Zhang [3] give a comprehensive survey of the literature on hybrid flow shop scheduling problems. With the rapid development of information technology, computer technology, and integrated manufacturing systems, multiprocessor task scheduling has come into focus. Oğuz et al. [4] propose the concept of hybrid flowshop scheduling problems with multiprocessor task systems; for multiprocessor task scheduling in multistage hybrid flowshops, Ying and Lin [5] present an ant colony system approach. Alaykıran et al. [6] study the ant colony system approach in depth and use an improved ant colony algorithm to solve hybrid flow shop scheduling problems. Luo et al. [7] investigate a two-stage hybrid flowshop scheduling problem in a metal-working company and develop a genetic algorithm to obtain a near-optimal solution.
Finding an optimal algorithm for hybrid flowshop scheduling has always been difficult. Taking into account the complexity and efficiency of the algorithm, many methods have been applied to hybrid flowshop scheduling problems. At an early phase, MacCarthy and Liu [8], addressing the gap in scheduling research, reviewed optimization and heuristic methods in production scheduling. Chen [9] analyzes classes of heuristics for scheduling a two-stage flow shop with parallel machines at one stage. Wang and Zheng [10] present an effective hybrid heuristic for flow shop scheduling. Laha and Chakraborty [11] analyze optimization methods based on constructive search. Drawing on a real-life setting, Rossi et al. [12] apply heuristics for scheduling a two-stage hybrid flowshop with parallel batching machines at a hospital sterilisation plant.
When there is no buffer storage between stages, two scheduling situations arise, named no-wait and blocking scheduling. In no-wait scheduling (NWS), jobs must be processed from start to finish without any interruption either on or between machines. In blocking scheduling (BS), a job completed at a machine must wait there until the next machine is ready. These problems are common in various industries such as smart factories, the high technology industry, and the chemical processing industry, and they have attracted substantial research attention. Cheng et al. [13] present a genetic algorithm for the multistage and parallel-machine scheduling problem with job splitting in a case study for the solar cell industry. Sriskandarajah [14] describes the performance of scheduling algorithms for the no-wait flowshop with parallel machines. Caraffa et al. [15] use a genetic algorithm to search for the minimum makespan in a blocking flowshop. Ruiz and Allahverdi [16] study the no-wait flowshop with separate setup times to minimize maximum lateness. Liu et al. [17] propose an effective hybrid particle swarm optimization for no-wait flowshop scheduling, which has been verified to be a highly efficient method. Combining blocking and no-wait scheduling, Mascis and Pacciarelli [18] discuss job-shop scheduling with blocking and no-wait constraints; their study indicates that blocking and no-wait scheduling impose conflicting requirements. Luo et al. [19] study hybrid flowshop scheduling with batch-discrete processors and machine maintenance in time windows.
Although significant progress has been made in the above research, it is difficult to apply the relevant models and methods directly to this case because of the characteristics described in Section 1. Considering the problems faced in this case, a clustering model and a genetic algorithm are combined to solve this problem.

Mathematical Model
For the two-stage (HTS and PFS) hybrid flowshop scheduling problem described in Section 1, a mathematical model is constructed as the basis for further study.
The notations used throughout this paper are listed in the Notations section.
Based on the notations, the mathematical scheduling model is formulated with objective function (1), whose two components are defined in (2). As seen in (1), the objective is the efficiency of the two-stage hybrid flowshop scheduling problem, which combines the efficiency of HTS and the efficiency of PFS through two weighting factors. The weighting factor of HTS is set to 0.1 and that of PFS to 0.9 because PFS is the bottleneck. The efficiency of HTS is built from two double sums over groups and jobs: the total processing time of the jobs in HTS and the total blocking time of the jobs in HTS. The efficiency of PFS is built from the double sum of the working time and setup time over the modules and loadings, which is the total processing time of the two machines, and from the total waiting time in PFS.
The constraints of this problem are given by functions (3) to (6). Functions (3) and (4) are the constraints of HTS: (3) is the processing time constraint and (4) is the blocking time constraint. Functions (5) and (6) are the constraints of PFS; they guarantee that each module has enough setup time and that no module works beyond its nominal working time at each time of use.
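As a rough numerical illustration of the weighted objective, the sketch below makes two assumptions that are not confirmed by the recoverable text: the overall efficiency is taken as the weighted sum of the two stage efficiencies, and each stage efficiency is taken as the ratio of productive time to productive plus idle (blocking or waiting) time. The function names and the input values are hypothetical; only the weights 0.1 and 0.9 come from the model description.

```python
W_HTS, W_PFS = 0.1, 0.9  # weighting factors stated in the text


def stage_efficiency(productive_time, idle_time):
    """Assumed form: share of productive time in the total occupied time."""
    return productive_time / (productive_time + idle_time)


def overall_efficiency(e_hts, e_pfs, w_hts=W_HTS, w_pfs=W_PFS):
    """Assumed form: weighted sum of the HTS and PFS efficiencies."""
    return w_hts * e_hts + w_pfs * e_pfs


# Hypothetical totals: processing vs. blocking time in HTS,
# working plus setup time vs. waiting time in PFS.
e_hts = stage_efficiency(900.0, 100.0)
e_pfs = stage_efficiency(950.0, 50.0)
print(round(overall_efficiency(e_hts, e_pfs), 4))
```

Because the PFS weight dominates, a change in PFS waiting time moves the objective nine times more than an equal relative change in HTS blocking time, which matches the stated intent of prioritizing the bottleneck stage.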

Clustering and GA-Based Scheduling Method
4.1. Clustering Model. The clustering model is used to classify all jobs into multiple groups of size c according to the similarity of the job parameters, where c is the number of jobs that are processed simultaneously in one machine. To make the grouped jobs more meaningful for the next step, the GA design, it is essential to analyze the importance of the job parameters.
Two important parameters should be considered. The first is the size of the workpieces. The second is the processing time of the job in HTS.
Let the size set {v_1, v_2, ...} contain the different size ranges, defined according to the processing ranges of the modules of PFS.
Let the processing-time set {v_1, v_2, ...} contain the different values of processing time of all the jobs.
Each value of the size set is a split node. The jobs are first classed into several groups, one for each value of the size set. Then, these grouped jobs are further grouped according to the values of the processing-time set.
To handle the different processing times of the jobs, a cluster model is designed as follows.
Step 1. Let a set of groups record each group obtained after the jobs are classed by all the values of the size set. Each group in this set contains a number of jobs (the size of the group) with different processing time parameters.
Step 2. Use a dynamic set to record the resulting subgroups of each group; the dynamic set is empty at the beginning.
Step 2.1. Reclass the jobs of each group using the processing-time set, by the same method as used for the size parameter.
Step 2.2. Let a temporary array record each new subgroup of the group after it is reclassed by the different values of processing time. Each new subgroup must then be regrouped according to its size. The rules can be seen in Table 1. Here, s is the size of the new subgroup and c is the constant number of jobs that need to be processed simultaneously in the same machine.
After the above steps, the dynamic set consists of many small groups, each containing at most c jobs that are similar in size and processing time. The dynamic set is then ready for GA-based scheduling.
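The two-level grouping of Steps 1 and 2.1 can be sketched as follows. This is only an illustration under stated assumptions: a job is represented as a hypothetical tuple (job id, size, processing time), each module's processing range is a hypothetical closed interval, and jobs are split by exact processing time value. None of these data structures or values come from the paper.

```python
from collections import defaultdict


def cluster_jobs(jobs, size_ranges):
    """Group jobs first by the module size range they fall into,
    then by their HTS processing time.

    jobs        -- list of (job_id, size, processing_time) tuples (assumed format)
    size_ranges -- list of (low, high) intervals, one per module (assumed format)
    Returns a dict mapping (module_index, processing_time) to a list of job ids.
    """
    subgroups = defaultdict(list)
    for job_id, size, processing_time in jobs:
        for module_index, (low, high) in enumerate(size_ranges, start=1):
            if low <= size <= high:
                subgroups[(module_index, processing_time)].append(job_id)
                break
        else:
            raise ValueError(f"job {job_id} fits no module processing range")
    return dict(subgroups)


# Hypothetical jobs and module processing ranges.
jobs = [("J1", 10, 30), ("J2", 12, 30), ("J3", 25, 30), ("J4", 11, 45)]
ranges = [(0, 20), (21, 40)]
print(cluster_jobs(jobs, ranges))
```

Each resulting subgroup would still have to be cut down to the machine capacity by the regrouping rules of Table 1 before it is passed to the GA.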

GA-Based Scheduling Method.
The flow of the designed GA-based scheduling method follows the standard flow of a genetic algorithm. It includes the following main parts.

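Regroup rules (Table 1), where s is the size of the new subgroup and c is the number of jobs processed simultaneously in one machine:
s ≤ c: The subgroup is dynamically added to the dynamic set.
s > c: Divide the size of the subgroup by c, which gives a quotient and a remainder. If the remainder is equal to zero, the subgroup is decomposed into as many groups as the quotient, each with c jobs. If the remainder is not equal to zero, the subgroup is decomposed into as many groups of c jobs as the quotient, plus one further group that consists of the remaining jobs. The groups created during this process are also dynamically added to the dynamic set.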
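The regroup rules can be sketched with integer division. The function name `regroup` and the list representation of a subgroup are assumptions made for illustration; the default c = 5 is the HTS machine capacity given in Section 1.

```python
def regroup(subgroup, c=5):
    """Split one subgroup into groups of at most c jobs.

    s <= c: the subgroup is kept as a single group.
    s > c:  s is divided by c; the quotient gives the number of full
            groups of c jobs, and a non-zero remainder forms one more group.
    """
    s = len(subgroup)
    if s <= c:
        return [list(subgroup)]
    quotient, remainder = divmod(s, c)
    groups = [list(subgroup[k * c:(k + 1) * c]) for k in range(quotient)]
    if remainder:
        groups.append(list(subgroup[quotient * c:]))
    return groups


# Hypothetical subgroup of 12 job indices: two full groups and one of two jobs.
print([len(g) for g in regroup(list(range(12)))])
```

Every group returned this way fits into one HTS machine, so the groups can be added to the dynamic set without further checks.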

Gene and Chromosome.
As described in Section 4.1, the jobs have been grouped according to the size and processing time parameters. Therefore, the resulting set of groups is the key information for designing genes. An integer-based method is used to form the genes and chromosomes. Each gene consists of a pair of integers written as (group-module), where the group index lies between 1 and m and the module index lies between 1 and 4; here m is the length of the set of groups, and the module index identifies the corresponding module of PFS. Each gene thus contains a group of jobs formed in the clustering stage together with the module used for it in PFS.
Based on genes, the chromosome is defined as a queue of different genes separated by the character ":". The length of the chromosome is the number of genes it contains, which is determined by the length of the set of groups. For example, chromosome (1-2:3-3:2-1:5-4:6-1) consists of five genes and its length is five.

Decoding Scheme.
The decoding scheme is designed according to the gene and chromosome, and its purpose is to interpret the meaning of the chromosome. The character ":" in a chromosome connects the genes, and each integer pair (group-module) represents a group of jobs defined in the set of groups together with its module. The order in which the group indices appear is the processing order of the groups, and each group uses the module given in its gene in the next stage. For example, the chromosome (1-2:3-3:2-1:5-4:6-1) means that five groups of jobs (groups 1, 3, 2, 5, and 6, in this order) will be processed by modules 2, 3, 1, 4, and 1, respectively.
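The encoding and decoding described above can be sketched in a few lines. The function names `decode` and `encode` and the tuple representation (group index, module index) are assumptions for illustration; the string format and the example chromosome are taken from the text.

```python
def decode(chromosome):
    """Turn a chromosome string such as "1-2:3-3" into an ordered list
    of (group_index, module_index) pairs."""
    genes = []
    for gene in chromosome.split(":"):
        group, module = gene.split("-")
        genes.append((int(group), int(module)))
    return genes


def encode(genes):
    """Inverse of decode: join (group, module) pairs with "-" and ":"."""
    return ":".join(f"{group}-{module}" for group, module in genes)


genes = decode("1-2:3-3:2-1:5-4:6-1")
print([g for g, _ in genes])  # processing order of the groups
print([m for _, m in genes])  # module used by each group in PFS
```

For the example chromosome, the first list gives the group order 1, 3, 2, 5, 6 and the second gives the modules 2, 3, 1, 4, 1, matching the interpretation in the text.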

Fitness Function.
The fitness function is used to calculate the value of each individual in order to assess the performance of the chromosomes of the entire population. In this case, the objective function (1) is used as the fitness function to evaluate each chromosome.

Design of Selection, Crossover, and Mutation Operators.
The selection operator is used to select the fittest parents, according to their fitness, for generating new offspring.
The crossover operator is the main genetic operator, and the performance of the algorithm depends largely on it. In this case, a multipoint crossover operator is designed to permute the job order while keeping the generated chromosome legal.
The proposed crossover operator takes two parents and creates two offspring. It propagates the structure and a subschedule from one parent into the offspring and then completes the offspring with the remaining jobs taken from the other parent. The detailed steps can be seen in Algorithm 1.
The mutation operator works on a single chromosome and generates offspring by altering one or more genes.
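The crossover steps of Algorithm 1 can be sketched as below, as we read them: the section between positions a and b of one parent is copied into the other parent at the same positions, and the genes of the receiving parent that already appear in that section are removed so that the chromosome stays legal. The sketch assumes genes are (group, module) tuples, that a gene is identified by its group index, and that a and b are 1-based and inclusive. The swap mutation shown is one common choice and is an assumption; the text only states that one or more genes are altered.

```python
import random


def crossover(parent_i, parent_j, a, b):
    """Create two offspring by exchanging the section a..b (1-based, inclusive)."""
    if not 1 <= a < b <= len(parent_i) or len(parent_i) != len(parent_j):
        raise ValueError("need 1 <= a < b <= chromosome length and equal lengths")

    def make_child(receiver, donor):
        section = donor[a - 1:b]
        donated_groups = {group for group, _ in section}
        # Equivalent to replacing the donated genes by "0" in the receiver and
        # sliding the "0" symbols into positions a..b: the other genes keep
        # their relative order on both sides of the section.
        remaining = [gene for gene in receiver if gene[0] not in donated_groups]
        return remaining[:a - 1] + section + remaining[a - 1:]

    return make_child(parent_i, parent_j), make_child(parent_j, parent_i)


def swap_mutation(chromosome, rng=random):
    """Assumed mutation: exchange the positions of two randomly chosen genes."""
    child = list(chromosome)
    p, q = rng.sample(range(len(child)), 2)
    child[p], child[q] = child[q], child[p]
    return child


parent_i = [(1, 2), (3, 3), (2, 1), (5, 4), (6, 1)]
parent_j = [(6, 1), (5, 4), (3, 3), (2, 1), (1, 2)]
child_i, child_j = crossover(parent_i, parent_j, a=2, b=3)
print(child_i)
print(child_j)
```

Both offspring contain every group exactly once, which is the legality property the operator is designed to preserve.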

Case Study
Two hundred jobs are selected to simulate the problem of the collaborative company and to verify the effectiveness of the designed clustering and GA method. Table 2 shows the parameters of the four modules. Table 3 shows the information of the jobs, where the size and processing time columns are the initial data of the jobs, and the blocking time and waiting time are calculated from the best solution obtained by the designed algorithm.
Table 4 shows the information of the grouped jobs in HTS, where size range and processing time are the values of each group of jobs, blocking time is the minimum blocking time of all the jobs in the group, and the efficiency column is the efficiency of HTS as defined in Section 3. Table 5 shows the information of the grouped jobs in PFS, where size range and processing time are the values of each group of jobs, and waiting time is the actual waiting time of all the jobs in the group. Table 6 shows the scheduling result of the four modules in PFS: the working time and the setup time of each module in each machine at each loading, and the wait time of each machine, which is equal to the sum of the waiting time of the jobs in this machine minus the setup time of this machine. The efficiency of PFS is likewise calculated as defined in Section 3.
The result shows that the two stages have good efficiency. The efficiency of HTS is 88.31% and the efficiency of PFS is 95.34%. The efficiency of PFS is high, and each module can be used for close to its nominal working time. Therefore, we conclude that the computed solution is a suitable solution for this scheduling problem.

Conclusion
This paper introduces a clustering and genetic algorithm based method to solve the scheduling problem of a two-stage (HTS and PFS) hybrid flowshop. This problem is characterized by many constraints, such as batching operation, blocking environment, and the setup time and working time limitations of the modules. The clustering is used to group the jobs according to the processing ranges of the different modules of PFS. The genetic algorithm is used to search for the optimal sequence of the grouped jobs for HTS and PFS. The designed algorithm is shown to be useful through a case study. The method brings several advantages, such as faster processing time, better utilization, and higher efficiency of HTS and PFS. Currently, we use only simulated data to verify the designed clustering and genetic algorithm. According to the further requirements of the collaborative company, this method should be improved for application. Furthermore, more practical considerations, such as the line balance of PFS, will be added to the mathematical model. This will help to optimize the GA model and to achieve better scheduling results for real-life manufacturing companies.

Algorithm 1: Steps of crossover operator with multi-point.
Step 1. Create two crossover points to form the crossover-section.
    Randomly select two parents i and j.
    Randomly create two integers a and b, with a < b, both between 1 and the chromosome length.
    The genes from a to b of parents i and j are the crossover-sections.
    String[] tempI = the genes from the ath to the bth of parent i
    String[] tempJ = the genes from the ath to the bth of parent j
Step 2. Use the symbol "0" to replace the genes.
    Find the genes of tempJ in parent i and use "0" to replace them.
    String[] tempI0 = the new parent i with "0"
    Find the genes of tempI in parent j and use "0" to replace them.
    String[] tempJ0 = the new parent j with "0"
Step 3. Move the "0" symbols to the crossover-section.
    FOR each new parent (tempI0; tempJ0) DO
        Move the symbols "0" from both extremities toward the centre until they reach the crossover-section, namely from a to b.
    END FOR
    String[] tempI1 = the new parent i with centre "0" from a to b
    String[] tempJ1 = the new parent j with centre "0" from a to b
Step 4. Create new parents.
    Use tempJ to replace the "0" from a to b of tempI1.
    String[] parentI = the newly formed tempI1
    Use tempI to replace the "0" from a to b of tempJ1.
    String[] parentJ = the newly formed tempJ1
Outputs: String[] parentI; parentJ
Notations
The total number of jobs.
The set of jobs.
{1, 2, 3, 4}: the types of the modules in PFS; there are four types.
The setup time of each module.
The nominal working time of each module at each time of use.
The number of grouped jobs after clustering.
The set of grouped jobs; each group has five jobs.
The position of a job within its group in HTS.
The set of two elements that represents the number of machines of the two stages.
The processing time of each job in its group in HTS; it consists of three processing times.
The blocking time of each job in its group in HTS.
The minimum blocking time of the jobs in each group.
The processing time of each module of PFS at each loading.
0.1, 0.9: the weighting factors of HTS and PFS.
The start processing time of each module in each machine of PFS at each loading.
The finish processing time of each module in each machine of PFS at each loading.
The setup time of each module in each machine of PFS at each loading.
The start time in HTS.
The finish time in HTS.

Table 1: The rules of regroup.

Table 2: Parameters of the modules.

Table 3: Data of jobs.

Table 4: Data of grouped jobs in HTS.

Table 5: Data of grouped jobs in PFS.

Table 6: Scheduling result of modules in PFS.