A Makespan Optimization Scheme for NP-Hard Gari Processing Job Scheduling Using Improved Genetic Algorithm

An optimization scheme for minimizing the makespan of Gari processing jobs using a Genetic Algorithm (GA) with an improved initial population is proposed. The initial population of the GA was improved by using the job sequencing and dispatching rules of First Come First Served (FCFS), Shortest Processing Time (SPT), Longest Processing Time (LPT), and Modified Johnson’s Algorithm for m-machines, in order to obtain better schedules than are attainable by a GA with a freely generated initial population or by the individual traditional sequencing and dispatching rules. The traditional GA crossover and mutation operators as well as a custom-made remedial operator were used together with a hybrid of elitism and roulette wheel algorithms in the selection process based on job completion times. A test problem of 20 jobs with specified job processing and arrival times was simulated through the integral 5-process Gari production routine using the sequencing and dispatching rules, the GA with freely generated initial population, and the improved GA. Comparisons based on performance measures such as optimal makespan, mean makespan, execution time, and solution improvement rate established the superiority of the improved initial population GA over the traditional sequencing and dispatching rules and the freely generated initial population GA.


Introduction
One of the most important products obtained from the processing of cassava is "Gari." Gari is a creamy-white, granular flour with a slightly fermented flavor and a slightly sour taste made from fermented, gelatinized fresh cassava tubers. Gari processing industries occupy a substantial portion of small and medium enterprises (SMEs) in Nigeria. In the past few decades, research on Gari production has yielded tremendous gains, particularly in the areas of developing Gari processing machines and improving the quality of the product. However, little or no attention has been given to scheduling customers' orders in a way that would improve delivery performance and inventory management and reduce production cycle times and the overall cost associated with the production process. Hence, operational bottlenecks are often experienced in day-to-day activities, and the need for an appropriate way of processing orders becomes imperative.
Job scheduling in a Gari processing firm is analogous to a flow shop in which a set of n-jobs have to be processed with identical flow patterns on m-machines. The scheduling models developed in [1][2][3] rest on the following assumptions.
(i) Each of the n-jobs has the same ordering of machines for its process sequence.
(ii) At a time, every job is processed on one and only one machine, which means no job splitting.
(iii) Each of the m-machines can process only one job at a time.
(iv) The operations are not preemptible.
(v) The operation processing times on the machines are known and fixed.
(vi) Setup times of operations are independent of the sequences and therefore can be included in the processing time.

Journal of Industrial Engineering
The Gari processing scheduling problem is NP-hard [3], which makes it computationally difficult: with the known exact methods, the computation time increases exponentially with the number of jobs. There are no known algorithms for finding optimal solutions in polynomial time. Consequently, most research is devoted either to simplifying the scheduling problem to the point where some algorithms can find solutions or to devising efficient heuristics for finding better solutions. Motivated by the computational complexity of the problem, a makespan optimization scheme using a genetic algorithm is presented here to develop a structure for the suitable scheduling of orders (jobs) in a Gari processing firm. In practice, job shop scheduling has been approached mainly by using dispatching rules with the objective of finding a sequence that minimizes the makespan. For problems with 2 machines, or with 3 machines under specific constraints on job processing times, the efficient Johnson's algorithm [4] obtains an optimal solution. However, since this scheduling problem is NP-hard [5], the search for an optimal solution is of more theoretical than practical importance. Therefore, since the 1960s a number of heuristics that provide near-optimal or good solutions with limited computational effort have been proposed for flow shop sequencing. The performance of some earlier heuristics was evaluated in [6].
Heuristics can be classified into two major categories: constructive methods and improvement methods. The constructive algorithms directly obtain a solution for the scheduling problem, that is, an n-job sequence, by using some procedure which assigns to each job a priority or an index in order to construct the solution sequence (see, e.g., [6][7][8][9][10]). An improvement method starts from a given initial solution and looks for a better one, normally by using some neighborhood search procedure. Metaheuristics can also be considered as improvement heuristics. Among these groups of techniques are genetic algorithms (GAs), simulated annealing (SA), tabu search (TS), and their hybrids. The first proposed metaheuristics for the permutation flow shop scheduling problem (PFSP) are the simulated annealing algorithms by [11] demonstrating different tabu search approaches. Other algorithms are the path-based method of Werner [12] and the iterated local search (ILS) of Stutzle [13]. Recently, Rajendran and Ziegler [14] have proposed two very effective ant colony optimization (ACO) algorithms and Grabowski and Wodecki [15] a very fast TS approach. Ruiz and Maroto [16] give an updated and comprehensive review of flow shop heuristics and metaheuristics. In another recent study, Soewanda et al. [17] compared ant colony optimization with a combination of genetic algorithm and tabu search for solving the flow shop scheduling problem. Nagano et al. [18] proposed a hybrid genetic algorithm, which combines a genetic algorithm with the NEH algorithm to solve similar problems. Ruiz et al. [19] proposed a robust genetic algorithm that considered some new genetic operators, population initialization, and generation of new populations. In association with [20], Ruiz also proposed a robust hybrid-genetic algorithm. The paper discusses the comparison between ant colony, hybrid genetic, and robust-hybrid genetic algorithms.
This paper focuses on the use of genetic algorithms (GA). A genetic algorithm is a computerized iterative search optimization technique. It is based on the mechanics of natural selection and natural genetics and deals with populations of solutions rather than with a single solution. This type of algorithm provides near-optimal schedules. The quality of the solution obtained depends on operators and settings such as crossover, mutation, and the number of iterations (generations). In every generation, a new set of artificial individuals (strings) is created. Evolution of chromosomes over generations is by survival of the fittest. A GA searches a problem space with a population of chromosomes and selects chromosomes for a continued search based on their performance. In the context of optimization problems, each chromosome is decoded to form a solution in the problem space. Genetic operators are applied to high-performance structures (parents) in order to generate potentially fitter new structures (offspring). Therefore, good performers propagate through the population from one generation to the next [20]. Holland [21] presented a basic GA called the "simple genetic algorithm" in his studies, which is described in Algorithm 1.
A GA contains the following major ingredients: parameter setting, representation of a chromosome, initial population and population size, selection of parents, genetic operations, and a termination criterion.
This paper describes a makespan optimization scheme using a genetic algorithm with improved initial population as a structure for the suitable scheduling of orders (jobs) in a Gari processing firm. The makespan optimization scheme using GA has a number of new features compared to a traditional genetic algorithm. These include the generation and evolution of a 2-dimensional array of improved initial population, obtained by incorporating heuristics into the initialization so as to generate a well-adapted initial population. In this way, the genetic algorithm (GA) with elitism is guaranteed to do no worse than the conventional heuristic does. In this paper, the initial populations are generated using traditional dispatching rules, with string numbers and string lengths corresponding to the number of workstations and work orders, respectively. A hybrid of elitism and roulette wheel algorithms was used in the selection process based on job completion times decoded from chromosomes. The reproductive operators are the traditional crossover and mutation operators and a custom-made remedial operator. The performance of the formulated GA scheme is then compared, for the same test problem, with some other sequencing and dispatching rules as well as with a GA scheme with freely generated initial population. A 20-job test problem was simulated through the integral 5-process Gari production routine.
The rest of this paper is organized as follows: Section 2 discusses the Gari processing system and scheduling framework, Section 3 presents detailed descriptions of the proposed method and algorithm, and Section 4 presents the data analysis. In Section 4, an extensive comparison of the proposed method with some other sequencing and dispatching rules is undertaken.
Finally, in Section 5 some conclusions from the study are provided along with some future research directions.

The Gari Processing System
The Gari processing plant flowchart consists of the basic processes of peeling, grating, pressing, sieving, and roasting/drying (Figure 1). The firm that processes the cassava has one machine for each of these stages. Customers from all over the city arrive at the Gari processing firm at different periods of the day, with varying amounts of fresh cassava roots for processing.
In order to schedule the processing of customers' orders such that maximum profit is obtained, the principles guiding flow shop scheduling are adopted: the cassava processing plant is considered as a 5-machine flow shop system where customers are free to bring their jobs at any time. However, each customer's order (fresh roots) passes through the machines in the same order. Since different quantities are brought for processing and the fresh roots have the same surface area characteristics, each order requires a different amount of processing time. In the current work, the unit of measurement is the hour. Since test methods that could handle large numbers of orders are proposed here, the test problem considers a case of 20 customers. This corresponds to 20 individual jobs. The question is in what order the customers' orders should be processed so that the firm's profit is maximized. In order to solve this problem, the study considers a common measure of performance utilized in the flow shop scheduling literature: the makespan. The principle here is to monitor the completion time of the last scheduled customer's order. Much work has been done in understanding this principle, as stated by [22]. There is much support for the use of the makespan as a measure of performance, as evidenced by its extensive use in maximizing production rates and minimizing the mean idle time of machines [23].

Methodology
In this study, Gari processing jobs are scheduled, with makespan as the criterion, by a GA scheme whose initial population is generated using the job sequencing heuristics and dispatching rules of FCFS, SPT, LPT, and modified Johnson's algorithm, so that the initial population is well adapted. Before discussing how this improved GA is realized, the sequencing and dispatching rules used to improve the GA's initial population are discussed.

First Come First Served (FCFS) Rule.
Using this rule, jobs are scheduled in the order in which they arrive for production. This is the traditional rule for scheduling.

Shortest Processing Time (SPT) Rule.
Jobs are scheduled with this rule by sequencing them in ascending order of job processing times per process. The SPT scheme is depicted by Algorithm 2.

Longest Processing Time (LPT) Rule.
Jobs are scheduled with this rule by sequencing them in descending order of job processing times per process.
Step 0 of Algorithm 2 reads the set of job processing times for the process under consideration and the corresponding set of job numbers. To realize the equivalent longest processing time (LPT) algorithm, the premise of the conditional statement in Step 4 of Algorithm 2 is reversed, so that jobs are ordered by descending rather than ascending processing time.
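As an illustration of the SPT and LPT rules, the following minimal Python sketch sorts job numbers by their processing times on a single process. It is not the Algorithm 2 of this paper; the function name `sequence_by_processing_time`, the tie-breaking by job number, and the sample times are assumptions made only for this example.

```python
def sequence_by_processing_time(times, longest_first=False):
    """Return 1-based job numbers ordered by processing time on one process.

    times[j] is the processing time of job j+1 on the process considered.
    Ties are broken by job number (an assumption of this sketch).
    """
    jobs = range(1, len(times) + 1)
    if longest_first:
        # LPT: descending processing time
        return sorted(jobs, key=lambda j: (-times[j - 1], j))
    # SPT: ascending processing time
    return sorted(jobs, key=lambda j: (times[j - 1], j))


# Hypothetical processing times (hours) of four jobs on one process.
sample_times = [2.0, 0.5, 1.5, 0.5]
spt_order = sequence_by_processing_time(sample_times)
lpt_order = sequence_by_processing_time(sample_times, longest_first=True)
print(spt_order)  # [2, 4, 3, 1]
print(lpt_order)  # [1, 3, 2, 4]
```

Applying the function once per process yields one job order for each of the five stages.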

Modified Johnson's m-Machine Algorithm.
Johnson's method is optimal for two machines. Because it is optimal and easy to compute, it is adapted here to m machines (m > 2). The idea is as follows: imagine that each job requires m operations in sequence, on machines $M_1, M_2, \ldots, M_m$. We combine the first m/2 machines into an (imaginary) machining center, MC1, and the remaining machines into a machining center, MC2. Then the total processing time for a part on MC1 is equal to the sum of the operation times on the first m/2 machines, and the processing time for a part on MC2 is equal to the sum of the operation times on the last m/2 machines. By doing so, the m-machine problem is reduced to a two-machining-center scheduling problem. The modified Johnson's algorithm is depicted in Algorithm 3.
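The reduction to two machining centers can be sketched in Python as follows. This is a hedged illustration rather than the paper's Algorithm 3: the function name `modified_johnson` and the sample data are hypothetical, and for an odd number of machines the sketch assumes that MC1 takes the first floor(m/2) machines, which the text does not specify. The ordering step is the classical Johnson rule: jobs whose MC1 time does not exceed their MC2 time go first in ascending MC1 time, and the remaining jobs follow in descending MC2 time.

```python
def modified_johnson(times):
    """Sequence jobs by Johnson's rule on two aggregated machining centers.

    times[j][k] is the processing time of job j+1 on machine k (0-based).
    Assumption of this sketch: MC1 = first floor(m/2) machines, MC2 = the rest.
    Returns a list of 1-based job numbers.
    """
    m = len(times[0])
    half = m // 2
    mc1 = [sum(row[:half]) for row in times]   # aggregated time on MC1
    mc2 = [sum(row[half:]) for row in times]   # aggregated time on MC2
    jobs = range(len(times))
    # Jobs that are "short" on MC1 are scheduled early, in ascending MC1 time.
    first = sorted((j for j in jobs if mc1[j] <= mc2[j]), key=lambda j: mc1[j])
    # The remaining jobs are scheduled late, in descending MC2 time.
    last = sorted((j for j in jobs if mc1[j] > mc2[j]),
                  key=lambda j: mc2[j], reverse=True)
    return [j + 1 for j in first + last]


# Hypothetical 4-job, 4-machine instance (hours).
sample = [
    [3, 2, 1, 1],   # job 1: MC1 = 5, MC2 = 2
    [1, 1, 4, 2],   # job 2: MC1 = 2, MC2 = 6
    [2, 2, 2, 3],   # job 3: MC1 = 4, MC2 = 5
    [4, 2, 3, 1],   # job 4: MC1 = 6, MC2 = 4
]
johnson_order = modified_johnson(sample)
print(johnson_order)  # [2, 3, 4, 1]
```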

Parameters Setting.
The parameters in GA comprise the population size, number of generations, crossover probability, and mutation probability.These parameters are used as fixed values throughout the genetic evolution until the termination criterion is attained.
3.6. Encoding.
In GA, each solution is usually encoded as a bit string. During the past years, many encoding methods have been proposed for scheduling problems [24]. Among the various kinds of encoding methods, job-based encoding, machine-based encoding, and operation-based encoding are the ones most often used for scheduling problems. This study adopts a job-/machine-based encoding method. Each gene uniquely indicates an operation and can be determined according to its order of occurrence in the sequence. Let $O_{ij}$ denote the $i$th operation of job $j$. The chromosomes can be translated into a unique list of ordered operations typified by the concatenated strings making up individual chromosomes as shown in Table 1.
The GA parameters (population size, number of generations, crossover probability, and mutation probability) are subjectively specified to fit the characteristics of the problem at hand and the standard tested parameter ranges that research has established.
Step 1. Create a 2-dimensional array [250, 100] using random variations of the solution obtained with any of the traditional job sequencing and dispatching rules mentioned above (see Table 2).
Step 2. Using the corresponding job times for the job numbers in each byte position of each of the 5 strings in each chromosome, the makespan corresponding to each chromosome is computed according to the sequence in the chromosome.
Step 3. Arrange the reciprocal of the makespan (representing the fitness) obtained for the individual chromosomes of the initial population in descending order of magnitude.
Step 4. Obtain the corresponding cumulative fitness value for each chromosome as ordered in Step 3.
Step 5. Normalize the values of the cumulative fitness obtained and create class boundaries with them for roulette wheel selection.
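Steps 2 to 5 can be sketched in Python as follows. The sketch is illustrative only and rests on stated assumptions: a chromosome is taken to be a list of job orders, one per process; a job may start on a process only after it has arrived and finished the preceding process; and the makespan is the completion time of the last job. The names `makespan` and `roulette_boundaries` and the small 2-job, 2-process data set are hypothetical.

```python
from itertools import accumulate


def makespan(chromosome, times, arrivals):
    """Decode a chromosome into a schedule and return its makespan.

    chromosome[k] is the order (1-based job numbers) on process k.
    times[j][k] is the time of job j+1 on process k; arrivals[j] is its arrival.
    """
    ready = list(arrivals)  # earliest time each job can enter the next process
    for k, order in enumerate(chromosome):
        machine_free = 0.0
        for job in order:
            start = max(machine_free, ready[job - 1])
            machine_free = start + times[job - 1][k]
            ready[job - 1] = machine_free
    return max(ready)


def roulette_boundaries(makespans):
    """Fitness = 1/makespan, sorted descending, cumulated, then normalized."""
    fitness = sorted((1.0 / c for c in makespans), reverse=True)
    cumulative = list(accumulate(fitness))
    total = cumulative[-1]
    return [c / total for c in cumulative]


# Hypothetical instance: 2 jobs, 2 processes, both jobs arrive at time 0.
times = [[3, 2], [1, 4]]
arrivals = [0, 0]
chrom_a = [[2, 1], [2, 1]]   # job 2 first on both processes
chrom_b = [[1, 2], [1, 2]]   # job 1 first on both processes
spans = [makespan(chrom_a, times, arrivals), makespan(chrom_b, times, arrivals)]
bounds = roulette_boundaries(spans)
print(spans)   # [7.0, 9.0]
print(bounds)  # first boundary is 9/16, last boundary is 1.0
```

The chromosome with the smaller makespan receives the wider class on the wheel (here 9/16 of it), so it is more likely to be drawn as a parent.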

Selection.
At this stage selection of chromosomes that qualify as parents is made using a combination of roulette wheel selection and elitism.
Step 1 (selection by elitism). The chromosomes are classified into five distinct groups of 50, from each of which the first and the last chromosomes in the ordered class are selected automatically into the next generation. Under this scheme, 10 chromosomes are selected into the next generation.
Step 2 (selection using roulette wheel).Randomly generate pairs of integer numbers between 1 and 250 to determine parents for the genetic operation of reproduction.This scheme selects the balance of 240 chromosomes into the next generation.
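The two selection steps can be sketched as follows. This is a hedged illustration: the text does not state how the random integers of Step 2 are mapped onto the class boundaries, so the sketch assumes the conventional wheel, in which a uniform number in [0, 1) is drawn and the chromosome whose class contains it is chosen. The names `select_elites` and `roulette_pick` and the stand-in fitness values are hypothetical.

```python
import bisect
import random
from itertools import accumulate


def select_elites(ranked, group_size=50):
    """Take the first and last member of each consecutive group of the
    fitness-ordered population."""
    elites = []
    for start in range(0, len(ranked), group_size):
        group = ranked[start:start + group_size]
        elites.append(group[0])
        if len(group) > 1:
            elites.append(group[-1])
    return elites


def roulette_pick(boundaries, rng):
    """Return the index of the class that contains a uniform draw in [0, 1)."""
    u = rng.random()
    return min(bisect.bisect_left(boundaries, u), len(boundaries) - 1)


rng = random.Random(0)
ranked = list(range(250))                      # stand-ins for chromosomes, best first
elites = select_elites(ranked)                 # 10 chromosomes kept by elitism
fitness = [1.0 / (100 + i) for i in ranked]    # hypothetical descending fitness
cumulative = list(accumulate(fitness))
boundaries = [c / cumulative[-1] for c in cumulative]
# 120 parent pairs supply the remaining 240 chromosomes.
parents = [(roulette_pick(boundaries, rng), roulette_pick(boundaries, rng))
           for _ in range(120)]
print(elites)
```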

Crossover.
Crossover is an operation that generates new strings (i.e., children strings) from two parent strings. It is the main reproduction operator of GA. During the past years, various crossover operators have been proposed [25]. That work showed that the two-point crossover is effective for flow shop problems. Hence the two-point crossover method is used in this study.
The following steps are undertaken to execute the crossover operation.
Step 1. Generate a random number between 0 and 1 and compare it to the specified crossover probability. If the generated number is less than or equal to the specified probability, crossover is permitted; otherwise the pair passes unchanged into the next generation.
Step 2. To determine the string where crossover will occur, generate random integers between 1 and 5; the numbers generated represent the strings where crossover will take place.
Step 3. To determine the crossover positions, generate a random integer between 2 and 19 twice. Then exchange the contents of the two parent strings at each corresponding position between the two crossover points.
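A minimal Python sketch of the three crossover steps is given below. It assumes that a single string of the chromosome is crossed per operation and that the segment between the two cut positions (inclusive) is exchanged; both are assumptions of this example, as is the function name `two_point_crossover`. The children may contain repeated job numbers, which is why a remedial operator is needed afterwards.

```python
import random


def two_point_crossover(parent_a, parent_b, p_cross, rng):
    """Return two children; each parent is a list of strings (lists of jobs)."""
    child_a = [s[:] for s in parent_a]
    child_b = [s[:] for s in parent_b]
    if rng.random() <= p_cross:                      # Step 1
        s = rng.randrange(len(child_a))              # Step 2: string to cross
        n = len(child_a[s])
        # Step 3: two 1-based cut positions between 2 and n-1 (2..19 for n = 20)
        lo, hi = sorted((rng.randint(2, n - 1), rng.randint(2, n - 1)))
        child_a[s][lo - 1:hi], child_b[s][lo - 1:hi] = (
            child_b[s][lo - 1:hi], child_a[s][lo - 1:hi])
    return child_a, child_b


rng = random.Random(1)
parent_a = [list(range(1, 21)) for _ in range(5)]    # 5 strings of 20 jobs
parent_b = [list(range(20, 0, -1)) for _ in range(5)]
child_a, child_b = two_point_crossover(parent_a, parent_b, 1.0, rng)
```

In the call above the crossover probability is set to 1.0 only to force the exchange; the study itself uses 0.6.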

Mutation.
Mutation is another operator of GA. Such an operation can be viewed as a transition from a current solution to a neighboring solution in a local search algorithm. It is used to prevent premature convergence and falling into a local optimum. To determine the mutation parents, generate two numbers independently between 1 and 250.
Step 1. Generate a random number between 0 and 1 to ascertain whether a parent will be mutated. If the number is less than the specified mutation probability, mutation is permitted; otherwise the pair of chromosomes selected passes unchanged into the next generation.
Step 2. To determine the string where mutation will occur, generate a random integer between 1 and 5; the number generated represents the string where mutation will take place.
Step 3. To determine mutation position, generate a random integer between 1 and 20 and exchange the contents of the two chromosomes at the mutation position.
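The mutation steps can be sketched in Python as below. The sketch follows the description literally: one gene is exchanged between the two selected chromosomes at a random string and position. The function name `exchange_mutation` and the test data are hypothetical.

```python
import random


def exchange_mutation(chrom_a, chrom_b, p_mut, rng):
    """Swap one gene between two chromosomes with probability p_mut."""
    a = [s[:] for s in chrom_a]
    b = [s[:] for s in chrom_b]
    if rng.random() < p_mut:                 # Step 1
        s = rng.randrange(len(a))            # Step 2: string to mutate
        pos = rng.randrange(len(a[s]))       # Step 3: position within the string
        a[s][pos], b[s][pos] = b[s][pos], a[s][pos]
    return a, b


rng = random.Random(2)
parent_a = [list(range(1, 21)) for _ in range(5)]
parent_b = [list(range(20, 0, -1)) for _ in range(5)]
mut_a, mut_b = exchange_mutation(parent_a, parent_b, 1.0, rng)
# Number of genes in which the mutated chromosome differs from its parent.
changed = sum(x != y for sa, sb in zip(mut_a, parent_a) for x, y in zip(sa, sb))
print(changed)  # 1
```

As with crossover, the exchanged gene generally duplicates a job number already present in the receiving string, so the result must be repaired before evaluation.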

Remedial Operator.
A remedial operation is necessary after the traditional GA operators have been applied, because some of the numbers representing the jobs in a string are repeated. Consequently, some of the numbers in the range 1 to 20, each of which should occur exactly once, are omitted. Thus, there is a need to rectify the situation. The scheme for the remedial operator is explained below.
(1) Repetitions
Step 1. Search each string to determine if a value is repeated.
Step 2. Store the number of times it is repeated by indexing.
Step 3. Generate a random integer between 1 and the number of times the value is repeated.
Step 4. The value generated represents the index of the repeated value that will be retained.

(2) Omission
Step 1. Search each string to determine if a point value is omitted.
Step 2. Store the numbers omitted by indexing.
Step 3. Select one of the omitted numbers at random. The value selected is the omitted value that will be assigned to a vacant point.
Step 4. Generate the index of the vacant position randomly. The value generated represents the point to which the value selected in Step 3 will be assigned.
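The repetition and omission steps can be combined into one repair routine, sketched below in Python. It assumes that every gene is a job number between 1 and the number of jobs and that the string length equals the number of jobs; the function name `repair_string` is hypothetical. For each repeated value one occurrence is retained at random, the other occurrences become vacant, and the omitted values are assigned to the vacant positions in random order.

```python
import random


def repair_string(string, n_jobs, rng):
    """Return a copy of `string` in which each job 1..n_jobs occurs exactly once."""
    positions = {}
    for idx, job in enumerate(string):
        positions.setdefault(job, []).append(idx)

    vacant = []
    for job, idxs in positions.items():
        if len(idxs) > 1:
            keep = rng.choice(idxs)                  # occurrence that is retained
            vacant.extend(i for i in idxs if i != keep)

    missing = [job for job in range(1, n_jobs + 1) if job not in positions]
    rng.shuffle(missing)                             # random omitted value ...
    rng.shuffle(vacant)                              # ... to a random vacant point

    repaired = list(string)
    for idx, job in zip(vacant, missing):
        repaired[idx] = job
    return repaired


rng = random.Random(3)
damaged = [1, 2, 2, 4, 1, 6]          # jobs 3 and 5 omitted, 1 and 2 repeated
fixed = repair_string(damaged, 6, rng)
print(sorted(fixed))  # [1, 2, 3, 4, 5, 6]
```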
3.7.6. Evaluation.
The makespan is evaluated and the corresponding value is stored. The process is then repeated. If a better solution is obtained, it replaces the previous best value.
3.7.7. Termination.
The GA continues the above procedure until the stop criterion set by the user is attained. The commonly used criteria are as follows: (1) the number of executed generations; (2) a particular objective; and (3) the homogeneity of the population. This study uses 500 generations as the termination condition.

Test Problem.
A 20-job test problem was simulated through the integral 5-process Gari production routine. As noted in the preceding discussion, the goal is to determine an appropriate job order that will guarantee productivity and a timely response to customers' orders without delay. In the real world, jobs are described by arrival time, processing time, and due dates. In this work only the arrival times and processing times of jobs (peeling processing time, PPT; grating processing time, GPT; pressing processing time, PRPT; sieving processing time, SPT; and frying processing time, FrPT) were considered. Table 3 shows the job data.
3.9. Method of Solution.
All algorithms were coded and run within the MATLAB environment. The test problem was run with a MATLAB 7.5 program using the algorithm to evaluate the makespan for each of the chromosomes. The hardware used for running the M-file programme was a 1.67 GHz Intel Core Duo CPU T2300 with 512 MB of memory.

Methods of Analysis.
The results obtained by using different sequencing and dispatching rules to generate the initial population for the genetic algorithm optimization scheme are compared with the results from the GA scheme using a freely generated initial population and with those from the traditional sequencing and dispatching rules of shortest processing time, longest processing time, first come first served, and the modified Johnson's algorithm. The bases of comparison are the optimal mean-makespans obtained and the programme execution times for the various options.
The Mean-Makespan. The makespan for a set of jobs under consideration is the total of the time durations, through all five integral production processes, of all jobs received within a time horizon in the Gari processing job shop. The mean-makespan is thus the quotient of the makespan and the number of jobs processed. The optimal mean-makespan is functionally defined as
$$\bar{M}_{\text{opt}} = \frac{1}{n} \sum_{j=1}^{n} T_j^{\min},$$
where $n$ is the number of jobs processed and $T_j^{\min}$ is the minimum time (span or duration) for job $j$ to be completed through the five production processes.
$T_j^{\min}$ is also equivalent to the minimum difference between the completion time of job $j$ and its arrival time in the job shop. Assuming there is no breakdown within the system of production, this span depends on the sequence of jobs at each stage or process of production. For the traditional sequencing and dispatching rules the sequences are well defined. For the new approach used in this work, an improved solution is sought with sequences evolving through the genetic algorithm optimization process.
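As a small numerical illustration of the mean-makespan described above, the following Python sketch divides the total of the job spans (completion time minus arrival time) by the number of jobs. The function name `mean_makespan` and the sample times are hypothetical, and the sketch assumes the spans are measured in hours.

```python
def mean_makespan(completions, arrivals):
    """Average span of the jobs: sum of (completion - arrival) over the job count."""
    spans = [c - a for c, a in zip(completions, arrivals)]
    return sum(spans) / len(spans)


# Hypothetical completion and arrival times (hours) for two jobs.
value = mean_makespan([7.0, 5.0], [0.0, 1.0])
print(value)  # 5.5
```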
Execution Time. As a basis for practical implementation, the execution time of a scheduling procedure of the sort used in this work is paramount. If a particular scheme is to be more beneficial than any other to industrial practitioners who are time-conscious, it must execute faster. This informs the comparison of the execution times of the different scheduling options used in this work.
Population Variation within the Genetic Algorithm (GA) Scheme. Because of the subjective nature of the choice of GA parameters, the population size of the GA scheme was varied in this work, and both the mean-makespans and the execution times were compared across the varying GA populations.

Results and Analysis
Table 4 shows the result obtained when the test problem was addressed using a freely generated initial GA population. Table 5 shows the result obtained using the four traditional job sequencing and dispatching methods described in Section 3. A mix of the job orders obtained in Table 5 constitutes the family of initial populations. The results obtained by running the algorithm with an improved 2-dimensional array initial GA population, generated from the traditional job sequencing and dispatching rules of FCFS, SPT, LPT, and modified Johnson's algorithm, are shown in Table 6.
From Tables 4-6 it can be deduced that when the algorithm was tested with an improved initial GA population, a better optimal makespan was realized than with the freely generated initial GA population. Generally, when the algorithm was tested with the improved initial GA population, a large difference was observed in both the execution time and the optimal makespan. Graphical illustrations of the above analysis are shown in Figures 2-4 utilizing the data in Tables 4-6.
Figure 2 shows that when heuristics are incorporated in the generation of the initial population using traditional dispatching rules, a better optimal makespan is obtained. The result also indicates that as the size of the initial population increases, the corresponding execution time also increases, but the execution time obtained with a freely generated initial GA population is quite high when compared with the execution time obtained using the improved initial GA population.
From Figure 3, by comparing the optimal makespans obtained, better optimal makespans are obtained with the improved initial GA population than with the randomly (freely) generated initial population. Both plots slope downward from left to right: as the population size increases, a better optimal makespan is obtained.
The improved initial population proves to be much more effective.
It can be deduced from Figures 4 and 5 that a better optimal makespan is obtained with the improved initial GA population.
Table 7 shows the improvement rate in optimal makespan when the algorithm was run with the improved initial population over the freely generated initial population; a remarkable improvement was observed when the population size was 250 chromosomes. The average improvement rate of the improved initial population over the freely generated initial population is 0.360. This further confirms the superiority of the method over the other methods considered in this work.

Conclusion
This study developed a makespan optimization scheme using a genetic algorithm with improved initial population as a structure for the suitable scheduling of orders (jobs) in a Gari processing firm, with the initial GA population based on job sequencing and dispatching rules. A number of new features were incorporated into the improved initial GA population. Several yardsticks were used for quantifying the performance of the scheme; the computed results were then compared with those obtained when the same test problem was addressed using a freely generated GA initial population and four traditional job dispatching rules, namely, first come first served (FCFS), shortest processing time (SPT), longest processing time (LPT), and modified Johnson's algorithm. Analyses based on the computed results show that the method improves on the results obtained by using a freely generated initial GA population and the traditional job sequencing and dispatching rules. A better optimal makespan is obtained with the improved initial GA population; as the size of the initial population increases, the corresponding execution time increases, and the execution time obtained with a freely generated initial GA population is quite high when compared with the execution time obtained using the improved initial GA population. In conclusion, GA provides a variety of options and parameter settings which still have to be fully investigated. This study has demonstrated that the scheduling of orders (jobs) in a Gari processing firm can be addressed by means of GA and suggests that such procedures are well worth exploring in the context of solving large and difficult combinatorial problems, which may be of interest to industrial practitioners and academic researchers in the field of evolutionary computing and machine scheduling.

3.7. The Algorithm
3.7.1. Generation of Initial Population.
The initial population sets are generated using four traditional job dispatching rules: first come first served (FCFS), shortest processing time (SPT), longest processing time (LPT), and modified Johnson's algorithm. The following GA parameters were used:
(i) population size = 250;
(ii) number of strings = 5 (corresponding to the five stages of Gari processing);
(iii) individual string length = 20 bytes (corresponding to the number of jobs used for the test problem here);
(iv) crossover probability = 0.6;
(v) mutation probability = 0.05;
(vi) number of generations = 500.

Figure 4: Mean makespan against population (with freely generated initial GA population).

Table 2: Typical Array of Job Orders in Chromosomes.

Table 3: Data on Gari processing job flow.

Table 4: Optimal makespan running the GA without improved initial population.

Table 5: Optimal makespan with traditional job sequencing and dispatching rules.

Table 6: Optimal makespan for running the GA with improved initial population.

Table 7: Improvement Rate of Improved GA over Freely generated GA.