Diversity Controlling Genetic Algorithm for Order Acceptance and Scheduling Problem

Selection and scheduling are important topics in production systems. To tackle the order acceptance and scheduling problem on a single machine with release dates, tardiness penalties, and sequence-dependent setup times, this paper proposes a diversity controlling genetic algorithm (DCGA), in which a diversified population is maintained throughout the search by a survival selection that considers both the fitness and the diversity of individuals. To measure the similarity between individuals, a modified Hamming distance that ignores the unaccepted orders in the chromosome is adopted. The proposed DCGA was validated on 1500 benchmark instances with up to 100 orders. Compared with state-of-the-art algorithms, the experimental results show that DCGA significantly improves solution quality in terms of the deviation from the upper bound.


Introduction
In make-to-order production systems with limited capacity and tight delivery requirements, the order acceptance and scheduling problem is an important topic. Order acceptance decides which of the orders submitted by customers are to be accepted, while scheduling fixes the start time for processing each accepted order. Accepting orders without taking into account the capacity and delivery requirements may delay some of the orders and decrease the revenue. Therefore, acceptance and scheduling decisions have to be made jointly in order to maximize the total net revenue.
Diverse problems that combine the decisions of selection and scheduling have been studied by both researchers and practitioners over the past two decades. Slotnick [1] summarized variants of acceptance and scheduling problems under different settings and objectives from the literature. In this study, we focus on a static single machine environment, which is the foundation of other research in this area. Typical exact algorithms have been applied to solve the problem. For example, Hall and Magazine [2] incorporated Lagrangian relaxation into dynamic programming without considering sequence-dependent setup times and tardiness penalties. Gordon and Strusevich [3] presented a dynamic programming approach for a similar problem in which the processing time of each job depends on its position in the processing sequence. Charnsirisakskul et al. [4,5] put forward a mixed-integer programming method for a single machine production system where setup times were negligible. Oguz et al. [6] gave a mixed-integer linear programming formulation of the order acceptance and scheduling problem, which can be solved optimally for problems with up to 15 orders. Nobibon and Leus [7] devised two exact branch-and-bound procedures for a generalization of the order acceptance and scheduling problem. Slotnick and Morton [8] developed an optimal branch-and-bound procedure using a linear (integer) relaxation for bounding and performed job acceptance and sequencing jointly. However, because the problem is strongly NP-hard [6], only small or moderate size order acceptance and scheduling problems can be solved optimally using exact algorithms. Thus, it is crucial to apply advanced approximate algorithms to solve problems in practical applications with large numbers of orders.
In terms of approximate algorithms, Cesaret et al. [9] proposed a tabu search algorithm supported by a probabilistic local search procedure for the order acceptance and scheduling problem with the objective of maximizing total revenue minus tardiness penalty. Lin and Ying [10] presented an artificial bee colony algorithm combined with an iterated greedy heuristic for exactly the same problem. Rom and Slotnick [11] proposed a genetic algorithm complemented with a probabilistic local search for order acceptance and scheduling under the assumption that all orders are released at the same time. Xiao et al. [12] extended the order acceptance problem to permutation flow shop scheduling and put forward a simulated annealing algorithm coupled with partial optimization. Yang and Geunes [13] studied a single resource scheduling problem with job-selection flexibility, tardiness costs, and controllable processing times and proposed a greedy randomized adaptive search procedure.
In our previous work [14], we proposed an improved genetic algorithm with local search (IGAL) for the order acceptance and scheduling problem. The genetic algorithm is an effective metaheuristic for global optimization. However, a traditional genetic algorithm often takes a comparatively long time to converge to the global optimum [15,16]. Individuals that congregate hastily within a small region of the search space cause premature population convergence [17], which leaves the algorithm stuck in local optima. Maintaining a diversified population is an effective way to prevent premature convergence. Diversity can be measured at either the genotype level or the phenotype level [18]. Vidal [19,20] characterized each individual in the population by a biased fitness composed of its solution cost and its diversity contribution for vehicle routing problems with time windows. The diversity contribution was defined as the average Hamming distance from one individual to its closest neighbors. For numerical optimization problems, Mc Ginley [17] introduced standard population diversity (SPD) and healthy population diversity (HPD) to adapt the genetic operators. SPD describes the level of solution space diversity, while HPD describes the level of fitness-weighted diversity. The gene-wise Euclidean distance and the fitness-weighted distance are used as the measures of SPD and HPD, respectively. Masisi et al. [21] and Misevičius [22] applied entropy as a genotype diversity measurement. As Burke et al. [23] have pointed out, diversity measurements are problem specific. There is no single measurement that fits all problems [18].
In this paper, we study how to improve IGAL by introducing a diversity controlling mechanism into the genetic algorithm. A new diversity controlling genetic algorithm (DCGA), which extends our previous work [14] with this mechanism, is presented. The diversity measure designed for the order acceptance and scheduling problem differs from the measures mentioned in the previous paragraph. Not all the orders in the chromosome are considered when calculating the diversity. We do not take into account the orders that are not accepted. Only the accepted orders, which contribute to the fitness of the solution, are considered. In the survival selection operator, instead of the commonly used "no fitness duplicates" [24] strategy, we allow individuals with duplicate fitness only if they are distinct from each other to a certain extent according to the diversity measurement. Based on the diversity controlling mechanism, a healthy, diverse population is maintained during the search process. Extensive experiments are conducted on 1500 instances with up to 100 orders. The experimental results illustrate that the performance of the genetic algorithm is improved significantly through diversity controlling during the search process. The efficacy of the proposed algorithm is also verified by comparison with several other algorithms from the literature.
The remainder of this paper is organized as follows. Section 2 describes the order acceptance and scheduling problem formally. Section 3 details the proposed diversity controlling genetic algorithm. Section 4 presents the experimental studies. Finally, Section 5 concludes this paper with some remarks and future research directions.

Order Acceptance and Scheduling Problem
The order acceptance and scheduling problem can be described as follows. In a single machine environment, there are $n$ incoming orders to be processed. No precedence constraints exist between the orders. The machine can process only one order at a time, without preemption. The machine needs processing time $p_i$ to process order $i$. Order $i$ is released at time $r_i$, after which the order is available for processing. If order $i$ is accepted, its completion time should be no later than its deadline $\bar{d}_i$; otherwise, the customer will not pay for the order because it is delivered too late. If order $i$ is accepted and delivered without tardiness, that is, the completion time $C_i$ is no later than the due date $d_i$, revenue $e_i$ is obtained. If order $i$ is tardy, its revenue decreases by $w_i = e_i/(\bar{d}_i - d_i)$ per unit time of delay beyond the due date $d_i$ ($d_i < \bar{d}_i$). We denote the revenue gained for accepting order $i$ by $R_i = x_i \cdot (e_i - w_i \cdot T_i)$, where $x_i$ is an indicator that equals 1 if order $i$ is accepted and 0 otherwise, and $T_i = \max\{0, C_i - d_i\}$ is the tardiness of the order. The objective is to find a subset of orders and fix their start processing times so as to maximize the total net revenue $\sum_{i=1}^{n} R_i$. This problem has been proved to be strongly NP-hard [6].
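As a worked illustration of the revenue definition, consider hypothetical numbers that are not taken from the benchmark: an accepted order with revenue 10, due date 20, and deadline 25 that completes at time 22. Writing $e_i$ for the revenue, $d_i$ for the due date, $\bar{d}_i$ for the deadline, $C_i$ for the completion time, $w_i$ for the penalty weight, $T_i$ for the tardiness, and $x_i$ for the acceptance indicator, the net revenue would be:

```latex
\begin{align*}
w_i &= \frac{e_i}{\bar{d}_i - d_i} = \frac{10}{25 - 20} = 2, \\
T_i &= \max\{0,\, C_i - d_i\} = \max\{0,\, 22 - 20\} = 2, \\
R_i &= x_i \,(e_i - w_i T_i) = 1 \cdot (10 - 2 \cdot 2) = 6 .
\end{align*}
```

With the same data, completing exactly at the deadline ($C_i = 25$) gives $T_i = 5$ and $R_i = 0$, so the penalty weight is scaled such that the revenue vanishes at the deadline.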

Diversity Controlling Genetic Algorithm
The proposed diversity controlling genetic algorithm starts by initializing the population to a group of chromosomes, each of which represents a feasible solution. The fitness of each chromosome is calculated to evaluate the quality of the solution. A subset of individuals is selected from the population, and two of them are chosen for the mating operation (crossover, mutation) to generate offspring. This procedure is repeated until the subpopulation reaches its maximum number of children. Survival selection is then performed to generate the population for the next generation. The algorithm stops and returns the best feasible solution when the stopping criterion is reached. Algorithm 1 illustrates the procedure of the diversity controlling genetic algorithm.
The population initialization, fitness evaluation, crossover operator, mutation operator, and local search methods are exactly the same as in our previous work [14]. They are summarized in Sections 3.1 to 3.5.

Population Initialization.
We adopt the permutation representation as a chromosome for an individual. For a problem with $n$ orders, a sequence of integers ranging from 1 to $n$ represents a solution in which each gene corresponds to an order. Random key representation [25] is applied to generate the initial chromosomes. We first generate a sequence of $n$ random decimal numbers in (0, 1) and then sort them in ascending order. The position of each decimal number in the original sequence is recorded as an integer, which represents an order. For example, according to this method, the decimal sequence (0.33, 0.92, 0.48, 0.18, 0.76) represents the integer sequence (4, 1, 3, 5, 2). We randomly generate twice as many individuals as the parent population size and select the best half of them as the initial parent population.
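The random key decoding can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation (which is written in C#); the function names `decode_keys` and `random_key_permutation` are hypothetical.

```python
import random


def decode_keys(keys):
    """Turn a list of random keys into a permutation of 1..n.

    The 1-based positions of the keys are listed in ascending key order.
    """
    order = sorted(range(len(keys)), key=lambda pos: keys[pos])
    return [pos + 1 for pos in order]


def random_key_permutation(n, rng=random):
    """Draw n random keys in [0, 1) and decode them into a chromosome."""
    return decode_keys([rng.random() for _ in range(n)])


# The example from the text.
print(decode_keys([0.33, 0.92, 0.48, 0.18, 0.76]))  # [4, 1, 3, 5, 2]
```

Sorting positions by key value, rather than sorting the keys themselves, yields the order identifiers directly.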

Fitness Evaluation.
The processing sequence of the orders is fixed according to their positions in the chromosome. The fitness evaluation for each chromosome is described as follows.
(i) Choose the next order in the sequence to evaluate, until the end of the sequence is reached.
(ii) Try to set the start time of the chosen order at the earliest possible start time. The order is accepted if its end time does not go beyond the deadline and rejected otherwise. If the chosen order is accepted, the revenue gained by accepting it is recorded. When order $i$ is accepted and order $j$ is the succeeding order to be evaluated, the earliest possible start time of order $j$ is $\max\{r_j, C_i\} + s_{ij}$, where $C_i$ is the end time of order $i$ and $s_{ij}$ is the sequence-dependent setup time.
(iii) The revenue of each accepted order is added to the total net revenue of the solution.
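The three evaluation steps can be sketched as a single pass over the chromosome. The sketch below is illustrative only. It assumes that the setup time is incurred after the later of the release date and the previous completion time, that a dummy order 0 supplies the setup time of the first accepted order, and that every deadline is strictly later than its due date. The function name `evaluate` and the dictionary-based data layout are hypothetical.

```python
def evaluate(sequence, r, p, d, dbar, e, s):
    """Return (total net revenue, accepted orders) for one chromosome.

    r, p, d, dbar, e map an order ID to its release date, processing time,
    due date, deadline, and revenue. s[(i, j)] is the setup time when order j
    follows order i; key 0 stands for "no previous order" (an assumption).
    """
    total, accepted = 0.0, []
    prev, prev_end = 0, 0.0
    for j in sequence:
        start = max(r[j], prev_end) + s[(prev, j)]  # earliest possible start
        end = start + p[j]
        if end > dbar[j]:
            continue  # rejected: the deadline would be missed
        tardiness = max(0.0, end - d[j])
        weight = e[j] / (dbar[j] - d[j])  # penalty per unit of tardiness
        total += e[j] - weight * tardiness
        accepted.append(j)
        prev, prev_end = j, end
    return total, accepted


# Hypothetical three-order instance with unit setup times.
r = {1: 0, 2: 0, 3: 0}
p = {1: 4, 2: 5, 3: 3}
d = {1: 6, 2: 8, 3: 8}
dbar = {1: 8, 2: 10, 3: 12}
e = {1: 10, 2: 8, 3: 12}
s = {(i, j): 1 for i in range(4) for j in range(1, 4) if i != j}
print(evaluate([1, 2, 3], r, p, d, dbar, e, s))  # (19.0, [1, 3])
```

In this instance order 2 would finish at time 11, past its deadline of 10, so it is rejected. Order 3 then follows order 1 directly and is one time unit tardy.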

Crossover.
A crossover operator recombines the genes of two chromosomes and generates offspring that inherit part of the characteristics of the parents. A variety of crossover operators have been reviewed by Potts et al. [26] and by Poon and Carter [27]. In this study, we apply the same-site-copy-first principle [28], which was originally proposed for the production scheduling problem, to perform the crossover operation in the order acceptance and scheduling problem. The crossover procedure is described as follows.
(i) Any gene code that occupies the same position in both parents is fixed at that position in the offspring.
(ii) The remaining positions in the offspring are assigned by the order of all gene codes in Parent 1 within the sequence bounded by two randomly selected points.
(iii) The remaining unassigned positions are filled in the order of appearance in Parent 2.
The crossover with same-site-copy-first principle is illustrated in Figure 1.
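One possible reading of the three crossover steps is sketched below. Step (ii) leaves some room for interpretation. This sketch assumes that the Parent 1 genes between the two cut points are copied to the same positions in the offspring, which may differ in detail from the operator in [28]. The function name `same_site_copy_first` is hypothetical.

```python
import random


def same_site_copy_first(p1, p2, rng=random):
    """Sketch of a same-site-copy-first crossover for two permutations."""
    n = len(p1)
    child = [None] * n
    # (i) Genes that sit at the same position in both parents stay there.
    for i in range(n):
        if p1[i] == p2[i]:
            child[i] = p1[i]
    # (ii) Copy the Parent 1 segment between two random cut points in place
    #      (assumed interpretation of the step).
    a, b = sorted(rng.sample(range(n), 2))
    for i in range(a, b + 1):
        child[i] = p1[i]
    # (iii) Fill the remaining positions in Parent 2's order of appearance.
    used = {g for g in child if g is not None}
    rest = iter([g for g in p2 if g not in used])
    return [g if g is not None else next(rest) for g in child]


parent1 = [1, 2, 3, 4, 5, 6, 7, 8]
parent2 = [3, 2, 1, 4, 8, 6, 5, 7]
print(same_site_copy_first(parent1, parent2, random.Random(0)))
```

Because both parents are permutations, a same-site gene can never reappear inside the copied segment, so the offspring is always a valid permutation.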

Mutation.
Mutation is adopted to prevent the newly formed children from being trapped in their particular local optima. Several mutation operators have been devised for permutation problems, such as adjacent two-change, arbitrary two-change, arbitrary three-change, and shift-change [29]. Shift-change is applied here because it changes the actual positions of some gene codes to obtain diversity while keeping the relative positions of other gene codes to inherit traits of the parents. In a shift-change, two positions are selected randomly, the gene at one selected position is moved to the other, and all genes within the sequence bounded by the two selected positions are shifted accordingly. The shift-change mutation procedure is illustrated in Figure 2.

The iterated greedy procedure [30] applies the destruction phase and the construction phase iteratively. The destruction phase eliminates some orders from the incumbent solution, while the construction phase reinserts the eliminated orders sequentially into the sequence. The iterated greedy procedure is also applied by Lin and Ying [10]. For detailed descriptions and illustrations of the procedure, the reader is referred to the work of Ruiz and Stützle [30] and the work of Lin and Ying [10].

Diversity Controlling.
Compared with our previous work [14], the algorithm proposed in this paper adopts a diversity controlling mechanism in the search process. An improved distance measurement is proposed to measure the difference between individuals. In the parent selection procedure, we take into account not only the fitness of individuals but also the diversity between them. In the survival selection procedure, we select a variety of individuals that are distinct from each other even though they might share the same fitness values.

Difference Measurement.
The Hamming distance is a frequently used measurement indicating the difference between two individuals. The difference between two individuals represented by a normalized Hamming distance [20] is defined as

$$\mathrm{diff}(u, v) = \frac{1}{n} \sum_{i=1}^{n} \mathbf{1}(u[i] \neq v[i]),$$

where $u[i]$ and $v[i]$ are the $i$th genes of individuals $u$ and $v$, and $\mathbf{1}(\mathrm{cond})$ returns 1 if the condition cond is true and 0 otherwise. However, in the order acceptance and scheduling problem, not all the orders represented in chromosomes can be accepted. When calculating the Hamming distance between two individuals, taking into account all the orders, a subset of which are not accepted and scheduled, might not reflect the real difference between them. For example, Figure 3 illustrates two individuals composed of 10 orders. The integers in shadow mark the acceptance status of the orders.

According to this new measurement, $\mathrm{diff}(A, B) = 30\%$ and $\mathrm{diff}(A, C) = 20\%$, which shows that individuals $A$ and $C$ are actually more similar than individuals $A$ and $B$.
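The two measurements can be compared in a short sketch. The normalized Hamming distance follows its usual definition. The accepted-order variant below is only one reading that reproduces the quoted 30% and 20% values from the accepted-order sequences (6, 1, 2, 7, 9, 10), (6, 7, 1, 2, 9, 10), and (6, 7, 2, 1, 9, 10) given in the text. It compares the accepted sequences position by position and divides by the total number of orders. The function names and the labels `A`, `B`, `C` are assumptions made for illustration.

```python
from itertools import zip_longest


def hamming_diff(u, v):
    """Normalized Hamming distance between two equal-length chromosomes."""
    return sum(a != b for a, b in zip(u, v)) / len(u)


def accepted_diff(acc_u, acc_v, n):
    """Position-wise mismatches between accepted-order sequences, over n orders.

    Assumed reading of the modified measurement. A position present in only
    one of the two sequences counts as a mismatch.
    """
    return sum(a != b for a, b in zip_longest(acc_u, acc_v)) / n


A = [6, 1, 2, 7, 9, 10]
B = [6, 7, 1, 2, 9, 10]
C = [6, 7, 2, 1, 9, 10]
print(accepted_diff(A, B, 10))  # 0.3
print(accepted_diff(A, C, 10))  # 0.2
```

Under this reading, rejected orders never contribute to the difference, so two chromosomes that differ only in their rejected genes are treated as identical.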

Survival Selection.
Survival selection decides which individuals in the subpopulation are selected as the parent population of the next generation. The objective of this problem is to maximize the total net revenue of accepted orders, so we sort the individuals in the subpopulation by fitness in descending order. If several individuals share the same fitness, we remove those that are too similar to each other according to the difference measurement. If the number of remaining individuals in the subpopulation is less than the size of the parent population, we randomly generate new individuals and add them to the remaining individuals to form the father population; otherwise, the best-fit remaining individuals, as many as the parent population size, are selected as the father population of the next generation. The survival selection procedure is described in Algorithm 2.
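The survival selection described above can be sketched as follows. This is a minimal sketch rather than the authors' implementation. The fitness function, the difference function, the threshold, and the generator of new individuals are passed in as parameters, and all names are hypothetical.

```python
def survival_selection(subpop, fitness, diff, n_parents, threshold, new_individual):
    """Keep the fittest individuals, dropping near-duplicates of equal fitness."""
    ranked = sorted(subpop, key=fitness, reverse=True)
    removed = [False] * len(ranked)
    survivors = []
    for i, ind in enumerate(ranked):
        if removed[i]:
            continue
        for j in range(i + 1, len(ranked)):
            if removed[j]:
                continue
            if fitness(ind) > fitness(ranked[j]):
                break  # sorted descending: no later individual ties with ind
            if diff(ind, ranked[j]) < threshold:
                removed[j] = True  # same fitness and too similar
        if len(survivors) < n_parents:
            survivors.append(ind)
    # Top up with fresh random individuals if too few distinct ones remain.
    while len(survivors) < n_parents:
        survivors.append(new_individual())
    return survivors


def hamming(u, v):
    return sum(a != b for a, b in zip(u, v)) / len(u)


scores = {(1, 2, 3, 4): 10, (2, 1, 4, 3): 10, (4, 3, 2, 1): 7}
subpop = [(1, 2, 3, 4), (1, 2, 3, 4), (2, 1, 4, 3), (4, 3, 2, 1)]
print(survival_selection(subpop, scores.get, hamming, 3, 0.1, lambda: (3, 4, 1, 2)))
# [(1, 2, 3, 4), (2, 1, 4, 3), (4, 3, 2, 1)]
```

The exact duplicate of (1, 2, 3, 4) is removed, while (2, 1, 4, 3) survives despite sharing the same fitness because it differs in every position.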
To observe the diversity change during the search process, we define the population diversity in terms of the differences between pairs of individuals $P[i]$ and $P[j]$, the $i$th and $j$th individuals in population $P$. Figure 4 illustrates the population diversity change on test dataset Dataslack 100orders Tao9R9 1 over 1500 generations. In the first few generations, the population diversity is relatively small. This is because the population is randomly initialized and only a few orders are accepted in the randomly generated chromosomes. Figure 5 shows the change of the average number of accepted orders and Figure 6 shows the change of the best fitness. We can see that as the search proceeds, the population diversity does not show obvious change. The population after 1500 generations is as diversified as it is when the search just starts.

Parent Selection for Mating.
The Best-Last Mating [31] with similarity preference is applied here. We select $k$ individuals without replacement through tournament selection from the parent population. The fittest of the $k$ individuals is selected as parent $p_1$. Among the others, the individual that is most different from $p_1$ is selected as parent $p_2$. The selected parents $p_1$ and $p_2$ then undergo crossover and mutation with probabilities $p_c$ and $p_m$, respectively, to generate new offspring.
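The parent selection step can be sketched as below. The sketch simplifies the tournament to a uniform random draw of a group without replacement, which is an assumption. The function name `select_parents` and its parameters are hypothetical, and the group size must be at least 2.

```python
import random


def select_parents(population, fitness, diff, k, rng=random):
    """Pick the fittest of k sampled individuals and its most different mate."""
    idx = rng.sample(range(len(population)), k)  # k distinct positions
    best = max(idx, key=lambda i: fitness(population[i]))
    mate = max(
        (i for i in idx if i != best),
        key=lambda i: diff(population[best], population[i]),
    )
    return population[best], population[mate]


def hamming(u, v):
    return sum(a != b for a, b in zip(u, v)) / len(u)


population = [(1, 2, 3, 4), (1, 2, 4, 3), (4, 3, 2, 1), (2, 1, 4, 3)]
scores = dict(zip(population, [5, 9, 3, 7]))
# With k equal to the population size the outcome does not depend on the draw.
print(select_parents(population, scores.get, hamming, 4))
# ((1, 2, 4, 3), (4, 3, 2, 1))
```

Pairing the fittest individual with its most dissimilar group member is what ties mating to the diversity measurement. A fitter but nearly identical mate is passed over.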

Experimental Studies
Test Instances.
The proposed DCGA was evaluated on 1500 benchmark instances with various problem sizes and parameter values. The data sets were designed by Cesaret et al. and can be found in [9]. The instances vary in three factors: the number of orders ($n$), the tardiness factor ($\tau$), and the due date range ($R$). The number of orders was set to 10, 15, 20, 25, 50, and 100. The tardiness factor and due date range were both set to 0.1, 0.3, 0.5, 0.7, and 0.9. The procedure used by Cesaret et al. [9] to generate the data sets is as follows. For each order $i$ ($i = 1, 2, \ldots, n$), a processing time $p_i$ and a revenue $e_i$ were randomly generated from the uniform distribution [0, 20]. A release date $r_i$ was generated from $[0, \tau p_T]$, where $p_T$ is the total processing time of all orders. The sequence-dependent setup times $s_{ij}$ ($j = 1, 2, \ldots, n$, $i \neq j$) were generated randomly from the uniform distribution [0, 10]. A due date was generated as $d_i = r_i + \max_{j=0,1,\ldots,n} s_{ji} + \max\{\mathrm{slack}, p_i\}$, where slack was generated randomly from the uniform distribution $[p_T(1 - \tau - R/2), p_T(1 - \tau + R/2)]$. A deadline was generated as $\bar{d}_i = d_i + R e_i$. A weight was calculated as $w_i = e_i/(\bar{d}_i - d_i)$. All of the generated parameters except $w_i$ are integers.

Parameter Settings.
Since the main purpose of this paper is to validate the efficacy of diversity controlling in GA for the order acceptance and scheduling problem, we did not apply any sophisticated parameter control strategies. Instead, we used a simple parameter study to select the most suitable fixed parameter settings for DCGA manually. Based on our preliminary experiments, the following parameter setting is suggested. The size of the father population and the size of the subpopulation are set to 40 and 120, respectively. The crossover probability and the mutation probability are set to 0.8 and 0.2, respectively. Two further parameters are set to 8 and 0.1, respectively. If the best fitness does not improve within 200 consecutive iterations or the maximum of 1500 iterations is reached, the algorithm stops and returns the best solution.

Experimental Results and Discussions.
The proposed diversity controlling genetic algorithm (DCGA) is implemented in the C# programming language and run on a PC with an Intel i5 CPU (2.4 GHz) and 2 GB RAM under Windows 7. The percentage deviation of the total net revenue from the upper bound (UB) is used as the performance measure. The deviation from UB is 100% × (UB − feasible solution)/UB. The upper bound for each instance was generated by Oguz et al. [6]. Two bounds were generated and the tighter one was selected as the UB. The first bound was calculated by solving the MILP using CPLEX with a time limit of 1 hour. The second bound was generated by solving the LP relaxation of the MILP with valid inequalities. The deviation from the upper bound on the 1500 instances is recorded and compared with that of 5 other algorithms: MILP, m-ATCS, ISFAN, TS, and ABC. MILP, m-ATCS, and ISFAN were proposed by Oguz et al. [6]. m-ATCS refers to the modified apparent tardiness cost rule-based heuristic. ISFAN is the iterative sequence first-accept next procedure based on simulated annealing. TS is the tabu search algorithm proposed by Cesaret et al. [9]. ABC refers to the artificial bee colony based algorithm proposed by Lin and Ying [10]. For small size problems, for example $n = 10$, we can find the optimum by enumerating all the feasible solutions within 2 seconds. However, due to the combinatorial complexity, most of the moderate size and large size problems cannot be solved to optimality by those algorithms. For this reason, the percentage deviations from UB are reported here instead of the number of optimal solutions for each dataset.
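The performance measure can be computed as in the sketch below, which also summarizes a dataset by its minimum, average, and maximum deviation. The function names and the sample numbers are hypothetical and are not results from the paper.

```python
def deviation_from_ub(ub, revenue):
    """Percentage deviation of a feasible solution's revenue from the upper bound."""
    if ub <= 0:
        raise ValueError("upper bound must be positive")
    return 100.0 * (ub - revenue) / ub


def summarize(pairs):
    """Return (min, average, max) deviation over (ub, revenue) pairs."""
    devs = [deviation_from_ub(ub, rev) for ub, rev in pairs]
    return min(devs), sum(devs) / len(devs), max(devs)


print(deviation_from_ub(200, 190))                      # 5.0
print(summarize([(200, 190), (100, 100), (400, 360)]))  # (0.0, 5.0, 10.0)
```

A deviation of 0 means the solution matches the upper bound and is therefore optimal. A positive deviation does not by itself show that the solution is suboptimal, because the bound may not be tight.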
Small size problems are very easy to solve. Tables 1 and 2 illustrate the results for small size problems. Problems with 10 orders can be solved to optimality by MILP and most of them can be solved to optimality by ABC. Our proposed DCGA finds all the optimal solutions for $n = 10$. For problems with 15 orders, the maximum deviations from UB of 2 datasets are improved; the average deviation from UB of 1 dataset is improved.
For moderate size problems, Table 3 shows that for problems with 20 orders, the maximum deviations from upper bound of 4 datasets are improved; the average deviation from upper bound of 1 dataset is improved. The results of problems with 25 orders are given in Table 4. The maximum deviations from upper bound of 4 datasets are improved; the average deviations from upper bound of 2 datasets are improved; and the minimum deviation from upper bound of 1 dataset is improved.
Tables 5 and 6 give the results of large size problems with 50 and 100 orders, respectively. For problems with 50 orders, the maximum deviations from upper bound of 4 datasets are improved; the average deviations from upper bound of 4 datasets are improved; and the minimum deviations from upper bound of 3 datasets are improved. For problems with 100 orders, the maximum deviations from upper bound of 7 datasets are improved, the average deviations from upper bound of 7 datasets are improved, and the minimum deviations from upper bound of 3 datasets are improved.

Conclusions
In this paper, we propose a diversity controlling genetic algorithm for the order acceptance and scheduling problem with tardiness penalties, distinct release dates, and sequence-dependent setup times. To measure the difference between individuals, a modified Hamming distance that considers only the accepted orders, rather than all the genes in the chromosome, was adopted. A diversified population is maintained during the search process through survival selection. Comparison with state-of-the-art algorithms suggests that the proposed diversity controlling mechanism is effective, and the quality of

Figure 1: Illustration of crossover with the same-site-copy-first principle.
An acceptance indicator equal to 1 for $u[i]$ and for $v[i]$ indicates that the orders in the $i$th gene of individuals $u$ and $v$ are accepted, respectively.
Sort individuals in P_s by fitness descending; set P_f = NULL
Initialize a Boolean array needRemove[i] = false, i = 1, ..., |P_s|
For i = 1 to |P_s|
    If needRemove[i] continue
    For j = i + 1 to |P_s|
        If needRemove[j] continue
        If fitness of individual i > fitness of individual j continue
        If diff(indiv i, indiv j) < difference threshold
            needRemove[j] = true; decrease the number of remaining individuals by 1
    If (number of individuals in P_f < size of father population and needRemove[i] = false)
        Add individual i to P_f
While (number of individuals in P_f < size of father population)
    Randomly generate a new individual and add it to P_f
Algorithm 2: Survival selection.

Figure 4: Illustration of population diversity change.

Figure 5: Illustration of change of average number of accepted orders.
Algorithm DCGA
Initialize parent population P_f (N_f individuals) and subpopulation P_s = NULL
While number of iterations < It_max and number of iterations without improvement < It_noimp
    Add the n_e fittest individuals in P_f into P_s
    While the number of individuals in P_s < N_s
        Select a subgroup G (k individuals) without replacement through tournament selection
        Set the fittest individual in G as parent p_1 and the most different individual from p_1 as parent p_2
        Generate offspring from p_1 and p_2 (crossover, mutation)
        Add the generated offspring into P_s
    Select N_f survivors from P_s and replace all the individuals in P_f with them
    Set P_s = NULL
    Perform local search on n_l randomly selected individuals among the 30% fittest individuals in P_f
Return the best feasible solution
Algorithm 1: Diversity controlling genetic algorithm.
Local Search.
To improve the generated solutions, genetic algorithms are usually complemented by local search. Local search can improve the current solution by selecting the best solution in its neighborhood, but it also adds to the computational cost for large size instances. Considering this, we apply two simple but effective local search strategies to improve the solutions generated at each iteration. In the first local search procedure, we successively interchange the positions of two adjacent orders in the sequence and record the best interchange as a new solution. In the second local search procedure, we apply the iterated greedy procedure proposed by Ruiz and Stützle [30].
) = 60%. According to this measurement, it seems that individuals $A$ and $B$ are more similar than individuals $A$ and $C$. But in fact, the accepted orders of individual $A$ (6, 1, 2, 7, 9, 10) and the accepted orders of individual $B$ (6, 7, 1, 2, 9, 10) are less similar than those of individual $A$ and those of individual $C$ (6, 7, 2, 1, 9, 10). We put forward a new difference measurement taking into consideration only the orders that are accepted. It can be defined as diff