A Decomposition-Based Two-Stage Optimization Algorithm for Single Machine Scheduling Problems with Deteriorating Jobs

This paper studies a production scheduling problem with deteriorating jobs, which frequently arises in contemporary manufacturing environments. The objective is to find an optimal sequence of the set of jobs to minimize the total weighted tardiness, which is an indicator of service quality. The problem is NP-hard. When the number of jobs increases, the computational time required by an optimization algorithm to solve the problem increases exponentially. To tackle large-scale problems efficiently, a two-stage method is presented in this paper. We partition the set of jobs into a few subsets by applying a neural network approach and thereby transform the large-scale problem into a series of small-scale problems. Then, we employ an improved metaheuristic algorithm (called GTS), which combines a genetic algorithm with tabu search, to find the solution for each subproblem. Finally, we integrate the obtained sequences for each subset of jobs and produce the final complete solution by enumeration. A fair comparison has been made between the two-stage method and the GTS without decomposition, and the experimental results show that the solution quality of the two-stage method is much better than that of GTS for large-scale problems.


Introduction
The problem of scheduling jobs on a single machine to minimize total weighted tardiness has been studied extensively, and it also occurs as a subproblem in other scheduling environments such as job shops. Classic scheduling models routinely assume that job processing times are known and fixed, an assumption that may not hold in many real-world situations. For example, the deterioration effect refers to the fact that a job may need a longer processing time if its starting time is postponed. This paper considers the single machine total weighted tardiness problem with proportional deterioration, which is common in steel manufacturing, plastic processing, medical treatments, and so forth [1].
Lawler [2] and Lenstra et al. [3] have shown that the total weighted tardiness problem is NP-hard. Congram et al. [4] proposed an effective local search method for this kind of problem. Bozejko et al. [5] presented a fast local search procedure based on a tabu search approach, which employs blocks of jobs and compound moves, and applied it to the total weighted tardiness problem. J. N. D. Gupta and S. K. Gupta [6] and Browne and Yechiali [7] first introduced deterioration into scheduling problems. Recently, there has been growing interest in scheduling problems with deteriorating jobs [8]. Bachman and Janiak [9] considered the problem of minimizing maximum lateness under linear deterioration, which is NP-hard, and presented two heuristic algorithms. Bachman et al. [10] dealt with the single machine scheduling problem with start time dependent job processing times. Cheng et al. [11] introduced a class of machine scheduling problems in which the processing time of a task depends on its starting time in a schedule. Hsu and Lin [12] designed a branch-and-bound algorithm that derives exact solutions, using several properties concerning dominance relations and lower bounds, for the single machine problem with deteriorating jobs to minimize the maximum lateness. Wang et al. [13] considered single machine scheduling problems in which the processing time of a job is an increasing function of its starting time and the jobs are related by a series-parallel graph. More recent contributions in this line of research, in which integrated scheduling problems with deteriorating jobs are studied, can be found in [14-18].
In terms of general single machine scheduling, many different solution approaches have been proposed in the existing literature. Sels and Vanhoucke [19] developed a hybrid dual-population genetic algorithm for the single machine maximum lateness problem, which took some specific characteristics into account. Their work has important implications for the balance between intensification and diversification in the design of search algorithms. Voutsinas and Pappis [20] proposed a branch-and-bound algorithm, which uses the suboptimal solution of a heuristic as the initial solution, for solving the single machine scheduling problem with deteriorating jobs. Jolai et al. [21] focused on the bicriteria scheduling problem of minimizing the number of tardy jobs and maximum earliness for single machine scheduling without allowing idle times. Pakzad-Moghaddam et al. [22] presented a mixed-integer mathematical programming model for a single machine scheduling problem with deteriorating and learning effects. Xu et al. [23] solved the single machine scheduling problem with sequence-dependent setup times and conducted a systematic comparison of hybrid evolutionary algorithms (HEAs), which independently used the six combinations of three crossover operators and two population updating strategies. More recent works that deal with advanced single machine scheduling problems can be found in [24-28].
This paper considers the single machine total weighted tardiness problem with proportional deterioration. Our objective is to minimize the total weighted tardiness of jobs. The problem belongs to the class of NP-hard problems. As the number of jobs increases, the time an algorithm consumes to solve the problem increases exponentially. Hence, a two-stage method based on decomposition is presented. First, we partition the jobs using a neural network to transform the large-scale problem into smaller-scale subproblems. Then, we find the solution for each subproblem by a hybrid metaheuristic algorithm. Specifically, we employ an improved genetic algorithm, called genetic tabu search (GTS), which uses tabu search (TS) as the mutation operator to strengthen the local search. The genetic algorithm (GA) is an optimization method based on the principles of genetics and natural selection. However, it is weak in local search and prone to premature convergence, so we combine it with a tabu search approach to enhance the algorithm. Finally, we combine the obtained sequences for each partition of jobs and give the final solution by means of enumeration.
This paper is organized as follows. In Section 2, the basic definitions and notations of the problem are presented. Section 3 describes the detailed algorithms in the two-stage method, including the neural network, the genetic algorithm, and tabu search. Computational results are shown in Section 4. Section 5 presents the conclusion.

Problem Description
We consider the problem of scheduling a set of $n$ jobs on a single machine with the following assumptions:
(i) All jobs are ready at time zero.
(ii) The machine can handle only one job at a time.
(iii) There are no precedence relations between jobs and preemption is not allowed.
(iv) Setup times are not explicitly considered.
(v) The basic processing times, the latest starting times, and the deterioration factors are known.
The single machine total weighted tardiness problem addressed in this paper is affected by the deterioration effect, a phenomenon commonly observed in practice. For example, in steel-making, when the temperature of an ingot drops below a specified level, it must be heated again to the temperature required for rolling.
The problem can be described as follows. There is a set $N = \{1, 2, \ldots, n\}$ of $n$ jobs (index $j$, $j = 1, 2, \ldots, n$) that have to be scheduled on a single machine. Each job $j$ is characterized by a basic processing time $a_j$, a latest starting time $h_j$, a weight $w_j$, and a due date $d_j$. For a given sequence of jobs, the (earliest) completion time $C_j$, tardiness $T_j = \max\{0, C_j - d_j\}$, and cost $f_j(C_j) = w_j T_j$ of job $j \in N$ can be computed. The objective is to find a job sequence that minimizes the total weighted tardiness $\sum_{j=1}^{n} f_j(C_j) = \sum_{j=1}^{n} w_j T_j$ with respect to the given due dates.
Besides, the actual processing time of each job is a nondecreasing function of its starting time. When the starting time $s_j$ of job $j$ is earlier than or equal to the given latest starting time $h_j$, the actual processing time $p_j$ equals the basic processing time $a_j$. Otherwise, $p_j$ is a function of $s_j$ and of the deterioration factor $b_j$.
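As a minimal sketch of how a given sequence can be evaluated, the following Python function accumulates completion times and weighted tardiness for jobs processed back to back from time zero. The deterioration rule is passed in as a callable, because its exact form is not assumed here; the linear rule `linear_rule`, the sample data, and all function and parameter names are illustrative assumptions rather than the paper's definition.

```python
from typing import Callable, Sequence


def total_weighted_tardiness(
    sequence: Sequence[int],
    basic: Sequence[float],
    latest_start: Sequence[float],
    weight: Sequence[float],
    due: Sequence[float],
    deteriorated_time: Callable[[int, float], float],
) -> float:
    """Sum of w_j * max(0, C_j - d_j) for jobs processed without idle time."""
    clock = 0.0
    total = 0.0
    for j in sequence:
        start = clock
        if start <= latest_start[j]:
            duration = basic[j]  # started in time: basic processing time
        else:
            duration = deteriorated_time(j, start)  # late start: assumed rule
        clock = start + duration  # completion time C_j
        total += weight[j] * max(0.0, clock - due[j])
    return total


# Hypothetical instance with two jobs (0-based indices).
basic = [2.0, 3.0]
latest_start = [0.0, 1.0]
weight = [1.0, 2.0]
due = [2.0, 4.0]
factor = [0.5, 0.5]


def linear_rule(j: int, start: float) -> float:
    # ASSUMPTION: the processing time grows linearly with the delay
    # beyond the latest starting time; the paper's rule may differ.
    return basic[j] + factor[j] * (start - latest_start[j])


cost_01 = total_weighted_tardiness([0, 1], basic, latest_start, weight, due, linear_rule)
cost_10 = total_weighted_tardiness([1, 0], basic, latest_start, weight, due, linear_rule)
```

Under the assumed linear rule, the two orders of this toy instance give different costs, which illustrates why the sequence matters even for two jobs.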
The notations used throughout this paper are summarized in the Notation section.

A Two-Stage Solution Method Based on Decomposition

3.1. Methodology Overview.
To tackle complex optimization problems efficiently, the utilization of problem-specific properties is highly important in the design of optimization algorithms. In the problem studied in this paper, each job has several features, including a basic processing time $a_j$, a latest starting time $h_j$, a weight $w_j$, a due date $d_j$, and a deterioration factor $b_j$. The distributions of these data tend to have certain characteristics that can be extracted and used as special information to guide the search behavior of optimization algorithms. To improve the time efficiency of the optimization process, we transform the original large-scale scheduling problem into several small-scale scheduling problems based on the features of each job using a backpropagation (BP) neural network. First, we obtain the optimal solutions of some small-scale problems for training the neural network. After establishing the neural network, which can roughly predict the position of each job, we divide the original set of jobs into several subsets, each corresponding to a scheduling subproblem. For example, if we are to have 4 subproblems, then we could use the neural network to predict which jobs will belong to the first 1/4 of the final schedule (and thus will be placed into the first subset), which will belong to the next 1/4 (and thus will be placed into the second subset), and so forth. Then, we use the proposed GTS to obtain the solution for each subproblem and combine them to yield the final solution.

(1) Input: the set of jobs
(2) Output: the best job sequence found so far
(3) Population Initialization
(4) Repeat
(5)   Randomly choose two individuals from the current population
(6)   Crossover
(7)   Mutation (tabu search)
(8)   Fitness Evaluation
(9)   Selection
(10) Until the stop criterion is met
Algorithm 1: Pseudocode of the GTS.
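The partition step described above can be sketched as follows, assuming that a classifier has already produced one predicted subset index per job. The function name `partition_jobs` and the sample labels are hypothetical and only illustrate the grouping.

```python
from typing import List, Sequence


def partition_jobs(predicted_subset: Sequence[int], num_subsets: int) -> List[List[int]]:
    """Group job indices by the subset index predicted for each job."""
    subsets: List[List[int]] = [[] for _ in range(num_subsets)]
    for job, label in enumerate(predicted_subset):
        if not 0 <= label < num_subsets:
            raise ValueError(f"label {label} of job {job} is out of range")
        subsets[label].append(job)
    return subsets


# Hypothetical predictions for 8 jobs and 4 subproblems:
# label 0 means "first quarter of the final schedule", and so on.
labels = [0, 3, 1, 0, 2, 3, 1, 2]
subproblems = partition_jobs(labels, 4)
```

Each inner list is then one scheduling subproblem. Nothing in this sketch forces the subsets to have equal size; that depends on the predictions.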

The First Stage.
The BP neural network is a typical artificial neural network that can approximate complex nonlinear mapping functions. It has been widely adopted in application areas such as classification, fitting, and compression. A typical BP neural network consists of three layers: an input layer, a hidden layer, and an output layer. The number of nodes in the input layer depends on the dimension of the input vector. In this paper, each job has five features: a basic processing time $a_j$, a latest starting time $h_j$, a weight $w_j$, a due date $d_j$, and a deterioration factor $b_j$. Generally, a single hidden layer can realize an arbitrary nonlinear mapping if the number of neuron nodes is increased appropriately. The number of nodes in the hidden layer affects the accuracy of the neural network, the training results, and the training time. As the number of nodes increases, the training results improve while the training time increases correspondingly. The number of neurons in the output layer is decided according to the practical problem; in this study, it equals the number of subproblems into which the original problem is divided. The quasi-Newton method, a fast optimization technique based on Taylor series expansion, is applied to train the neural network.
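To make the three-layer structure concrete, here is a small sketch of a forward pass with five inputs, one hidden layer, and one output per subproblem. The tanh and softmax activations, the hidden layer size of 8, the random weights, and the scaled feature values are all assumptions made for illustration; the sketch does not include training.

```python
import math
import random
from typing import List, Sequence


def random_layer(rows: int, cols: int, rng: random.Random) -> List[List[float]]:
    """Weight matrix with small random entries (illustrative initialization)."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def forward(
    features: Sequence[float],
    w_hidden: List[List[float]],
    b_hidden: List[float],
    w_out: List[List[float]],
    b_out: List[float],
) -> List[float]:
    """Return one probability per subset for a single job."""
    hidden = [
        math.tanh(sum(w * x for w, x in zip(row, features)) + b)
        for row, b in zip(w_hidden, b_hidden)
    ]
    logits = [
        sum(w * h for w, h in zip(row, hidden)) + b
        for row, b in zip(w_out, b_out)
    ]
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


rng = random.Random(1)
num_features, num_hidden, num_subsets = 5, 8, 4
w_hidden = random_layer(num_hidden, num_features, rng)
b_hidden = [0.0] * num_hidden
w_out = random_layer(num_subsets, num_hidden, rng)
b_out = [0.0] * num_subsets

# Hypothetical job features, scaled to comparable ranges beforehand.
job_features = [0.35, 0.12, 0.30, 0.40, 0.40]
probabilities = forward(job_features, w_hidden, b_hidden, w_out, b_out)
predicted_subset = max(range(num_subsets), key=probabilities.__getitem__)
```

The job would be assigned to the subset with the largest output value. With untrained random weights the prediction is meaningless; it only shows the data flow through the layers.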
The network is trained on sample data with known outputs derived from small-scale problem instances. Each sample consists of the features of a job listed above and the sequential index of the subset it belongs to. The data are split into two parts: the training data for training the neural network and the test data for testing the correctness of the classification.
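One way to derive such labeled samples from a known sequence of a small instance is sketched below. The rule that maps a position to a subset index (equal-sized consecutive blocks) and the 80/20 split are assumptions for illustration; the helper names are hypothetical.

```python
import random
from typing import Dict, List, Sequence, Tuple


def subset_labels(known_sequence: Sequence[int], num_subsets: int) -> Dict[int, int]:
    """Label each job with the index of the consecutive block it falls into."""
    n = len(known_sequence)
    return {
        job: position * num_subsets // n
        for position, job in enumerate(known_sequence)
    }


def split_samples(samples: List, test_fraction: float, rng: random.Random) -> Tuple[List, List]:
    """Shuffle a copy of the samples and split it into training and test parts."""
    shuffled = samples[:]
    rng.shuffle(shuffled)
    cut = int(round(len(shuffled) * (1.0 - test_fraction)))
    return shuffled[:cut], shuffled[cut:]


# A known sequence of 10 jobs, split into 2 subsets of 5 jobs each.
sequence = [3, 7, 1, 10, 4, 8, 2, 9, 6, 5]
labels = subset_labels(sequence, 2)
train, test = split_samples(sorted(labels.items()), 0.2, random.Random(0))
```

Here jobs 3, 7, 1, 10, and 4 receive label 0 and the remaining jobs receive label 1. In a full pipeline each sample would also carry the five job features.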

The Second Stage.
Although GA can be directly applied to complex combinatorial optimization problems, each generation of the algorithm must maintain a large population size. With the expansion of the problem size, the computational time needed will increase dramatically. Besides, GA usually converges prematurely, which is mainly caused by a lack of diversity in the population. In addition, the mutation operator is inadequate for a systematic local search. Compared with GA, TS has a faster convergence rate. However, the search performance of TS greatly depends on the initial solution.
Population-based GA and single-trajectory TS have complementary characteristics. GA explores the search space well, while TS intensifies the search in promising regions. According to the strengths and weaknesses of these two algorithms, we apply TS to replace the mutation operator in GA.
We present the general framework of the GTS in Algorithm 1. GTS starts from an initial random population (line 3). Then, the crossover operator is employed to generate new offspring solutions (line 6). Besides, the mutation operator is implemented with TS to enhance the local search performance (line 7). Subsequently, the population updating rule decides whether such a mutated solution should be inserted into the population and which existing individual should be replaced (line 9).
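The framework can be sketched as a short loop in which crossover, the tabu search based mutation, and the cost function are supplied as callables. The replace-the-worst updating rule, the population size, and all names are assumptions made for this sketch; the stall limit of 50 generations and the time limit follow the stopping condition stated in the paper.

```python
import random
import time
from typing import Callable, List, Optional, Tuple

Sequence_ = List[int]


def gts(
    jobs: Sequence_,
    cost: Callable[[Sequence_], float],
    crossover: Callable[[Sequence_, Sequence_, random.Random], Sequence_],
    local_search: Callable[[Sequence_], Sequence_],
    pop_size: int = 20,
    max_stall: int = 50,
    time_limit: float = 1.0,
    rng: Optional[random.Random] = None,
) -> Tuple[Sequence_, float]:
    rng = rng or random.Random(0)
    # Random initial population of permutations.
    population = [rng.sample(jobs, len(jobs)) for _ in range(pop_size)]
    best = min(population, key=cost)
    best_cost = cost(best)
    stall = 0
    deadline = time.monotonic() + time_limit
    while stall < max_stall and time.monotonic() < deadline:
        parent1, parent2 = rng.sample(population, 2)
        child = crossover(parent1, parent2, rng)
        child = local_search(child)  # tabu search plays the role of mutation
        child_cost = cost(child)
        # ASSUMPTION: the child replaces the worst individual if it is better.
        worst = max(range(pop_size), key=lambda i: cost(population[i]))
        if child_cost < cost(population[worst]):
            population[worst] = child
        if child_cost < best_cost:
            best, best_cost, stall = child, child_cost, 0
        else:
            stall += 1
    return best, best_cost
```

Passing the operators as arguments keeps the loop independent of the particular crossover and neighborhood, so each component can be tested on its own.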

Encoding and Decoding.
A chromosome is represented by a sequence of jobs, and each gene is the index of a job. For example, the chromosome (3, 7, 1, 10, 4, 8, 2, 9, 6, 5) means that job 3 is processed first, job 7 comes second, and so on.

Initial Population.
Usually, we use specific heuristic rules combined with random methods to generate the initial population.However, since in this study the GTS is only used to solve each subproblem with limited size, the individuals of the initial population are generated randomly to encourage solution diversity.
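A random initial population of permutation chromosomes can be generated as in the following sketch. The function name, the population size, and the seed are illustrative choices, not values taken from the paper.

```python
import random
from typing import List, Sequence


def random_population(jobs: Sequence[int], size: int, rng: random.Random) -> List[List[int]]:
    """Return `size` independent random permutations of the job indices."""
    job_list = list(jobs)
    return [rng.sample(job_list, len(job_list)) for _ in range(size)]


# Ten jobs numbered 1..10, as in the chromosome example above.
population = random_population(range(1, 11), size=5, rng=random.Random(42))
```

Every individual contains each job exactly once, so no repair step is needed after initialization.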

Fitness and Selection.
Here the fitness is calculated directly from the total weighted tardiness. Offspring are selected based on their fitness: an individual with better fitness is selected with higher probability.
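The paper does not state the exact selection rule, so the following sketch uses one common choice as an assumption: roulette wheel selection with weight 1 / (1 + cost), which favors individuals with lower total weighted tardiness. All names are hypothetical.

```python
import random
from typing import Callable, List, Sequence


def selection_weights(costs: Sequence[float]) -> List[float]:
    """Lower cost gives a larger weight; costs are assumed to be nonnegative."""
    return [1.0 / (1.0 + c) for c in costs]


def select(
    population: List[List[int]],
    cost: Callable[[List[int]], float],
    rng: random.Random,
    count: int = 2,
) -> List[List[int]]:
    """Draw `count` individuals with probability proportional to their weight."""
    weights = selection_weights([cost(ind) for ind in population])
    return rng.choices(population, weights=weights, k=count)


# Toy population in which the cost is simply the first gene.
toy_population = [[0, 1], [99, 1]]
draws = select(toy_population, lambda ind: float(ind[0]), random.Random(7), count=1000)
share_of_better = sum(1 for ind in draws if ind[0] == 0) / len(draws)
```

Because total weighted tardiness is never negative, the weights stay positive and finite, including for a schedule with zero tardiness.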

Crossover Operator.
Crossover is used to generate new offspring by recombining the selected parents, so it is the key operator in GA. There are several crossover methods for combinatorial problems, including linear order crossover (LOX), position-based crossover (PBX), and partially mapped crossover (PMX). We employ the LOX method in GTS.
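A minimal sketch of LOX is given below, in its usual form: a segment of the first parent is kept at its original positions, and the remaining positions are filled from left to right with the missing jobs in the order in which they appear in the second parent. The function names and the way the cut points are drawn are assumptions for illustration.

```python
import random
from typing import List, Sequence, Tuple


def lox(parent1: Sequence[int], parent2: Sequence[int], cut1: int, cut2: int) -> List[int]:
    """Linear order crossover keeping parent1[cut1:cut2] in place."""
    kept = set(parent1[cut1:cut2])
    filler = iter(gene for gene in parent2 if gene not in kept)
    child = []
    for position in range(len(parent1)):
        if cut1 <= position < cut2:
            child.append(parent1[position])  # inherited segment
        else:
            child.append(next(filler))  # next unused gene of parent2
    return child


def random_cuts(n: int, rng: random.Random) -> Tuple[int, int]:
    """Two distinct cut points with 0 <= cut1 < cut2 <= n."""
    cut1, cut2 = sorted(rng.sample(range(n + 1), 2))
    return cut1, cut2


p1 = [1, 2, 3, 4, 5, 6, 7, 8, 9]
p2 = [9, 8, 7, 6, 5, 4, 3, 2, 1]
child = lox(p1, p2, 3, 6)  # keeps the segment 4, 5, 6 of p1
```

In this example the child is (9, 8, 7, 4, 5, 6, 3, 2, 1): the middle segment comes from the first parent and the rest preserves the relative order of the second parent.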

Tabu Search Based Mutation.
The mutation operator is reimplemented with a tabu search algorithm. Here, we take the solution obtained by GA as the initial solution for the TS. The neighborhood structure is defined by the swap move: a neighborhood solution is obtained by choosing two different genes randomly and exchanging their positions. The two genes that have been exchanged are then recorded as an element of the tabu list. Besides, when the new solution is better than the best-so-far solution, we accept it regardless of whether the move exists in the tabu list. The procedure of the tabu search is presented in Algorithm 2.
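The swap neighborhood, the tabu list of exchanged gene pairs, and the aspiration rule can be sketched as follows. The number of sampled neighbors per iteration, the iteration limit, the default tabu length, and all names are assumptions for this sketch and are not taken from Algorithm 2.

```python
import random
from typing import Callable, Dict, FrozenSet, List, Optional, Tuple


def tabu_search(
    start: List[int],
    cost: Callable[[List[int]], float],
    tabu_length: int = 5,
    iterations: int = 100,
    sample_size: int = 20,
    rng: Optional[random.Random] = None,
) -> Tuple[List[int], float]:
    rng = rng or random.Random(0)
    current = list(start)
    best, best_cost = list(current), cost(current)
    # Maps an exchanged pair of genes to the last iteration in which it is tabu.
    tabu: Dict[FrozenSet[int], int] = {}
    n = len(current)
    for it in range(iterations):
        candidates = []
        for _ in range(sample_size):
            i, k = rng.sample(range(n), 2)  # two different positions
            neighbor = list(current)
            neighbor[i], neighbor[k] = neighbor[k], neighbor[i]
            move = frozenset((current[i], current[k]))
            value = cost(neighbor)
            is_tabu = tabu.get(move, -1) >= it
            # Aspiration: a tabu move is allowed only if it beats the best so far.
            if is_tabu and value >= best_cost:
                continue
            candidates.append((value, neighbor, move))
        if not candidates:
            continue
        # Move to the best admissible neighbor, even if it is worse than current.
        value, neighbor, move = min(candidates, key=lambda c: c[0])
        current = neighbor
        tabu[move] = it + tabu_length
        if value < best_cost:
            best, best_cost = list(neighbor), value
    return best, best_cost
```

Accepting the best admissible neighbor even when it is worse than the current solution is what lets the search leave a local optimum, while the tabu list prevents it from immediately undoing the move.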
The fundamental idea underlying tabu search is to avoid repeated search in the same area of the solution space. To this end, tabu search involves some essential concepts such as the tabu list, the tabu length, and the aspiration criterion. To give a clearer description of the tabu search module, we provide a specific example as follows.
Suppose that we are dealing with a sequencing problem in which a solution is represented by a permutation of $\{1, 2, \ldots, 7\}$. The initial solution is (2, 5, 7, 3, 4, 6, 1), with an objective value of 10. The tabu list is initialized as an empty set (i.e., $T = \emptyset$). The tabu length is set as 3. The neighborhood solutions are generated by the swap operator, which exchanges two elements in the current permutation. In the first iteration, the 5 best candidate solutions in the neighborhood are first identified (they are listed in Figure 1 in terms of the elements to be swapped). The best one, (5, 4), is selected, and the new solution is obtained by swapping 5 and 4. Now, the pair (5, 4) is put into the tabu list (i.e., $T = \{(5, 4; 3)\}$), which means the swap of 5 and 4 is not allowed in the subsequent 3 iterations (unless it leads to a new best solution). In the second iteration, the current solution has an objective value of 4, and, also, the 5 best candidate solutions are identified. The best one, (3, 1), is chosen, and then the pair (3, 1) must be put into the tabu list. The updated tabu list is therefore $T = \{(5, 4; 2), (3, 1; 3)\}$. In the third iteration, it is found that all the neighborhood solutions are inferior to the current solution (with an objective value of 2). However, according to the tabu search principles, the best solution that is not tabu should be selected as the new solution. Therefore, (2, 4) is

Computational Experiments
Because the problem has not been investigated in the literature, there is no standard data set available. Nevertheless, the literature on the single machine scheduling problem with deteriorating jobs, which does not consider job weights, provides us with several data generation techniques. The basic processing times $a_j$ are uniformly distributed integers between 1 and 100. The deterioration factors are uniformly distributed in the range (0, 1), and the latest starting times are generated as $30n \times U(0, 1)$, where $n$ is the number of jobs. The parameters are summarized in Table 1.
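The three generation rules stated above can be sketched as follows. Reading the latest starting time rule as 30 times the number of jobs times a uniform random number is an assumption, and weights and due dates are left out because their rules are given only in Table 1. The function and key names are hypothetical.

```python
import random
from typing import Dict, List


def generate_instance(n: int, rng: random.Random) -> Dict[str, List[float]]:
    """Random job data for the three parameters described in the text."""
    return {
        # Uniformly distributed integers between 1 and 100.
        "basic": [rng.randint(1, 100) for _ in range(n)],
        # Uniform on [0, 1); the endpoint 0 has negligible probability.
        "factor": [rng.random() for _ in range(n)],
        # ASSUMPTION: latest starting time = 30 * n * U(0, 1).
        "latest_start": [30 * n * rng.random() for _ in range(n)],
    }


instance = generate_instance(100, random.Random(2024))
```

Fixing the seed of the generator makes an instance reproducible, which helps when the same instances are reused across parameter settings.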
In order to refine the two-stage method, the number of subproblems, the tabu length, and the mutation probability are fine-tuned by computational tests. In each cycle, a single parameter is tuned while the other parameters are set to their recommended values.

Number of Subproblems.
The number of subproblems that the original problem is divided into can affect the solution quality, because the optimization accuracy will be decreased if the number is too large, and there will be little effect on reducing the computational complexity if the number is too small.For convenience, the number of subproblems is tested at integer values which exactly divide the number of jobs.Here, ten different instances with 100 jobs are tested, and the number of subproblems is set as 2, 4, and 5, respectively.Each test is repeated 30 times to reduce random errors.
[Figure 1: candidate solutions of the tabu search example, listed by the elements to be swapped.]

We use ten randomly generated instances of 100 jobs and divide the jobs into 2 subsets (each with 50 jobs), 4 subsets (each with 25 jobs), and 5 subsets (each with 20 jobs) in the three experiments. We obtain the solution for each subproblem and then combine the solutions to obtain the final result. The same computational time is allowed for each instance; here we set the time limit to 1 minute. Based on the averaged total weighted tardiness, we can judge the best number of subproblems. An example under the setting of 2 subproblems is described as follows. First, we divide the given jobs into 2 subsets of 50 jobs and record the top 20 solutions for each subproblem after applying the GTS within the time limit of 30 seconds. Next, we combine the solutions by a full enumeration. A total of 400 (20 × 20) combinations are considered and the best one is selected. We assume the time needed to evaluate the combined solutions is negligible compared to the time consumed by GTS.
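The enumeration step can be sketched as follows: one candidate sequence is picked per subset, the picks are concatenated in subset order, and the complete sequence is evaluated. The function name and the toy cost function are assumptions for illustration.

```python
import itertools
from typing import Callable, List, Optional, Sequence, Tuple


def combine_by_enumeration(
    candidates_per_subset: Sequence[Sequence[Sequence[int]]],
    cost: Callable[[List[int]], float],
) -> Tuple[Optional[List[int]], float]:
    """Try every combination of one candidate per subset and keep the best."""
    best_sequence: Optional[List[int]] = None
    best_cost = float("inf")
    for choice in itertools.product(*candidates_per_subset):
        # Concatenate the chosen subsequences in subset order.
        sequence = [job for part in choice for job in part]
        value = cost(sequence)
        if value < best_cost:
            best_sequence, best_cost = sequence, value
    return best_sequence, best_cost


# Toy example: 2 subsets with 2 candidate sequences each (4 combinations).
candidates = [
    [[1, 0], [0, 1]],
    [[2, 3], [3, 2]],
]
toy_cost = lambda seq: float(sum(abs(v - i) for i, v in enumerate(seq)))
best_sequence, best_cost = combine_by_enumeration(candidates, toy_cost)
```

The complete sequence is evaluated as a whole because, with deteriorating jobs, the starting times in a later subset depend on the sequence chosen for the earlier subsets. With 20 candidates for each of 2 subsets, the loop visits the 400 combinations mentioned above.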
In Table 2, the test results are given in terms of the averaged total weighted tardiness.Overall, when the number of subproblems is set to 4, the final solution quality is the best.

Tabu Length.
In the TS module, the tabu length has a great influence on the search speed and the solution quality. If the tabu list is too short, repeated search may frequently happen. On the contrary, it will exclude the search
It is obvious from Table 3 that the results under tabu length 5 are much better than the others. Consequently, the tabu length is set to 5.
Table 5 shows the computational results of the two-stage method and GTS in terms of the mean objective values.
It is noted that when the number of jobs is small, the two-stage method sometimes does not perform as well as GTS. For example, among the 10 instances of 100 jobs, there are 3 instances for which the proposed method performs worse than the GTS without job partition. This is because the relative positions of some jobs are fixed after the partition, and thus the search domain of the optimization algorithm is limited. The partition of jobs is conducted based on the selected features of the jobs and is used to guide the search algorithm. However, to some extent, it limits the search capability of the algorithm in some directions. Thus, we believe that the effects of job partition can be divided into two parts: a positive guiding effect and a negative limiting effect. When the number of jobs is small, the positive guiding effect may have less influence than the negative effect of limiting the search. However, as the number of jobs increases, the positive effect becomes dominant, and the two-stage method outperforms the GTS considerably, which reveals the valuable contribution of the job partition methodology.

Conclusion
This paper studies the single machine total weighted tardiness scheduling problem with deterioration. Deterioration is related to the processing time and starting time of jobs. When the starting time of a job is later than its given latest starting time, the processing time will increase. Such deterioration is common in most just-in-time production environments with human participation.
We propose a hybrid algorithm called GTS, which combines GA and TS, to solve the problem. In order to tackle large-scale instances, we conduct job partition by means of a neural network approach and then find the solution for each partition by GTS. Finally, we combine the job sequences for each partition and give the final solution by enumeration. We call the GTS with partition the two-stage method. To examine whether the two-stage method performs better than GTS for large-scale problems, we use 40 instances for comparison. The average improvement rate is considerable, as revealed by the computational results.

(5) ... and update the tabu list
Until the stop criterion is met
Algorithm 2: Pseudocode of tabu search.

Notation
$n$: Number of jobs
$J_j$: Job $j$, $j = 1, 2, \ldots, n$
$p_j$: Actual processing time of $J_j$
$a_j$: Basic processing time of $J_j$
$s_j$: Starting time of $J_j$
$h_j$: Latest starting time for $J_j$
$w_j$: Weight of $J_j$
$d_j$: Due date for $J_j$
$C_j$: Completion time of $J_j$
$b_j$: Deterioration factor of $J_j$.

Table 1: Data generation rules.
3.3.6. Stopping Condition.
GTS is terminated if the best-so-far solution does not change for 50 consecutive generations or the given execution time is exhausted, whichever comes first.

Table 2: Comparison among different numbers of subproblems.

Table 3: Comparison among different tabu lengths.

Table 4: Comparison among different crossover probabilities.

The improvement rate is computed as $(F_{\mathrm{GTS}} - F_{\text{2-S}})/F_{\text{2-S}}$, where $F_{\mathrm{GTS}}$ represents the objective value achieved by GTS alone (without decomposition), while $F_{\text{2-S}}$ denotes the solution value obtained by the proposed two-stage algorithm. The BP neural network was implemented under Matlab R2013a, and GTS was coded in C++ under Microsoft Visual C++ 2010 on an Intel Core i3 CPU and Windows 7 platform.
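The ratio between the two objective values can be evaluated with a one-line helper, shown here as a sketch. The function name and the guard against a zero denominator are assumptions added for illustration.

```python
def improvement_rate(f_gts: float, f_two_stage: float) -> float:
    """Relative gap (F_GTS - F_2S) / F_2S between GTS alone and the two-stage method."""
    if f_two_stage == 0:
        raise ValueError("undefined when the two-stage objective value is zero")
    return (f_gts - f_two_stage) / f_two_stage


# Hypothetical objective values, not results from the paper.
rate = improvement_rate(120.0, 100.0)
```

A positive value means that the two-stage method found a lower total weighted tardiness than GTS alone, and a negative value means the opposite.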