Hybrid Differential Evolution Optimisation for Earth Observation Satellite Scheduling with Time-Dependent Earliness-Tardiness Penalties

We study the order acceptance and scheduling (OAS) problem with time-dependent earliness-tardiness penalties in a single agile earth observation satellite environment where orders are defined by their release dates, available processing time windows ranging from earliest start date to deadline, processing times, due dates, sequence-dependent setup times, and revenues. The objective is to maximise total revenue, where the revenue from an order is a piecewise linear function of its earliness and tardiness with reference to its due date. We formulate this problem as a mixed integer linear programming model and develop a novel hybrid differential evolution (DE) algorithm under self-adaptation framework to solve this problem. Compared with classical DE, hybrid DE employs two mutation operations, scaling factor adaptation and crossover probability adaptation. Computational tests indicate that the proposed algorithm outperforms classical DE in addition to two other variants of DE.


Introduction
Over the last decade, the order acceptance and scheduling (OAS) problem has been widely applied in make-to-order systems.In the existing literature, an order completed past the due date incurs a penalty.There is no reward or penalty for order completion before the promised date, for example, the printing and laminating company described by Oǧuz et al. [1].
The just-in-time philosophy, which is attracting more attention currently, refers to the opinion that earliness, in addition to tardiness, should be discouraged.It has promoted the study of scheduling problems in which orders are preferred to be completed just at their respective due dates, and both early and tardy products are penalised.We indeed need to consider the penalty for the early completion of orders in an OAS system for some cases, which is the motivation in Section 2.
A variety of OAS problems under different settings and with various objectives have been reviewed by Slotnick [2].The problem is classified into categories in terms of problem characteristics, objectives, and solution methods.There are also some papers to study a single machine scheduling problem with the objective of minimising the total tardiness [3,4].In the existing literature, we focus on reviewing studies on the OAS problem with time-related penalties and sequence-dependent setup times.
For the single machine total tardiness problem with sequence-dependent setup times, Anghinolfi and Paolucci [5] proposed a novel discrete particle swarm optimisation (DPSO) approach to solve this nondeterministic polynomial time-(NP-) hard problem, and Tasgetiren et al. [6] provided a discrete differential evolution (DE) algorithm for the same problem.Oǧuz et al. [1] studied the OAS problem, where orders were defined by their release dates, processing times, due dates, deadlines, sequence-dependent setup times, revenues, and tardiness penalty weights.The authors assumed that the revenue of each accepted order is a function of its tardiness and deadline.They provided a mixed integer linear programming (MILP) formulation of the problem and developed three heuristic algorithms to solve this problem.Cesaret et al. [7] studied the OAS problem with the same setting as Oǧuz et al. [1] and developed a tabu search algorithm.The proposed tabu search outperforms the three heuristics proposed by Oǧuz et al. [1] in terms of solution quality.For the same problem setting, Lin and Ying [8] and Cheng et al. [9] developed an artificial bee colony algorithm and improved genetic algorithm.The deviation from the upper bound of the benchmark instances was further improved by the two algorithms.
There are some studies on the scheduling problem with earliness/tardiness penalties in a single machine environment.Hassin and Shani [10] studied several scheduling problems.They considered earliness and tardiness (E/T) penalties and developed polynomial time algorithms for special cases of these problems with nonexecution penalties.Chen et al. [11] formulated a mathematical model for production scheduling subject to multiple constraints, including sequence-dependent setup costs, nonexecution penalties, and E/T penalties.The authors considered the E/T penalty weight as a special value (equal to one) to calculate.Baker [12] assumed that processing times follow normal distributions and due dates are decisions.The author developed a branch and bound algorithm to determine optimal solutions to minimise the total expected E/T costs and tested some heuristic procedures.Shabtay [13] analysed a model that integrates due date assignment and scheduling decisions.The objective was to minimise the total weighted earliness, tardiness, and due date assignment penalties.
To the best of our knowledge, the tardiness penalty and earliness penalty are seldom jointly considered for OAS problems in current research.In this study, we simultaneously consider the tardiness penalty and earliness penalty for the OAS problem with a sequence-dependent setup time on a single agile earth observation satellite.Inspired by agile earth observation satellite scheduling, we formulated a general MILP model for this problem.The formulated general MILP model can be also applied in manufacturing system, logistics management, and make-to-order systems, as oversubscribed scheduling problem also exists in these fields.
The remainder of the paper is organised as follows.We illustrate the motivation for this paper in Section 2. In Section 3, we formally define the OAS problem with time-dependent earliness-tardiness penalties.In Section 4, we present the mathematical model for this problem.In Section 5, we provide details for the proposed algorithm for solving the problem.In Section 6, we present the experimental studies.In Section 7, we provide the conclusion.

Motivation
With the expansion of earth observation satellites (EOSs) application, such as earth resource exploration, forest fire warning, volcano eruption discovery, and earthquake surveillance, EOSs become more limited and expensive relative to numerous orders submitted by users from different departments.The satellite operation firms that can operate on a make-to-order basis realize benefits in terms of timely image availability and decreased risk to imaging randomness.On the other hand, limitations on satellite resource force these firms to be selective on user orders.Users giving orders typically quote a due image quality and may penalise the firm for not-good quality image.This translates into reduced revenue and even loss of users.
Our motivation originates from the agile earth observation satellite (AEOS) scheduling problem.Compared with the traditional earth observation satellite, the AEOS is mobile along three axes (roll, pitch, and yaw).This mobility considerably increases the length of the time windows for imaging targets.The pitch angle required for the satellite to image the target changes in the time window.When the satellite flies over the target, the pitch angle required is zero and the best quality image is obtained.If the target is observed earlier or later than this time, a relatively larger pitch angle is required and a deterioration of image quality is incurred, as illustrated in Figure 1, which is partially taken from Lemaître et al. [14].
Consequently, the order processing time window for this target is from the earliest start time to the latest end time.Earth observation satellites are scarce and oversubscribed resources.A satellite can at most image one target at any time moment and cannot image all the user-requested targets because of its limited capacity and ability.When imaging targets in reality, the pitch angles required always differ from one target to another; therefore, the setup times depend on the imaging sequence of the targets.Thus, simultaneous decisions have to be made to choose which targets are to be imaged and fix the appropriate start time of imaging.We formulate this problem as an OAS problem, considering the time-dependent earliness and tardiness penalties with the objective of maximising total revenue.In the remainder of this paper, we refer to this problem as the OAS with timedependent earliness-tardiness penalties (OASET) problem.

Problem Definition
In a single AEOS environment, a set of jobs are initialised at the beginning of the planning horizon.The satellite can process one job at most at any time moment without preemption.There are no precedence constraints among the jobs, and the production system does not break down.The description of the job is shown in Figure 2.
For convenience, the notations used in the remainder of the paper are summarised in the Notations.
Note that the setup time depends on the processing sequence of two consecutive jobs.Thus   is not necessarily equal to   .
The revenue of a job is defined as a piecewise linear function that depends on the earliest start date, processing time, due date, and deadline as illustrated in Figure 3.The job is not processed yet in region A and the job is not completed yet in region B. In both regions, we obtain no revenue.In regions C and D, the revenue gained decreases linearly with its earliness or tardiness, respectively.In region E, the manufacturer does not accept the job, which is completed beyond its deadline, and no revenue is gained.If the job is completed exactly at the due date, the maximum revenue is obtained.
Given a sequence   of the selected job set , we can calculate the completion time   of each job  ∈ .Let   (  ) be the tardiness of job  in sequence   .We can calculate the tardiness of job  using the formula   (  ) = max{0,   −   }.
The earliness of the job is   (  ) = max{0,   −   }.Thus, the revenue of job  is calculated as Consequently, the total revenue gained from the sequence   is TR(  ) = Σ ∈   (  ).The OASET problem is to determine the set  and sequence   when TR(  ) is maximum.This problem reduces to the OAS problem studied by Oǧuz et al. [1] if the unit earliness penalty cost is set to zero and the release date equals the earliest start date for all orders.The OASET problem generalises the OAS problem, which has been proved to be NP-hard; thus it is still of NP-hard complexity.

The Mathematical Model
In this section, we formulate the OASET problem as an MILP model.We apply two binary variables,  and , to denote order acceptance and sequencing decisions, respectively: MILP is max In this model, we add the definition of earliness for each job with constraints set ( 12), (13), and ( 14) and then update constraint (15) to calculate the revenue the firm can gain when job  is accepted and incurs tardiness   or earliness   .Additionally, the procession time window is characterised by the earliest start date and deadline; therefore, we use the earliest start date es  in constraint (6).

Hybrid Differential Evolutionary Algorithm
Differential evolution (DE), which was first proposed by Storn and Price [15], has become arguably one of the most efficient stochastic real-parameter optimisation algorithms.DE is very simple and easy to implement.Despite this, DE has exhibited better performance than several other algorithms [16].Based on classical DE, many variants have greatly improved performance for a variety of applications.We also propose a hybrid DE (HDE) to solve the problem in Section 5.5.

Graph
Representation.Given a parameter vector, the completion time of each job is fixed.Therefore, the revenue that might be gained from each job is also fixed.To maximise the total revenue, we need to select a subset of jobs that satisfy the sequence-dependent setup time constraints.First, we introduce a directed acyclic graph  of the problem under a given parameter vector.The procedure is described as follows.
Step 1. Compute the procession start time   and completion time   of each job  ∈  under a given parameter vector.
Step 2. Sort the jobs in ascending order by their procession start times, suppose the sorted sequence of all the jobs  is   , and calculate the penalised revenue   (  ) for each job  ∈ .
Step 3. First, add  nodes to graph , where each node  ( = 1, . . ., ) represents job , then as illustrated in the mathematical model, add two dummy nodes including one denoted by "0" that represents the source node and another one denoted by " + 1" that represents the sink node to .
Step 4. For each job  in   and for each job  after job  in   , if the sequence-dependent setup time constraint is satisfied, that is,   +   ≤   , then add a directed arc⟨, ⟩ weighted   (  ) to .
Step 5.For each job  in   , add a directed arc⟨0, ⟩ weighted   (  ) and a directed arc⟨,  + 1⟩ weighted zero to .

Graph-Based Fitness Evaluation.
Based on the directed acyclic graph constructed under a given parameter vector, selecting a subset of jobs maximising the total net revenue is equivalent to determining the longest path from the source node "0" to sink node " + 1" [17].The polynomial time algorithm exists to solve the longest path problem in a directed acyclic graph.We modify the ( 2 ) Bellman-Ford shortest path algorithm to determine the longest path in the graph.The nodes, except the source node and sink node in the path, represent the selected jobs in the solution.In Figure 5, the longest path, which is illustrated by bold arcs, is 0-2-4-5 with total weight 3.2 + 2.5 = 5.7.[15] can be summarised as follows.
(1) Mutation.This operation generates new target vectors according to where  1 ,  2 , and  3 are different integers randomly chosen from {1, . . .,   } and different from the vector index . is a constant factor and is located in [0, 2], according to Storn and Price [15].
(2) Crossover.This operation forms the trial vectors according to where cr is the constant crossover rate, which is often set to 0.9, and  rand is a randomly chosen index from 1 to  to ensure that the trial vector    does not duplicate    .
(3) Selection.This operation determines which parent vector is selected for the next generation from vector   [18] and the widely used DE/rand are combined under self-adaptation strategies.Considering the plateaus in the fitness landscape, the neighbourhood search in Neighbourhood Search Differential Evolution (NSDE) [19], which adopts two random distribution generators for tuning the scaling factor, is applied in our algorithm to help the algorithm escape from plateaus.

Mutation.
In addition to (20), there are several other commonly used mutation operations [20]: We apply self-adaptation strategies [21], and two mutation operations are selected as candidates.One is DE/ran/1, as shown in (20), and the other is DE/current-to-pbest proposed by Zhang and Sanderson [18], given as where   pbest is randomly chosen as one of the top 100% individuals in the current population and x  2 is randomly chosen from the union of the current population and archive.The archive is initiated to be empty.After each generation, the parent solutions that fail in the selection of ( 22) are added to the archive.We randomly remove some individuals in the archive to keep the size within   if the size exceeds   [19].
The mutation process applied here is the same as Selfadaptive Differential Evolution (SaDE) [21], except that (25) used in SaDE is replaced by (26).The trial vector is produced as where  is initialised as 0.5.After each generation, the number of trial vectors successfully entering the next generation generated by ( 20) and ( 26) is recorded as  1 and  2 .The number of trial vectors that fail entering the next generation generated by ( 20) and ( 26) is recorded as  1 and  2 .The four numbers are aggregated within a specific number of generations and the probability  is updated as 5.5.2.Scaling Factor Adaptation.For our problem described in Section 3, we can determine that there are possibly many plateaus in the fitness landscape; that is, there are many vectors with the same objective value.As the population converges, the difference between two vectors |  1 −   2 | tends to be small.To escape from a plateau, various scaling factors are required.We apply the self-adaptive neighbourhood search strategy [19] to set the value of scaling factor: where   (0.5, 0.3) is a randomly generated number according to a normal distribution of mean 0.5 and standard deviation 0.3,   denotes a Cauchy random number generator with scale parameter  = 1, and  is initialised as 0.5 and updated using (28), as does SaDE.

Crossover Probability Adaptation.
The crossover probability cr  for each individual  at generation  is generated using the procedure in JADE.In JADE, cr  is generated according to a normal distribution of mean  cr and standard deviation 0.1, and truncated to [0, 1], where  cr is initialised as 0.5 and at each generation updated according to where  cr is the set of all successful cr  's at generation  and mean  ( cr ) is the usual arithmetic mean of  cr .We set  as a constant value 0.1.The initial population size of all the problems was set to 40.The algorithm stopped if it reached a maximum of 2,000 generations.On each instance, the algorithm ran 10 times ( = 10) independently.The maximum, minimum, and average fitness in the 10 runs were recorded.The results of DE [15], NSDE [19], and JADE [18] were also recorded.The results of datasets with different jobs are given in Tables 2-5.The average relative percentage deviation (Δ avg ) was calculated as follows:

Experimental Studies
where HDE ℎ and REF ℎ are the fitness function values generated by the HDE algorithm and the reference algorithms (including DE, NSDE, and JADE algorithms), respectively, for each run and  is the number of runs.Similarly, Δ min and Δ max denote the minimum and maximum of the relative percentage deviations, respectively, from the reference fitness function values taken.

Performance Comparisons.
For small problems with 20 jobs, Tables 2 and 3 show that the HDE algorithm performed better than DE and NSDE and slightly better than JADE on all instances.Compared with DE and NSDE, the average relative percentage improvements for HDE are approximately 3 generally, of which the best is 5.08 and the  For medium sized problems with 60 and 80 jobs, compared with NSDE and JADE, our proposed HDE provides significantly better results on instances 80 1, 80 3, 80 4, 80 6, 80 7, 80 8, 80 9, and 80 10, as shown in Tables 1 and 4. On those instances, even the worst fitness function value determined by HDE is better than the best fitness function values determined by NSDE and JADE.On other instances, the maximum and average fitness function values of HDE are better than those of NSDE and JADE.Both NSDE and JADE outperform classical DE; in particular, the average relative percentage improvements by HDE compared with DE are almost five times better than the ones compared to NSDE and JADE.NSDE performs slightly better than JADE in terms of all three indicators.
For large problems with 100 jobs, the results are shown in Tables 1 and 5. Our proposed HDE significantly outperforms DE, NSDE, and JADE.In 10 independent runs, the worst fitness value of HDE is even better than the best fitness value of DE, NSDE, and JADE on instances 100 1, 100 2, 100 6, 100 7, 100 8, 100 9, and 100 10.On the other three instances 100 3, 100 4, and 100 5, the maximum and average fitness of 10 runs are better than those of DE, NSDE, and JADE (the details of result are in http://pan.baidu.com/s/1eSf8jzs).

Conclusions
To the best of our knowledge, this is the first attempt to study the OAS problem with time-dependent earlinesstardiness penalties and sequence-dependent setup times, which is inspired by AEOS scheduling.We proposed hybrid DE algorithm under self-adaptation framework, including self-adaptation strategy for mutation operator, the selfadaptive neighbourhood search strategy for scaling factor, and crossover probability adaptation.The significance of selfadaptation framework is proved by the experimental results.Furthermore, we presented a novel method to evaluate the fitness for this problem.Firstly a directed acyclic graph (DAG) is constructed under a given parameter vector, and then the fitness evaluation phase is transformed to finding the longest path from the source node to sink node in the DAG.The graph-based fitness evaluation provides an efficient method for the similar problems.Computational results show that our proposed algorithm outperforms classical DE in addition to two variants: NSDE and JADE.The proposed method not only solves the OASET problem but also suggests an effective way to apply a continuous algorithm to combinatorial optimisation problems.In regard to future research, the dynamic OAS problem catches our attention for considering the orders come stochastically during the planning horizon.

Figure 1 :Figure 2 :Figure 3 :
Figure 1: Illustration of the AEOS imaging target on the sea.

5. 1 .
Parameter Vector.For an OASET problem with all  jobs, an -dimensional real vector  = ( 1 ,  2 , . . .,   ) represents an individual, where 0 ≤   ≤ 1,  = 1, . . ., .The value of   for job  represents the proportion of the part time before job  is completed in the completion interval [es  +   ,   ].The completion time   of job  is calculated by   = (es  +   ) +   (  − es  −   ).In such a representation, the completion time of each job  is located in the completion time window CTW  = [es  +   ,   ].

Figure 4 :Figure 5 :
Figure 4: Example of jobs in ascending order by their procession start time.

Indices
, : Job index, ,  = 1, 2, . . .,  Factors : Set of all jobs : Set of selected jobs,  ⊆    : Sorted sequence of jobs in    : Sequence of jobs in    : Release date of job  es  : Earliest start date of job

Table 1 :
[22]meters of the instances.number of jobs and  is the index of the instance in the dataset with  jobs.Bülbül et al.[22]proposed a method for generating test instances for the single machine earliness/tardiness scheduling problem.We apply this method and add the deadline and setup time generation methods to generate our instances.On each instance, the processing time   is a random number generated uniformly in [ min ,  max ]; the due date  is generated in [1 − TF − RDD/2, 1 − TF + RDD/2] * (  + max    +   ), where   is the total processing time of all jobs; TF is the tardiness factor, which is a coarse measure of the proportion of jobs that might be tardy; RDD determines the interval length of the due date; and the term max    +   is added to   to prevent the possible initial infeasibility of   .Earliest start date es  is generated uniformly in [0,   ], setup time   is generated in [ min ,  max ], deadline   is set as   + (  − es  )  , and revenue   is a randomly generated number in [ min ,  max ].These parameters are listed in Table1.
6.1.Data Generation.Five datasets are generated with 20, 40, 60, 80, and 100 jobs.In each dataset, 10 instances are generated.Each instance is denoted as   , where  is the

Table 3 :
The solution improvements by HDE compared to DE/NSDE/JADE for 20/40 jobs.

Table 4 :
The solution improvements by HDE compared to DE/NSDE/JADE for 60/80 jobs.

Table 5 :
The solution improvements by HDE compared to DE/NSDE/JADE for 100 jobs.Compared with JADE, the average relative percentage improvements for HDE are all less than 1, of which the best is 0.88 and the worst is 0.03.There is no significant difference between the performance of DE and NSDE, but for small problems with 40 jobs, the maximum and average fitness function values of NSDE and JADE are much better than those for DE, as shown in Table2.HDE achieves the maximum fitness among the four algorithms.In Table3, the average relative percentage improvement of HDE compared with DE is appropriately 10 times better than that compared with NSDE and JADE, and JADE and NSDE achieve similar performance.