Optimal Computing Budget Allocation for Ordinal Optimization in Solving Stochastic Job Shop Scheduling Problems

We focus on solving the Stochastic Job Shop Scheduling Problem (SJSSP) with random processing times so as to minimize the expected sum of earliness and tardiness costs of all jobs. To further enhance the efficiency of the simulation optimization technique of embedding Evolutionary Strategy in Ordinal Optimization (ESOO), which is based on Monte Carlo simulation, we embed the Optimal Computing Budget Allocation (OCBA) technique into the exploration stage of ESOO to optimize the performance evaluation process by controlling the allocation of simulation times. However, while pursuing a good set of schedules, "super individuals," which can absorb most of the given computation while others hardly get any simulation budget, may emerge according to the allocating equation of OCBA. Consequently, the schedules cannot be evaluated exactly, and the probability of correct selection (PCS) tends to be low. Therefore, we modify OCBA to balance the computation allocation: (1) set a threshold of simulation times to detect "super individuals" and (2) follow an exclusion mechanism to marginalize them. Finally, the proposed approach is applied to an SJSSP comprising 8 jobs on 8 machines with random processing times following truncated normal, uniform, and exponential distributions, respectively. The results demonstrate that our method outperforms the ESOO method by achieving better solutions.


Introduction
Most classical job shop scheduling problems assume that all the problem data are fixed and known in advance. However, several inevitable stochastic factors exist in real manufacturing systems, for example, random processing times, machine breakdowns, and rush orders, among which random processing time is the most fundamental and representative uncertain factor. Therefore, research on the Stochastic Job Shop Scheduling Problem (SJSSP) with random processing times has great importance in engineering applications. Even so, research on SJSSP is scarcer than on deterministic JSSP because of the difficulties caused by random processing times, for example, a huge search space, lengthy computation times, and the challenge of evaluating schedules.
In general, three models are used to represent random processing times: the interval number method [1,2], fuzzy theory [3][4][5][6], and stochastic methods. However, the fluctuation and distribution of random processing times are ignored by the first two methods, which leads to inaccurate scheduling solutions. For this reason, an increasing number of researchers use stochastic methods, in which independent random distributions with known mean and variance represent processing duration variability [7,8]. Among all processing time distributions, the normal, exponential, and uniform distributions are commonly found in the SJSSP literature [2,[9][10][11][12]. To solve these NP-hard problems, many heuristic algorithms, such as the genetic algorithm [13], variable neighbourhood search [14], and the artificial bee colony algorithm [15], have been applied to SJSSP.
However, the random processing times of operations make the completion time of each job and the performance indicators of schedules random as well, which makes it difficult to evaluate feasible schedules during the evolution of these heuristic algorithms. Stochastic simulation methods, for example, Monte Carlo, which rely on repeated random sampling to obtain statistical estimates, have been widely used for performance evaluation [7,9,11].
Zhang and Wu [15] pointed out that stochastic simulation greatly increases the computational burden because of frequent evaluations, especially when used in an optimization framework. Ho et al. [16] first developed Ordinal Optimization (OO) theory to obtain good enough solutions through ordinal comparison even while the estimate of a solution's value is still very poor. Because Evolutionary Strategy (ES) can optimize the sampling process in OO theory, Horng et al. [8] embedded ES in Ordinal Optimization, abbreviated as ESOO, to search for a good enough schedule of SJSSP within limited computation time.
Even though ESOO can significantly reduce the computation for SJSSP, the evaluation method in its exploration stage is similar to Monte Carlo simulation: uniform computation is allocated to each schedule, regardless of whether it is needed. Thus extra computation is allocated even after the Probability of Correct Selection (PCS) has already converged. Therefore, there is potential to further enhance ESOO's efficiency by intelligently controlling the evaluation process, that is, by determining the optimal number of simulation replications for different schedules according to their performances. Chen et al. [17] first proposed Optimal Computing Budget Allocation (OCBA) to enhance the efficiency of OO by allocating simulation times reasonably, and many articles on OCBA have appeared in recent years [18][19][20]. In order to optimize the computation allocation in the evaluation process of solving SJSSP, we propose an innovative hybrid algorithm that embeds OCBA into ESOO, abbreviated as ESOO-OCBA.
However, in the derivation of OCBA [17] we find that the hypothesis that the simulation times of the best individual considerably outnumber those of the average ones does not hold in the specific SJSSP environment. As a result, "super individuals," which can absorb most of the given computation in each generation while the others can hardly get any simulation budget, may emerge according to the allocating equation of OCBA [17]. Thus we improve OCBA to balance the computation allocation: (1) set a threshold of simulation times to detect "super individuals," (2) follow an exclusion mechanism to isolate them, and (3) allocate the remaining computation budget to the remaining individuals according to the dispatching rules of OCBA. This modification of OCBA is another contribution of this paper, which makes ESOO-OCBA suitable for solving SJSSP.
The rest of this paper is organized as follows: Section 2 defines the problem, presents a mathematical formulation of the SJSSP, and derives a model for evaluating schedules. Section 3 outlines our ESOO-OCBA algorithm for finding a good enough schedule in the search space of SJSSP. Section 4 describes the modifications we made on OCBA. Section 5 presents and discusses the computational experiments and their results. Finally, Section 6 gives our concluding remarks and several future research directions.

SJSSP Formulation and Varying Evaluation Model

Let Ω denote the set of all feasible and infeasible schedules; a feasible schedule θ should satisfy both the precedence constraint and the capacity constraint simultaneously. Without loss of generality, it is assumed that all data are integers and no preemption is allowed. The goal of SJSSP is to find a feasible schedule θ ∈ Ω that minimizes the expected sum of earliness and tardiness costs of all jobs.
Objective function:

min_{θ∈Ω} E[Σ_{j=1}^{n} (T_j(θ) + E_j(θ))].  (1)

A feasible schedule θ ∈ Ω should be subject to

T_j(θ) = α_j · max(0, C_j(θ) − d_j), for every job J_j,  (2)

E_j(θ) = β_j · max(0, d_j − C_j(θ)), for every job J_j,  (3)

s_{j,h+1} ≥ s_{j,h} + t_{j,h}, 1 ≤ h ≤ h_j − 1, for every job J_j,  (4)

s_{j,h} ≥ s_{i,g} + t_{i,g} or s_{i,g} ≥ s_{j,h} + t_{j,h}, for all O_{j,h}, O_{i,g} ∈ Φ(M_k), 1 ≤ k ≤ m.  (5)

Constraints (2) and (3) represent the tardiness and earliness costs of each job, respectively. The precedence constraint (4) ensures that, for each pair of consecutive operations O_{j,h} and O_{j,h+1} of the same job J_j, operation O_{j,h+1} cannot be processed before operation O_{j,h} is completed. The capacity constraint (5) ensures that two operations O_{j,h} and O_{i,g} of two different jobs J_j and J_i in the set Φ(M_k) cannot be processed simultaneously on machine M_k.

Performance Evaluation Model Improved from the Objective Function.
To obtain a good statistical estimate for a feasible schedule, a large number of simulation replications are usually required. However, the expected objective value of a feasible schedule is available only through infinitely many simulated replications. Although infinite replication would make the objective value of (1) stable, such a calculation is intractable.
Therefore, depending on the number of simulated replications, (1) can be approximated as follows:

F_L(θ) = (1/L) Σ_{l=1}^{L} Σ_{j=1}^{n} (T_j^l(θ) + E_j^l(θ)),  (6)

for a feasible schedule θ, where L represents the number of its simulation replications. T_j^l(θ) and E_j^l(θ) denote the tardiness and earliness costs of job J_j on the l-th replication of θ, respectively. F_L(θ) denotes the average sum of tardiness and earliness costs of θ when the simulation length is L. A sufficiently large L makes the objective value F_L(θ) of (6) sufficiently stable. Let L_s = 10^5 represent this sufficiently large L [8], and let F_{L_s}(θ) represent the objective value of θ computed by the sufficiently exact evaluation model.
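The crude evaluation model (6) amounts to a Monte Carlo loop over replications. The sketch below illustrates it in Python; `simulate_completions`, and every other name here, is a hypothetical stand-in for one discrete-event simulation replication of a fixed schedule, not the paper's implementation.

```python
def evaluate_schedule(simulate_completions, due_dates, alpha, beta, L):
    """Estimate the average sum of tardiness and earliness costs of a
    schedule over L independent simulation replications (model (6))."""
    total = 0.0
    for _ in range(L):
        completions = simulate_completions()  # one replication: C_j for each job
        for C, d, a, b in zip(completions, due_dates, alpha, beta):
            total += a * max(0.0, C - d)      # tardiness cost of this job
            total += b * max(0.0, d - C)      # earliness cost of this job
    return total / L

# Deterministic toy check: completions [10, 12] against due dates [10, 11];
# job 2 is one unit tardy on every replication, so the estimate is 1.0.
print(evaluate_schedule(lambda: [10.0, 12.0], [10, 11], [1, 1], [1, 1], 100))
```

In ESOO-OCBA the replication count `L` plays the role of the budget `N_i` that OCBA assigns to an individual, so noncritical schedules get few replications and critical ones get many.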

Embedding the OCBA Technique into ESOO.
Reliably evaluating the performance of a feasible schedule θ requires the complex calculation of F_{L_s}(θ), not to mention searching the huge space of SJSSP. ESOO can significantly reduce the computation of the evaluation process [8]. However, in ESOO, uniform computation is allocated to each individual, regardless of whether it is needed. This allocation cannot meet the different demands of different individuals; thus the computation allocated to each individual may be either insufficient or redundant.
Ideally, overall simulation efficiency improves if less computational effort is spent on simulating noncritical schedules and more on critical ones. We would like to improve the PCS in each generation by allocating computation according to the performance of each schedule. Therefore, in this paper, the OCBA technique is embedded into the exploration stage of the ESOO algorithm to intelligently determine the optimal number of simulation times N_i for different individuals according to their performances.
SJSSP is an NP-hard problem that reflects real-world situations and is inherently uncertain. Many recent methods for SJSSP suffer from a lengthy computation budget, because evaluating the objective of a single schedule is already very time-consuming, not to mention the extremely slow convergence of heuristic techniques searching through a huge space. In order to overcome this drawback, we propose the ESOO-OCBA algorithm, in which OO theory reduces the unbearable computation and the OCBA technique allocates the necessary computation to each individual. The overall scheme of our algorithm is illustrated in Figure 1.
As Figure 1 shows, the NP-hard characteristic and the randomness of SJSSP cause many difficulties, for example, a huge search space, the performance evaluation problem, and slow convergence. OO theory contains two fundamental ideas: (1) ordinal comparison, that is, ordinal rather than cardinal optimization is used in order to reduce the simulation times spent evaluating schedules; (2) goal softening is used to decrease the difficulty of the search. It has been proved that ordinal comparison has an exponential convergence rate [21,22] and that goal softening raises the Probability of Alignment (PA) exponentially [23].
The ESOO-OCBA algorithm consists of two stages. (1) The exploration stage aims to find a subset of good samples from the search space, where samples are evaluated by a crude evaluation model. Evolutionary strategy is employed in this stage to optimize the sampling process, while OCBA is used to trade off performance stability against cost by allocating the simulation times of each individual reasonably. (2) The exploitation stage, which consists of multiple subphases, finds the good enough individual within the good sample subset. Individuals are selected and eliminated by increasingly accurate evaluation models in each subphase. The one with the smallest F_{L_s}(θ) in the last subphase, θ*, is the good enough schedule that we seek.

The Exploration Stage.
In the exploration stage, we use ES as the overall frame of the sampling process. The number of needed samples depends on the extent of goal softening, which determines the number of generations and the size of the initial population. In each generation of ES, each individual gets a unique crude evaluation model developed from (6), with different simulation times N_i allocated by OCBA. The implementation of the exploration stage is as follows [8].
Precedence-Based Representation. We use a precedence-based representation to define a chromosome as an unpartitioned permutation containing h_j repetitions of each job J_j. The operations of the same job thereby satisfy its precedence constraints, so each chromosome stands for a feasible schedule. This encoding is employed because of its effectiveness in generating feasible individuals and in reducing the size of the search space.

Initial Population. Each individual of the initial population is generated completely at random to enhance the variety of the initial population.
Recombination. Generate offspring from the parents by discrete recombination [24]. Use the repair operator to adjust an infeasible schedule when it appears [8].
Mutation. The insertion mutation method is adopted to breed new offspring: delete all operations of one randomly selected job within a parent chromosome and then reinsert them into the remaining components at random positions according to the precedence constraints [8].
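The encoding and the insertion mutation above can be sketched as follows. In the precedence-based representation, the k-th occurrence of job index j in the chromosome decodes to operation O_{j,k}, so any permutation is feasible; the function names are illustrative, not from the original implementation.

```python
import random

def random_chromosome(num_ops_per_job):
    """Precedence-based encoding: job j appears h_j times; the k-th
    occurrence of j stands for operation O_{j,k}, so any permutation
    automatically satisfies the precedence constraints."""
    genes = [j for j, h in enumerate(num_ops_per_job) for _ in range(h)]
    random.shuffle(genes)
    return genes

def insertion_mutation(chromosome, num_ops_per_job):
    """Delete all genes of one randomly chosen job, then reinsert them
    one by one at random positions; precedence is preserved because
    occurrence order still encodes operation order."""
    job = random.randrange(len(num_ops_per_job))
    rest = [g for g in chromosome if g != job]
    for _ in range(num_ops_per_job[job]):
        rest.insert(random.randrange(len(rest) + 1), job)
    return rest

# Toy usage: 3 jobs with 2, 3, and 2 operations; mutation keeps the
# multiset of genes intact, so the child is still a feasible schedule.
parent = random_chromosome([2, 3, 2])
child = insertion_mutation(parent, [2, 3, 2])
print(sorted(parent) == sorted(child))
```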
Selection. The (μ + λ)-selection mechanism is adopted in our approach: select the best μ individuals from both the μ parents and the λ offspring, according to the ranking of the approximate fitness values obtained from the respective crude evaluation model of each individual.
Termination. The ES is stopped when the number of generations exceeds the preset limit; the best s individuals are then selected as the good sample set according to their performances.

The Exploitation Stage.

The exploitation stage determines the good enough schedule from the good sample set obtained in the exploration stage. However, evaluating each of the s schedules with the sufficiently accurate evaluation model (the objective function with L = L_s) costs too much computation. Thus the exploitation stage is divided into multiple subphases following the idea of iterative ordinal optimization [25]. The computational complexity of the exploitation stage can be decreased dramatically, because the size of the schedule set in each subphase has already been largely reduced by the time the evaluation model becomes more refined. The implementation of the exploitation stage in [8] is adopted in this paper. In each subphase, objective function (6) with various simulation lengths L serves as an increasingly accurate evaluation model, where L ranges from a given fixed L_0 (crude model in the first subphase) to L_s (sufficiently accurate model in the final subphase). The schedules remaining from the prior subphase are re-evaluated and some of them are eliminated according to their performances. In the last subphase, the one with the smallest F_{L_s}(θ) is the good enough schedule θ* that we look for.

The OCBA Technique and Modifications
4.1. The OCBA Technique. OO theory usually allocates uniform computation to each schedule, which can hardly achieve the highest PCS within a given computation budget. The OCBA technique [17] is employed to improve the PCS by allocating simulation times to each schedule according to its performance. We define N_i as the simulation times allocated to schedule θ_i and T as the total given simulation times in each generation (i.e., the total given computation). OCBA provides a recipe for the asymptotically optimal allocation of simulation times among the k schedules in each generation of the exploration stage of the ESOO-OCBA algorithm:

N_1 + N_2 + ... + N_k = T,  (7)

N_b = σ_b √(Σ_{i=1, i≠b}^{k} N_i²/σ_i²),  (8)

N_i/N_j = (σ_i/δ_{b,i})² / (σ_j/δ_{b,j})², i, j ∈ {1, ..., k}, i ≠ j ≠ b,  (9)

where σ_i² is the observed variance of schedule θ_i, which can be approximated by the sample variance, and θ_b is the best schedule observed. F̄(θ_i) is the observed performance of schedule θ_i; δ_{b,i} = F̄(θ_b) − F̄(θ_i).
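A minimal sketch of the allocation rules, assuming sample means and variances are already available; for a minimization problem the best design is the one with the smallest observed mean. Normalizing the implied ratios against the total budget, as done here, is one common way to operationalize rules (7)-(9), not necessarily the paper's exact procedure.

```python
import math

def ocba_allocation(means, variances, total_budget):
    """Asymptotic OCBA allocation for a minimization problem: returns a
    budget N_i for each design from sample means and variances."""
    b = min(range(len(means)), key=lambda i: means[i])  # observed best design
    eps = 1e-12                                         # guard against delta = 0
    ratios = [0.0] * len(means)
    for i in range(len(means)):
        if i != b:
            delta = means[b] - means[i]                 # delta_{b,i}
            ratios[i] = variances[i] / max(delta * delta, eps)
    # Rule (8) in ratio form: N_b = sigma_b * sqrt(sum_{i != b} N_i^2 / sigma_i^2)
    ratios[b] = math.sqrt(variances[b] * sum(
        ratios[i] ** 2 / max(variances[i], eps)
        for i in range(len(means)) if i != b))
    s = sum(ratios)
    return [total_budget * r / s for r in ratios]       # rule (7): sums to T
```

On a toy instance with means 1, 2, 3 and equal variances, the best design and its closest competitor receive the largest shares, while the clearly inferior design receives little, which is exactly the behavior the rules aim for.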

4.2. Modifications on the OCBA Technique. However, the aforementioned OCBA turns out to be unsuitable for the specific SJSSP in our experiments. While pursuing a good set of schedules in the exploration stage, "super individuals" may emerge according to (7)-(9); they can absorb most of the given simulation budget in each generation while the other individuals can hardly get any simulation budget.
Without enough simulation times L, most of the average individuals in each generation cannot be compared with each other exactly by (6). This simulation distortion therefore decreases the probability of correct selection, as some inferior schedules may be selected while some good schedules may be eliminated.
We trace the root cause of the "super individuals" back to the derivation of the original dispatching rules (7)-(9) [17]. After a series of deductions, Chen et al. [17] obtained (10), which expresses the relationship between N_b, N_i, and N_j:

δ_{b,i}² / (σ_b²/N_b + σ_i²/N_i) = δ_{b,j}² / (σ_b²/N_b + σ_j²/N_j), i ≠ j ≠ b.  (10)

In order to simplify (10), Chen et al. [17] assumed N_b ≫ N_i, in line with (8). Under this assumption (10) simplifies to

δ_{b,i}² · N_i/σ_i² = δ_{b,j}² · N_j/σ_j²,  (11)

from which the ratio between N_i and N_j in (9) was deduced. However, in our experiments the performance of an individual θ_i can be extremely close to that of the best individual θ_b, so δ_{b,i} = F̄(θ_b) − F̄(θ_i) can be remarkably small. In this condition, according to (9), the simulation times N_i of individual θ_i considerably outnumber those of an average individual θ_j, and (8) then gives N_b ≈ N_i; that is, the assumption N_b ≫ N_i of Chen et al. [17] does not hold in the specific SJSSP environment. As (10) is too complex to be used directly for dispatching simulation times, we modify the original dispatching rules (7)-(9) to make the classical rule applicable to SJSSP.
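The breakdown of the assumption N_b ≫ N_i can be checked numerically. The figures below are made-up illustrative values, not data from our experiments: one competitor nearly ties the best design, so its δ is tiny.

```python
import math

# Two ordinary designs and one near-best competitor (delta almost zero).
sigma_b, sigma_i, sigma_j = 1.0, 1.0, 1.0
delta_bi, delta_bj = 1e-4, 1.0   # theta_i nearly ties the best; theta_j does not

# Rule (9): N_i / N_j = (sigma_i/delta_bi)^2 / (sigma_j/delta_bj)^2
N_j = 1.0
N_i = N_j * (sigma_i / delta_bi) ** 2 / ((sigma_j / delta_bj) ** 2)

# Rule (8): N_b = sigma_b * sqrt(N_i^2/sigma_i^2 + N_j^2/sigma_j^2)
N_b = sigma_b * math.sqrt(N_i ** 2 / sigma_i ** 2 + N_j ** 2 / sigma_j ** 2)

print(N_i / N_j)   # ~1e8: theta_i absorbs almost the entire budget
print(N_b / N_i)   # ~1: N_b is merely comparable to N_i, so N_b >> N_i fails
```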
In order to diminish the negative influences from "super individuals" in each generation, two steps must be taken: (1) set a threshold of simulation times to detect "super individuals" and (2) follow an exclusion mechanism to isolate them.
Deducing from (7)-(9), (12) is used to allocate simulation times N_i to each schedule θ_i:

N_i = T · β_i / Σ_{j=1}^{k} β_j, where β_i = (σ_i/δ_{b,i})² for i ≠ b and β_b = σ_b √(Σ_{i≠b} β_i²/σ_i²).  (12)

Modifications are made on (12) to realize the two steps mentioned above. Firstly, we set L_s = 10^5 as the threshold of simulation times because it is sufficient for evaluation. An individual is regarded as a "super individual" once it obtains more than L_s simulation times. Secondly, we define the set of all "super individuals" as Θ and its size as num, and pick Θ out of the set of all individuals. Then, from the total given T, we deduct the simulation times occupied by the "super individuals," each of which is assigned L_s simulation times. Lastly, we allocate the remaining simulation times to the individuals outside Θ according to the dispatching rules. Based on these steps, we improve (12) to (13):

N_i = L_s for θ_i ∈ Θ; N_i = (T − num · L_s) · β_i / Σ_{θ_j ∉ Θ} β_j for θ_i ∉ Θ.  (13)

Equation (13) is one of the main contributions of our study. It can be used (i) to allocate computation reasonably and effectively in the simulation optimization of SJSSP or (ii) to make simple estimates for designing experiments in solving SJSSP.
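The threshold-and-exclusion mechanism can be sketched as below. The repeat-until-no-cap-exceeded loop is our reading of the exclusion mechanism (excluding one batch of super individuals may reveal new ones among the rest), and `modified_ocba` is an illustrative name, not the paper's code.

```python
import math

def modified_ocba(means, variances, total_budget, threshold=10**5):
    """Modified allocation (13): individuals whose classical OCBA share
    exceeds `threshold` are capped at the threshold and excluded; the
    leftover budget is re-dispatched among the remaining individuals."""
    def ratios(idx):
        b = min(idx, key=lambda i: means[i])       # best among active designs
        eps = 1e-12
        r = {}
        for i in idx:
            if i != b:
                d = means[b] - means[i]
                r[i] = variances[i] / max(d * d, eps)
        r[b] = math.sqrt(variances[b] * sum(
            r[i] ** 2 / max(variances[i], eps) for i in idx if i != b))
        return r

    active = list(range(len(means)))
    alloc = [0.0] * len(means)
    budget = total_budget
    while active:
        if len(active) == 1:                       # degenerate: one design left
            alloc[active[0]] = min(budget, threshold)
            return alloc
        r = ratios(active)
        s = sum(r.values())
        supers = [i for i in active if budget * r[i] / s > threshold]
        if not supers:                             # no (new) super individuals
            for i in active:
                alloc[i] = budget * r[i] / s
            return alloc
        for i in supers:                           # cap and exclude them
            alloc[i] = threshold
            active.remove(i)
        budget = max(0.0, budget - threshold * len(supers))
    return alloc
```

With a near-tie at the top (δ of order 1e-9), the best design and its near-tie would absorb the whole budget under the classical rule; here both are capped at the threshold and the leftover budget goes to the ordinary designs.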

Implementation of the Modified OCBA.
We adopt the cost-effective sequential approach based on OCBA, described as follows [17]. (1) n_0 simulation replications are conducted for each individual to obtain some initial information about its performance. As the simulation proceeds, the sample mean and sample variance of each schedule are computed from all the data collected so far. (2) According to this collected simulation output, an incremental computing budget Δ is allocated to the set of all individuals. Ideally, each new replication should bring us closer to the optimal schedules. (3) This procedure continues until the total given T is exhausted. In this sequential setting, (13) is applied to the budget consumed so far plus the increment (N̄ denotes the simulation times that have already been consumed):

N_i = L_s for θ_i ∈ Θ; N_i = (N̄ + Δ − num · L_s) · β_i / Σ_{θ_j ∉ Θ} β_j for θ_i ∉ Θ.  (14)
The implementation of Optimal Computing Budget Allocation (OCBA) in each generation of the exploration stage of ESOO is as follows.

Step 1. Perform n_0 simulation replications for each individual; l = 0; N_1^l = N_2^l = ... = N_k^l = n_0.

Step 2. If Σ_{i=1}^{k} N_i^l ≥ T, stop.

Step 3. Increase the computing budget (i.e., the number of additional simulation times) by Δ and compute the new budget allocation, N_1^{l+1}, N_2^{l+1}, ..., N_k^{l+1}, according to (14).

Step 4. Perform additional max(0, N_i^{l+1} − N_i^l) simulations for each individual θ_i; l = l + 1; go to Step 2.

In the OCBA steps above, l is the iteration number and k is the number of individuals in the population. It should be remarked that the best individual θ_b may change from iteration to iteration.
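The sequential steps above can be sketched as a skeleton in which the dispatching rule is a pluggable `allocate(means, variances, budget)` callable. In ESOO-OCBA that callable would implement (14); in the toy usage below a uniform split stands in for it, purely for illustration.

```python
import random
import statistics

def sequential_ocba(simulators, total_budget, n0, delta, allocate):
    """Sequential budgeting skeleton: start with n0 replications per
    individual, then repeatedly grow the dispatched budget by delta and
    top each individual up to its new target until the budget runs out."""
    k = len(simulators)
    samples = [[sim() for _ in range(n0)] for sim in simulators]  # Step 1
    consumed = n0 * k
    while consumed < total_budget:                                # Step 2
        means = [statistics.fmean(s) for s in samples]
        variances = [statistics.variance(s) for s in samples]
        target = allocate(means, variances,
                          min(consumed + delta, total_budget))    # Step 3
        added = 0
        for i, sim in enumerate(simulators):                      # Step 4
            extra = max(0, round(target[i]) - len(samples[i]))
            samples[i].extend(sim() for _ in range(extra))
            added += extra
        consumed += added
        if added == 0:        # targets already met; avoid spinning
            break
    return samples

# Toy run: two noisy designs and a uniform dispatcher as a placeholder.
random.seed(0)
sims = [lambda: random.gauss(0.0, 1.0), lambda: random.gauss(1.0, 1.0)]
out = sequential_ocba(sims, total_budget=40, n0=5, delta=10,
                      allocate=lambda m, v, b: [b / 2.0, b / 2.0])
print([len(s) for s in out])  # each design ends with 20 replications
```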

SJSSP Test Instance with Three Processing Time Distributions.
In order to demonstrate the computational quality and efficiency of our ESOO-OCBA algorithm, numerical experiments on an SJSSP comprising 8 jobs on 8 machines [8] have been carried out. The triple (m_{j,h}, μ_{j,h}, σ²_{j,h}), given in Table 1, denotes the operating environment of operation O_{j,h}: m_{j,h} denotes the machine that processes O_{j,h}, while μ_{j,h} and σ²_{j,h} denote the mean and variance of the stochastic processing time t_{j,h}, respectively. The due dates d_j of each job J_j are given in Table 2. The tardiness penalty per unit time α_j and the earliness penalty per unit time β_j for each job J_j are set to 1.
Three distributions of the random processing times on the machines are used to test the computational efficiency of our algorithm and the quality of the obtained schedules. The first is the truncated normal distribution with mean μ_{j,h} and variance σ²_{j,h}. The second is the uniform distribution on the interval [μ_{j,h} − √3 · σ_{j,h}, μ_{j,h} + √3 · σ_{j,h}]. The third is the exponential distribution with mean μ_{j,h}.
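The three processing-time generators can be sketched as follows. Two details are our assumptions rather than the paper's specification: the normal case is truncated to nonnegative values by resampling, and the √3 half-width of the uniform interval is chosen so that its mean is μ and its variance is σ².

```python
import random

def truncated_normal_time(mu, sigma):
    """Normal(mu, sigma^2) truncated to nonnegative values by resampling
    (one common way to keep processing times positive; an assumption here)."""
    while True:
        t = random.gauss(mu, sigma)
        if t >= 0.0:
            return t

def uniform_time(mu, sigma):
    """Uniform on [mu - sqrt(3)*sigma, mu + sqrt(3)*sigma]; this interval
    has mean mu and variance sigma^2."""
    half = (3.0 ** 0.5) * sigma
    return random.uniform(mu - half, mu + half)

def exponential_time(mu):
    """Exponential distribution with mean mu (rate 1/mu)."""
    return random.expovariate(1.0 / mu)
```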
In the exploration stage of ESOO-OCBA, we set the size of the initial population to μ = 1000 and the number of offspring to λ = 2000. In order to compare the efficiency of ESOO-OCBA and ESOO [8], the same total given simulation times in each generation of the exploration stage are set to T = 368 × (μ + λ), and the number of generations is set to G_max = 100. It is well understood that a small initial number of simulations per individual, n_0, provides more flexibility for better allocation of the computing budget. Nevertheless, if n_0 is too large, we may waste too much computation budget simulating nonpromising designs. Intuitively, if the total computing budget T is very large, the effect of n_0 should be less important. In the dispatching process of the OCBA technique in each generation, T is very large, so we set n_0 = 33.
In addition, the selection of the incremental computing budget Δ is typically problem-specific. A large Δ can waste computation time obtaining an unnecessarily high confidence level, while a Δ that is too small forces the budget allocation problem to be solved many times. According to the total computing budget T, we set Δ = 66000. L_s = 10^5 is set as the threshold for detecting "super individuals." We start by randomly generating μ individuals as the parent population. After an evolution of G_max generations, we rank all the remaining μ + λ = 3000 individuals (parents and offspring) based on their performances and select the best s = 1000 individuals.
In the exploitation stage of ESOO-OCBA, we adopt all the parameters used in related work [8]. Table 3 shows the number of subphases, the simulation length, and the number of candidate schedules in each subphase. In the last subphase, we compute the exact objective value F_{L_s}(θ) of the s_6 = 7 candidate schedules. The one with the smallest F_{L_s}(θ) is the good enough schedule θ* that we look for.

Test Results of Modified OCBA.
In order to show the advantages of our modifications on OCBA, we choose a random generation from the exploration stage of ESOO-OCBA when solving the SJSSP with truncated-normal-distributed processing times. We compare our modified OCBA with the classical OCBA [17] by using each to allocate simulation times to all k = 2000 individuals. The test results are shown in Figures 2, 3, and 4.
Figure 2 shows the overall results of the allocations realized by the two analyzed OCBA techniques, using the individuals, the allocated simulation times, and the performances as comparison criteria. Figure 3 describes the relation between the simulation times and the individual performances in detail and helps to understand the differences between the classical and the modified OCBA. Figure 4 illustrates the distribution of the simulation times of each individual, demonstrating the improvements of the modified OCBA over the classical OCBA in dispatching simulation times.
In Figure 3, we can see that, as individual performance (objective value) improves, the simulation times allocated to the better individuals increase under the allocation mechanisms of both the classical and the modified OCBA. This increase meets the demand for more simulation times when better individuals need to be evaluated exactly in the evaluation process.
Also, for most of the individuals, more simulation times are allocated by our modified OCBA than by the classical OCBA. The reason lies in the aforementioned "super individuals" produced by the classical OCBA dispatching rules (see (7)-(9)). Among the k = 2000 individuals in the generation, the only two labeled "super individuals" absorb 85.75% of the total simulation times, while all the other individuals share the remaining 14.25%.
In Figure 4, a more detailed picture of the distribution of simulation times allocated to individuals can be observed: in the results of the classical OCBA, the simulation times allocated to different individuals show no apparent variation even when there is a huge performance gap between them. It is also clear that 92.85% of all individuals (1857 of 2000) are allocated between 33 and 100 simulation times by the classical OCBA. This lack of simulation times leads to simulation distortion in the evaluation process, which fails to detect the variations in individual performance and thus decreases the PCS in this generation.
For the modified OCBA, 97.85% of all individuals (1957 of 2000) are allocated from 33 to 1000 simulation times. After we use the threshold L_s = 10^5 to limit the "super individuals," a larger portion of the total simulation times (74.04%) is allocated to average individuals. High-performing individuals thus obtain many more simulation times even when they are only slightly better (this effect can be observed in Figure 3). This improved allocation contributes to a higher PCS in each generation.

Because of the random nature of the considered problem, we also repeated the simulation process for 10 runs and found that the results change little. Table 4 shows the best objective values, that is, the performance of the best schedule obtained by ESOO-OCBA and by ESOO, respectively. The processing sequences of the best schedules are shown in the table; O_{j,h} denotes the h-th operation of job J_j. Data for the truncated normal, uniform, and exponential distributions are all shown in Table 4. As can be observed, our ESOO-OCBA algorithm outperforms the ESOO algorithm for all three distributions in the quality of the results.
In order to compare our ESOO-OCBA algorithm fairly, we adopt the same total simulation times as ESOO. However, because of the computational overhead introduced by the embedded OCBA technique, the overall CPU times consumed in our experiments are slightly longer than those consumed by ESOO (within 6 minutes), but still short enough to apply our algorithm in real time.
Also, as we can see from Figure 5, the convergence rate in the exploration stage of ESOO-OCBA is significantly faster than that of ESOO. Here, the random processing times in both ESOO-OCBA and ESOO are drawn from the truncated normal distribution. This result demonstrates that introducing OCBA into the ESOO algorithm genuinely improves its efficiency.

Conclusion
To cope with the computationally intractable SJSSP, we embed the OCBA technique into the exploration stage of the ESOO algorithm to further enhance its efficiency by intelligently allocating simulation times according to individual performance. However, "super individuals," which lead to simulation distortion in the evaluation process, may emerge under the classical OCBA. We therefore set a threshold to constrain the simulation times allocated to "super individuals," so that more simulation times can be allocated to the other, average individuals. These improvements on the classical OCBA optimize the simulation-time allocation mechanism, which guarantees a high probability of correct selection in each generation of the evolution in the exploration stage.
The proposed ESOO-OCBA algorithm is applied to an SJSSP comprising 8 jobs and 8 machines with random processing times following truncated normal, uniform, and exponential distributions. The simulation results obtained by ESOO-OCBA are compared with those of the ESOO algorithm, demonstrating that our algorithm achieves superior schedule quality and that our modifications on OCBA allocate computation more reasonably during evaluation.
The future research on SJSSP can be conducted from the following aspects.
(1) It is worthwhile to consider other types of randomness in job shops, for example, rush orders and machine breakdowns.
(2) It is worthwhile to consider a new global computation allocation mechanism (e.g., the breadth versus depth approach [26]), as the OCBA technique only allocates computation within each generation. Ideally, the total computation could be largely reduced by allocating computation globally.

Figure 2: 3D view of the comparison between modified OCBA and classical OCBA.

Figure 3: Relationship between individual performance and allocated simulation times.

Figure 4: Distribution of allocated simulation times for individuals.
Each job J_j in the job set J consists of h_j operations, O_j = {O_{j,1}, O_{j,2}, ..., O_{j,h_j}}, that need to be processed in sequence. The operation O_{j,h} (1 ≤ h ≤ h_j) denotes the h-th operation of job J_j and must be processed by a specified machine M(O_{j,h}) in the machine set M. Similarly, the set of operations that must be processed on machine M_k (1 ≤ k ≤ m) is denoted by Φ(M_k). The random processing time of O_{j,h} is denoted by t_{j,h}, a random variable following a given probability distribution with mean μ_{j,h} and variance σ²_{j,h}. Let s_{j,h} be the start time of O_{j,h}. The completion time and fixed due date of job J_j are denoted by C_j and d_j, respectively. T_j and E_j stand for the tardiness and earliness costs of job J_j. Set α_j and β_j as the tardiness and earliness penalty per unit time for job J_j, respectively.

Table 1: Operation environment vector of SJSSP with 8 jobs and 8 machines.

Table 2: Due dates for each job.

Table 3: Number of candidate schedules and simulation times in each subphase.