A Three-Stage Optimization Algorithm for the Stochastic Parallel Machine Scheduling Problem with Adjustable Production Rates

We consider a parallel machine scheduling problem with random processing/setup times and adjustable production rates. The objective functions to be minimized consist of two parts; the first part is related with the due date performance (i.e., the tardiness of the jobs), while the second part is related with the setting of machine speeds. Therefore, the decision variables include both the production schedule (sequences of jobs) and the production rate of each machine. The optimization process, however, is significantly complicated by the stochastic factors in the manufacturing system. To address the difficulty, a simulation-based threestage optimization framework is presented in this paper for high-quality robust solutions to the integrated scheduling problem.The first stage (crude optimization) is featured by the ordinal optimization theory, the second stage (finer optimization) is implemented with a metaheuristic called differential evolution, and the third stage (fine-tuning) is characterized by a perturbation-based local search. Finally, computational experiments are conducted to verify the effectiveness of the proposed approach. Sensitivity analysis and practical implications are also discussed.


Introduction
The parallel machine scheduling problem has long been an important model for operations research because of its strong relevance with various industries, such as semiconductor manufacturing [1], automobile gear manufacturing [2], and train dispatching [3].In the theoretical research, makespan (i.e., the completion time of the last job) is the most frequently adopted objective function [4].However, this criterion alone does not provide a comprehensive characterization for the manufacturing costs in real-world situations.In this paper, we will focus on the minimization of operational cost and tardiness cost in a stochastic parallel machine production system.The former is mainly the running cost of the machines for daily operations, and the latter is brought about by the jobs that are finished later than the previously set due dates.
The operational cost, which is further categorized into fixed cost and variable cost, can be minimized by proper settings of the production rate (i.e., operating speed) of each machine.The production rate would produce opposite influences on the variable cost and the fixed cost.Setting the machine speed at a high level is beneficial for reducing the variable cost because the production time is shortened.However, achieving the high speed incurs considerable fixed costs at the same time (e.g., additional machine tools must be purchased).Therefore, an optimization approach is needed to determine the optimal values of the production rates for minimizing the operational cost.
The tardiness cost, which is usually represented by a penalty payment to the customer or a worsened service reputation, could be minimized by scheduling the jobs in a wise manner.Efficient schedules directly help to increase the on-time delivery ratio and reduce the waiting/queueing time in the production system [5].In recent years, tardiness minimization has become a more important objective than makespan for many firms adopting the make-to-order strategy.Under the parallel machine environment, the objective functions that reflect tardiness cost include total tardiness [6], total weighted tardiness [7], maximum lateness [8], and number of tardy jobs [9].
In the literature, the optimization of operational settings (including production rates) is usually performed separately of production scheduling.Actually, there exist strong interactions between the two decisions.For example, if the operating speed of a machine is set notably high, it is beneficial to allocate more jobs to the machine in order to fully utilize its processing capacity.On the other hand, if a large number of jobs have been assigned to a certain machine, it is necessary to maintain a high production rate for this machine in order to reduce the possibility of severe tardiness.Therefore, production scheduling and the selection of machine speeds should better be considered as an integrated optimization problem.A solution to this problem should include both the optimized machine speeds and the optimized production schedule that works well under this setting of machine speeds.
In real-world manufacturing systems, uncertainties are inevitable (due to, e.g., absent workers and material shortage).But the effect of random events has not been sufficiently considered in most existing research.For example, a common assumption in scheduling is that the processing time of each job is exactly known in advance, which, of course, is inconsistent with reality.If the random factors such as processing time variations are considered, we should rely on discrete-event simulation [10] to evaluate the performance of the manufacturing system.So, in this paper, we adopt a simulation-based optimization framework to find satisfactory settings of the production rates together with a satisfactory production schedule.The objective is to minimize the total manufacturing cost: sum of operational cost and tardiness cost.
The rest of the paper is organized as follows.Section 2 makes an introductory review on some topics closely related with our research.Section 3 depicts the production environment and formulates the integrated optimization problem.Section 4 describes the proposed approach for solving the stochastic optimization problem, which includes three stages with gradually increasing accuracy.Section 5 presents the main computational results and comparisons.Finally, Section 6 concludes the paper.

Related Research Background
2.1.Uniform Parallel Machine Scheduling.In the parallel machine scheduling problem, we consider  jobs that are waiting for processing.Each job consists of only a single operation which can be processed on any one of the  machines  1 , . . .,   .As a conventional constraint in scheduling models, each machine can process at most one job at a time, and each job may be processed by at most one machine at a time.
There are three types of parallel machines [11].
(i) Identical Parallel Machines (P).If preemption of operations is not allowed, scheduling uniform parallel machines with the makespan (i.e., maximum completion time) criterion (described as | | max ) is a strongly NP-hard problem.According to the reduction relations between objective functions [11], scheduling uniform parallel machines under due date-based criteria (e.g., total tardiness) is also strongly NP-hard.Therefore, metaheuristics have been widely used for solving these problems.

Simulation Optimization and Ordinal Optimization.
The simulation optimization problem is generally defined as follows: find a solution (x) which minimizes the given objective function (x), that is, where X represents the search space for the decision variable x.The key assumption in simulation optimization is that (x) is not available as an analytical expression; so, simulation is the only way to obtain an evaluation of x.Moreover, since applicable simulation algorithms must have a satisfactory time performance (which means that the simulation should be as fast as possible), some details must have been omitted when designing the simulation model, and thus simulation can only provide a noisy estimation of (x), which is usually denoted as Ĵ(x).However, two difficulties naturally exist in the implementation of simulation optimization: (1) the search space (X) is often huge, containing zillions of choices for the decision variables; (2) simulation is subject to random errors, which means, a large number of simulation replications have to be adopted in order to guarantee a reliable evaluation for x.These issues suggest that simulation optimization can be extremely costly in terms of computational burden.
OO attempts to settle the previous difficulties by emphasizing two important ideas: (1) order is much more robust against noise than value; (2) aiming at the single best solution is computationally expensive, and thus it is wiser to focus on the "good enough." The major contribution of OO is that it quantifies these ideas and thus provides accurate guidance for our optimization practice.

The Differential Evolution
Algorithm.The differential evolution (DE) algorithm, which was first proposed in the mid-1990s [22], is a relatively new evolutionary optimizer.Characterized by a novel mutation operator, the algorithm has been found to be a powerful tool for continuous function optimization.Due to its easy implementation, quick convergence, and robustness, the DE algorithm is becoming increasingly popular in recent years.
Because of its continuous feature, the traditional DE algorithm cannot be directly applied to scheduling problems with inherent discrete nature.Indeed, in canonical DE, each solution is represented by a vector of floating-point numbers.But for scheduling problems, each solution is a permutation of integers.To address this issue, two kinds of approaches can be found in the literature.In the first category, a transformation scheme is established to convert permutations into real numbers and vice versa [23].In the second category, the mutation and crossover operators in DE are modified to discrete versions which suit the permutation representation [24].We would adopt the former strategy, where we only need to modify the encoding and decoding procedures without changing the implementation of DE itself.The clear advantage is that the search mechanism of DE is well preserved.
Despite the success on deterministic flow shop scheduling problems, the application of DE to simulation-based optimization has rarely been reported.To our knowledge, this is the first attempt in which DE is applied to an integrated operational optimization and production scheduling problem.

Problem Definition
3.1.System Configuration.In the production system, there are  jobs waiting to be processed by  uniform parallel machines.The basic processing time of job  is denoted by   ( = 1, . . ., ), and the basic setup time arising when job  is processed immediately after job  is denoted by   (we assume that   > 0 if  ̸ = , and   = 0 if  = ).To optimize the manufacturing performance, two decisions have to be made.
(i) Production Rate Optimization.For each machine , the speed value (denoted by   ) has to be determined.In other words, {  :  = 1, . . ., } belong to the decision variables in the optimization problem.The relationship between these variables and the manufacturing cost will be introduced in the following.
(ii) Production Scheduling.The job assignment policy (i.e., which jobs are to be processed by each of the machines) should be determined.In addition, the processing order of the jobs assigned to each machine should also be specified.
The production rate is related with the controllable processing times and the controllable setup times.Indeed, if the speed of machine  is set as   , then the actual processing time of job  on machine  (denoted as    ) has a mean value of   /  (i.e., E(   ) =   /  ), and the actual setup time between two consecutive jobs  and  on machine  (denoted as    ) has a mean of   /  (i.e., E(   ) =   /  ).In most cases, the production process involves human participation (e.g., gathering materials and adjusting CNC machine status); so, the estimate of processing/setup lengths may not be completely precise.For this reason, we assume that the processing times and setup times are random variables following a certain distribution.

Cost Evaluation.
In order to evaluate the cost corresponding to a given solution (i.e., x = {  , JPO  :  = 1, . . ., }, where JPO  denotes the processing order of the jobs assigned to machine ), simulation is used to obtain the necessary production information (e.g., the starting time and completion time of each job).Then, the realized total manufacturing cost can be calculated by adding the operational cost and the tardiness cost.
The tardiness cost is simply defined as where   is the unit tardiness cost for job  (which reflects the relative importance of the job), and   and   , respectively, denote the completion time and the due date of job .  = (  −   ) + = max{  −   , 0} defines the tardiness of job .Now, we focus on the operational cost, which is further divided into fixed cost and variable cost, as done in conventional financial research.
We will first discuss the fixed cost related with the settings of the production rates {  }.The fixed cost of setting the speed at   for machine  is defined as where   is a constant coefficient related with machine .
The square on   suggests that this type of fixed cost grows increasingly fast with   .In practice, when the machine speed is set notably higher above its normal mode, the energy consumption rises rapidly, and meanwhile, the expenses on status monitoring and preventative maintenance also add to the running cost.So, the previous equation form is defined to reflect such an actual situation.
Based on the previous description, the fixed operational cost can be evaluated as Once we have obtained the complete production information via simulation, we can calculate the variable operational cost, which is related with the operating time length of each machine.In particular, the variable operational cost can be categorized into two types according to the working mode: production cost (VOC  ) and setup cost (VOC  ).These costs are simply defined as follows: where TPT  and TST  , respectively, represent the total production time and total setup time of machine  based on the actual performances;   and   are cost coefficients (positively correlated with   ) known in advance.Thus, the variable operational cost is given by VOC = VOC  + VOC  .
Finally, the total manufacturing cost can be defined as which is exactly the objective function to be minimized in our model.
It should be noted that, due to the systematic randomness, different simulation runs will yield distinct realizations of TMC.As a convention in the practice of simulation optimization, we take the average value of TMC obtained from a large number of simulation replications as an estimate for the expectation of TMC.Specifically, if RTMC  (x) denotes the realized manufacturing cost in the th simulation for solution x, the objective function can be stated as where  is an appropriate number of simulation replications.

The Three-Stage Optimization Algorithm
Since the optimization of production rates and production scheduling are mutually interrelated, we develop a three-stage solution framework as follows.
(1) The first stage focuses on the production rates.The aim is to find a set of "good enough" values for the machine speeds.At this stage, it is unnecessary to overemphasize the accuracy of optimization; so, a fast and crude optimization algorithm will do the job.(2) The second stage focuses on the production schedule.
The aim is to find a schedule that works fine (achieves a low total cost) under the production rates set in the previous stage.Since the objective function is sensitive to job assignment and job sequencing, a finer optimization algorithm is required for this stage.(3) The third stage focuses on the production rates again.
Since the optimal machine speeds are also dependent on the production schedule, the aim of this stage is to fine-tune the machine speeds so as to achieve an ideal coordination between the two sets of decisions.
Based on the previous alternations, the entire optimization algorithm is expected to find high-quality solutions to the studied stochastic optimization problem.The details of the algorithm are given in the following subsections.

Stage 1:
Coarse-Granular Optimization of   .In this stage, we try to find a set of satisfactory values for {  :  = 1, . . ., } using the ordinal optimization (OO) methodology.
Before going into the detailed algorithm description, we show a simple property of the optimal setting of   .Theorem 1.For any two machines  1 and  2 , if the corresponding fixed cost coefficients satisfy   1 >   2 , then in the optimal solution we must have Proof.The proof is by contradiction.Suppose that   1 >   2 and a certain solution has indicated   1 >   2 ; then, this solution can be improved by exchanging the production rates together with the processed job sequences of the two machines.After such an exchange is performed, the variable cost and the tardiness cost will remain the same because the production schedule is actually not changed.However, the fixed cost will be reduced since

Basics of Ordinal Optimization.
We list the main procedure of OO as follows.Meanwhile, we would suggest interested readers to turn to [25] for more theories and proofs.Suppose that we want to find  solutions that belong to the top- (normally  < ).Then, OO consists of the following steps.
Step 1. Uniformly and randomly select  solutions from X (this set of initial solutions is denoted by ).
Step 2. Use a crude and computationally fast model for the studied problem to estimate the performance of the  solutions in .
Step 3. Pick the observed top  solutions of  (as estimated by the crude model) to form the selected subset .
Step 4. Evaluate all the  solutions in  using the exact simulation model, and then output the top  (1 ≤  < ) solutions.
As an example, let  = 50 and  = 1.If we take  = 1000 in Step 1 and the crude model in Step 2 has a moderate noise level, then OO theory ensures that the top solution in  (with  ≈ 30) is among the actual top 50 of the  solutions with probability no less than 0.95.In practice,  is determined as a function of  and ; that is,  = (, ; , noise level), where noise level reflects the degree of accuracy of the crude model.Since  crude model =  exact simulation model +noise, the noise level can be measured by the standard deviation of noise, that is, √Var(noise).Intuitively, if the crude model is significantly inaccurate, then  should be set larger.
For our problem, the solution in this stage is represented by  real values { 1 ,  2 , . . .,   }.To facilitate the implementation of OO, these values are all discretized evenly into 10 levels between [ min ,  max ) (normally we have  min = 1, while the speed limit  max is determined by specific machine conditions).If we assume that  max = 2 in this paper, then   ∈ {1.0, 1.1, 1.2, . . ., 1.9}.Such a discretization reflects the coarse-granular nature of the optimization process in Stage 1.In addition, the conclusion of Theorem 1 helps to exclude a large number of nonoptimal solutions.So, the size of search space is at most 10  .
In the implementation of OO, the crude model (used in Step 2) has to be designed specifically for a concrete problem.Although generic crude models (like the artificial neural network-based model presented in [26]) may be useful for some problems, No Free Lunch Theorems [27] suggest that incorporation of problem-specific information is the only way to promote the efficiency of optimization algorithms.So, here, we will devise a specialized crude model which can provide a quick and rough evaluation of the candidate solutions for the discussed parallel-machine problem.

The Crude Model
Used by OO.Exact simulation is clearly very time consuming because a large number of replications (samplings of the stochastic processing/setup times) are needed to obtain a stable result.In order to apply OO, we devise a crude (quick-and-dirty) model for objective evaluation, which by definition is not so accurate as the exact simulation model but requires considerably shorter computational time.
The crude model presented here is deterministic rather than stochastic, which means that it needs to be run only once to obtain an approximate objective value.The crude model consists of 3 major steps, which will be detailed next.
Step 1. Schedule all the  jobs on an imaginary machine (with speed  = 1).At each iteration, select the job that requires the shortest basic setup time to be the next job.This will lead to a production sequence including alternations of the  jobs and ( − 1) setups.The length of the entire sequence (i.e., summation of all the basic processing times and the basic setup times involved) is denoted as .
Step 2. Split the production sequence into  subsequences such that the length of each subsequence is nearly equal to × (  / ∑  =1   ) ( = 1, . . ., ).The th subsequence constitutes the production schedule for machine .
Step 3. Approximate the tardiness cost, fixed cost, and variable cost.In this step, the total production time of machine  is calculated as , where   is the number of jobs assigned to machine  (i.e., the th subsequence) and  [|] is the basic processing time of the th job in this subsequence.The total setup time of machine  is calculated as is the basic setup time between the th job and the ( + 1)th job in the subsequence related with machine .
When splitting the original sequence (Step 2), the order of each component (production period or setup period) should be kept unchanged.Meanwhile, since each output subsequence is actually thought of as the production schedule for a particular machine, "setup" should not appear at the end of any subsequence.In other words, some setups will be discarded during the splitting step.However, this will hardly influence the subsequent calculations because the total length of discarded setups is normally trivial compared with the complete length ().
The desired length of each subsequence is deduced as follows.First, as this is a parallel machine manufacturing system, we hope the actual completion time of each machine is aligned (which is certainly the ideal situation) so that the makespan is minimized (no time resource is wasted).Then, if we use   to denote the length of the th subsequence (i.e., the summation of all job lengths and setup lengths in this subsequence), the actual completion time of machine  (denoted by   ) is   /  .The completion time alignment condition requires, for all  ̸ =   ,   =    ; that is,   /  =    /   .Solving the equation yields  *  =   / ∑  =1   .A concrete example of the splitting process is shown in Figure 1, where the shaded areas represent setup periods.In this example, we assume that there are two machines with  1 = 1.0 and  2 = 1.5.Thus, the splitting point should be placed at  * 1 = (1/(1 + 1.5)) × 35 = 14.By adopting the nearest feasible splitting policy, the five jobs are allocated to the two machines such that machine 1 should process jobs 2, 3 and machine 2 should process jobs 1, 5, 4.

Stage 2: Optimization of the Production Schedule.
In this stage, we use a simulation-based differential evolution algorithm for finding a satisfactory production schedule.The production rates are fixed at the best values output by the first stage.

Basics of Differential Evolution.
Like other evolutionary algorithms, DE is a population-based global optimizer.In DE, each individual in the population is represented by a -dimensional real vector x  = ( ,1 ,  ,2 , . . .,  , ),  = 1, . . ., SN, where SN is the population size.In each iteration, DE employs the mutation and crossover operators to generate new candidate solutions, and then it applies a one-to-one selection policy to determine whether the offspring or the parent can survive to the next generation.This process is repeated until a preset termination criterion is met.The DE algorithm can be described as follows.
Step 3 (Crossover).For  = 1, . . ., SN, generate a trial solution u  as follows: where   is an index randomly selected from {1, . . ., } to guarantee that at least one dimension of the trial solution u  differs from its parent x  ;   is a random number generated from the uniform distribution U[0, 1]; CR ∈ [0, 1] is the crossover parameter.
Step 4 (Selection).If u  is better than x  , let x  = u  .
Step 5.If the termination condition is not satisfied, go back to Step 2.
According to the algorithm description, DE has three important parameters, that is, SN, , and CR.In order to ensure a good performance of DE, the setting of these parameters should be reasonably adjusted based on specific optimization problems.

Encoding and Decoding.
The encoding of a solution in this stage is composed of  real numbers; so, a solution can be roughly expressed as x = [ 1 ,  2 , . . .,   ].
The encoding scheme is based on the random key representation and the smallest position value (SPV) rule.In the decoding process, the  real numbers  1 , . . .,   (0 <   < ) will be transformed to a production schedule by the SPV rule.In particular, the integer part of   indicates the machine allocation for job , while the decimal part of   determines the relative position of the job in the production sequence.
The decoding process is exemplified in Table 1 with a problem containing 9 jobs.The decoded information is shown in the last row, where (, ) indicates that job  should be processed by machine  at the th position.In fact, the index of the machine that should process job  is simply  = ⌈  ⌉.Hence, job 2 ( 2 = 0.99) and job 4 ( 4 = 0.72) are assigned to machine 1 according to this solution.When several jobs are assigned to the same machine, their relative orders are resolved by sorting the decimal parts.For instance, job 1 ( 1 = 1.80), job 5 ( 5 = 1.45), and job 9 ( 9 = 1.90) should all be processed by machine 2. Furthermore, because  5 <  1 <  9 , the processing order on machine 2 should be (5, 1, 9).Finally, the production schedule decoded from this solution can be expressed as  = (M 1 : 4, 2; M 2 : 5, 1, 9; M 3 : 3, 6, 7, 8).

Evaluation and Comparison of Solutions.
Recall that the objective function is E(TMC), and in this stage, the FOC has been fixed with the setting of production rates.So, we only care about TC and VOC.
In order to evaluate a schedule , we need to implement  under different realizations of the random processing/setup times.When  has been evaluated for a sufficient number of times (), its objective value can be approximated by (7), which is consistent with the idea of Monte Carlo simulation.However, this definitely increases the computational burden, especially when used in an optimization framework where frequent solution evaluations are needed.If we allow only one realization of the random processing/setup times, then a rational choice is to use the mean (expected) value of each processing/setup time.We can show the following property; that is, such an estimate is a lower bound of the true objective value.
Theorem 2. Let  denote a feasible schedule of the stochastic parallel machine scheduling problem.The following inequality must hold: where TMC  (random variable) is the total manufacturing cost corresponding to the schedule, and TMC  (constant value) is the total manufacturing cost in the case where each random processing/setup time takes the value of its expectation.
Proof.In the proof, we will use "" to denote the realized value of the random variable  when all the processing/setup times are fixed at their mean values.
Under the given schedule , we have E(TPT  ) = E(∑ denotes the actual processing time of the th job on machine ), and E(TST  ) = E(∑ denotes the actual setup time between the th and the ( + 1)th job on machine ).Thus, it follows that E(VOC  ) = VOC  .
Under the given schedule , we denote the starting time of job  by   and the completion time of job  by   .As defined earlier,   (resp.,   ) is used to denote the starting time (resp., completion time) of job  when each processing/setup time is replaced by its expected value.First, we will prove that, for any job , E(  ) ≥   .
For the first job on each machine, we have E(  ) =   (because   = 0 a.s.).Then, the proof procedure can be continued for the subsequent jobs on each machine.Suppose that we have already proved E(  ) ≥   for each job before job  on machine , and without loss of generality, we assume that job  immediately follows job  on machine ; then, Therefore, the reasoning applies to each job in the schedule.
Having proved E(  ) ≥   , we can now move to TC in the objective function.Recall that   = (  −   ) + (with  + = max{, 0}) denotes the tardiness of job  and   denotes the tardiness in the case where each random processing/setup time takes its expected value.Meanwhile, suppose that   represents the machine which processes job .Then, This completes the proof of E(TC  ) ≥ TC  .Now that E(VOC  ) = VOC  and E(TC  ) ≥ TC  , we have shown that E(TMC  ) ≥ TMC  .
The DE algorithm requires to compare two solutions in the Selection step.When facing a deterministic optimization problem, we can directly compare the exact objective values of two solutions to tell their quality difference.But in the stochastic case, the comparison of solutions may not be so straightforward because we can only obtain approximated (noisy) objective values from simulation.In this study, we will utilize the following two mechanisms for comparison purposes.
(A) Prescreening.Because TMC  is a lower bound for E(TMC  ) (Theorem 2), we can arrive at the following conclusion which is useful for the prescreening of candidate solutions.Corollary 3.For two candidate solutions x 1 (the equivalent schedule is denoted by  1 ) and x 2 (the equivalent schedule is denoted by  2 ), if TMC  2 ≥ (TMC  1 ), then x 2 must be inferior to x 1 and thus can be discarded.
When applying this property, the value of E(TMC  1 ) is certainly not known exactly, and thus the Monte Carlo approximation based on  simulation replications is used instead.
(B) Hypothesis Test.If the candidate solutions have passed the pre-screening, then hypothesis test is used to compare the quality of two solutions.
Suppose that we have implemented  simulation replications for solution x  whose true objective value is (x  ) = E(TMC   ) ( = 1, 2).Then, the sample mean and sample variance can be calculated by where  ()  is the objective value obtained in the -th simulation replication for solution x  .
Let the null hypothesis  0 be "(x 1 ) = (x 2 )", and thus the alternative hypothesis  1 is "(x 1 ) ̸ = (x 2 )".According to the statistical theory, the critical region of  0 is where  /2 is the value such that the area to its right under the standard normal curve is exactly /2.Therefore, if , the null hypothesis holds), it is concluded that there exists no statistical difference between x 1 and x 2 (in this case, DE may preserve either solution at random).

Stage 3:
Fine-Tuning of   .Up till now, the production schedule (sequence of jobs on each machine) has been fixed by the second stage.It is found that fine-tuning the production rates {  } could further improve the solution quality to a noticeable extent (note that these variables are only roughly optimized on a grid basis in Stage 1).So, here, we propose a local search procedure based on systematic perturbations for fine-tuning {  }.The directions of successive perturbations are not completely random but determined partly according to the knowledge gained from previous attempts.Below are the detailed steps of the local search algorithm, which involves a parameter learning process similar to that of artificial neural networks for guiding the search direction.In the local search process, the optimal computing budget allocation (OCBA) technique [28] is used to identify the best solution among a set of randomly sampled solutions.Before applying OCBA, the allowed number of simulation replications is given.Then, OCBA can be used to allocate the limited computational resource to the solutions incrementally so that the probability of recognizing the truly best solution is maximized.
Step 2. Randomly sample   solutions from the neighborhood of  (ℎ) .To produce the th sample, first generate a random vector r (each component   is generated from the uniform distribution U[−1, 1]), and then let  (ℎ)   =  (ℎ) + r ⋅.
Step 7. If  * has not been updated during the most recent  iterations, then backtracking is executed by letting  (ℎ) =  * .
Step 8. Go back to Step 2.
Step 9. Output the optimization result, that is,  * and the corresponding objective value.
The parameters of the local search module include ℎ max (the total iteration number),  (the reinforcement factor),  (the allowed number of iterations without any improvement), and  ∈ (0, 1) (the amplitude of random perturbation).In addition,   controls the extensiveness of random sampling, and   controls the computational burden devoted to simulation (the detailed procedure of OCBA can be found in [28] and thus is omitted here).In the procedure, Step 6 applies a reinforcement strategy when the previous perturbation direction is beneficial for improving the estimated objective value.
Step 7 is a backtracking policy which restores the solution to the best-so-far value when the latest  perturbations do not result in any improvement.In Steps 2 and 6, the perturbed or reinforced new  (ℎ) should be kept positive.

The Computational Experiments
To test the effectiveness of the proposed three-stage algorithm (abbreviated as TSA later), computational experiments are conducted on a number of randomly generated test instances.In each instance, the processing/setup times (bounded to be positive) are assumed to follow one of the three types of distributions: normal distribution, uniform distribution, and exponential distribution.In all cases, the basic processing times (  ) are generated from the uniform distribution U(1, 100), while the basic setup times (  ) are generated from the uniform distribution U(1, 10).In the case of normal distributions, that is,    ∼ N(  /  ,   [29]) to each machine with speed   = 1.5, and the due date of each job is finally set as its average completion time.This method can generate reasonably tight due dates.Meanwhile, the weight of each job is an integer generated from the uniform distribution U (1,5).As for the machine-related parameters, the fixed cost coefficient   takes an integer value from the uniform distribution U(1, 10), and the variable cost coefficients are directly given as   = 0.01(5 +   ) and   = 0.01(2 +   ).
The following computational experiments are conducted with Visual C++ 2010 on an Intel Core i5-750/3GB RAM/Windows 7 desktop computer.

Parameter Settings.
Since the three optimization stages are executed in a serial manner, the parameters for each stage can be studied independently of one another.
The parameters for Stage 1 include , , , and , which are required by the ordinal optimization procedure.For our problem, we empirically set  = 20 and  = 1 (which means that we want to find one solution that belongs to the top 20),  = 1000 (which means that 1000 solutions satisfying Theorem 1 will be randomly picked at first).On such a basis, the value for  can be estimated by the regression equation given in [25]:  = 45 (which means that we have to select the best 45 solutions from the 1000 according to the crude model, and subsequently each of them will undergo an exact evaluation).Finally, we define the average TMC obtained from 100 simulation replications as the "exact" evaluation for a considered solution; that is, we set  = 100.
When experimenting with the parameters of Stage 2 and Stage 3, we adopt an instance with 100 jobs and 10 machines under normally distributed processing/setup times and  = 0.2.
The parameters for Stage 2 include SN, , and CR, which have full control on the searching behavior of DE.The termination criterion is an exogenously given computational time limit: 30 seconds (otherwise, the generation number and population size would be "the larger the better").We apply a design of experiments (DOE) approach to determine satisfactory values for each parameter.In the full factorial design, we are considering 3 parameters, each with 3 value levels, thus leading to 3 3 = 27 combinations.The DE is run 10 times, respectively, under each parameter combination, and the main effects plot based on mean objective values is shown as in Figure 2 (output by the Minitab software).From the results, we see that SN should take an intermediate value (either too large or too small will impair the searching performance).If SN is too large, much computational time will be consumed on the evaluation of solutions, which reduces the potential number of generations when the computational time is restricted.If SN is too small, the decreased solution diversity will limit the effectiveness of the mutation and crossover operation, which results in deterioration of solution quality in spite of more generations.Generally, setting SN = 50 is reasonable and recommended for most situations.With respect to  and CR, the results indicate that a larger value is more desirable under the given experimental condition.Since these two parameters control the intensity of mutation and crossover, assigning a reasonably large value is beneficial for enhancing the exploration ability of DE.Therefore, we set  = 0.8 and CR = 0.9 in the following experiments.The parameters for Stage 3 include , , , ℎ max ,   , and   .If we still consider three possible levels for each parameter, full factorial design (including 3 6 = 729 combinations) is almost unaffordable in this case.Therefore, we resort to the Taguchi design method [30] with the orthogonal array  27 (3 6 ), which means that only 27 scenarios have to be tested.The local search procedure is run 10 times under each of the orthogonal combinations, and the main effects plot for means is shown in Figure 3 (output by the Minitab software).As the figure suggests, the most suitable values for these parameters are  = 0.05,  = 0.6,  = 20, ℎ max = 100,   = 15, and   = 100.In particular, the desirable setting of  (relative amplitude of local perturbations) tends to be small, because over-large perturbations will make the solution leap around the search range, and it is impossible to fine-tune the solution.The reinforcement step size (), however, would better be set relatively large, which suggests that reinforcement is a proper strategy for optimizing the production rates.The influence of  (time to give up and start afresh) shows that it is unwise to backtrack too early or too late, and the searching process should have a moderate degree of tolerance for nonimproving trials.The impact of   indicates the importance of making a sufficient number of samplings around each visited solution.The best selection of   reflects the effectiveness of OCBA, which can reliably identify the promising solutions with a relatively small number of simulation replications and thus makes it possible to keep   low for saving computational time.

The Main Computational Results
. Now, we will use the proposed three-stage algorithm (TSA) to solve different-sized problem instances.The results are compared with the hybrid meta-heuristic algorithm PSO-SA [31], which uses simulated annealing (SA) as a local optimizer for particle swarm optimization (PSO).PSO-SA also relies on hypothesis test to compare the quality of stochastic solutions, which makes it comparable to our approach.Although PSO-SA was initially proposed for stochastic flow shop scheduling, the algorithm does not explicitly utilize the information about machine environments.In fact, PSO-SA can be used for almost any stochastic combinatorial optimization problem.Therefore, PSO-SA can provide a baseline for comparison with our algorithm.The implemented PSO-SA for comparison optimizes the production rates and the production schedule at the same time (by adopting an integrated encoding scheme the first  digits express machine speeds and the last  digits express job sequences).The parameters of the PSO-SA have been calibrated for the discussed problem and finally set as follows: the swarm size   = 40, the inertia weight  = 0.6, the cognitive and social coefficients  1 =  2 = 2, the flying speed limitations V min = −1.0,V max = 1.0, the initial temperature  0 = 3.0, and the annealing rate  = 0.95.
In order to make the comparison meaningful, the computational time of PSO-SA is made equal to that of TSA.Specifically, in each trial, we run TSA first (DE is allowed to evolve 500 generations) and record its computational time as CT, and then we run PSO-SA under the time limit of CT (which controls the realized number of iterations for PSO-SA).
Tables 2, 3, and 4 display the optimization results for all the test instances, which involve 10 different sizes (denoted by (, )), 3 distribution patterns (normal, uniform, and exponential), and 3 variability levels ( ∈ {0.1, 0.2, 0.3}).Each algorithm is run for 10 independent times on each instance.In order to reduce random errors, 5 instances have been generated for each considered scenario, that is, each combination of size, distribution, and variability except for exponential distribution whose variance is not independently controllable.For each instance, the best, mean, and worst objective values (under "exact" evaluation) obtained by each algorithm from the 10 runs are, respectively, converted into relative numbers by taking the best objective value achieved by TSA as reference (the conversion is simply "the current value/the best objective value from TSA").Finally, these values are averaged over the 5 instances of each scenario and listed in the tables.
Based on the presented results, we can conclude that TSA is more effective than the comparative method.The relative improvement of TSA over PSO-SA is greater when the variability level () is higher.This suggests that a multistage optimization framework is more stable than a single-pass search method in the case of considerable uncertainty.The proposed algorithm implements the optimization process with a stepwise refinement policy (from crude optimization to systematic search and then to fine-tuning) so that the stochastic factors in the problem can be fully considered and gradually utilized to adjust the search direction.In addition, TSA outperforms PSO-SA to a greater extent when solving larger-scale instances.The potential search space grows exponentially fast with the increase of job and machine numbers.The advantage of TSA when faced with large solution space is that it utilizes the specific problem information (like the properties described by Theorem 1 and Corollary 3), which promotes the efficiency of evaluating and comparing solutions.By contrast, PSO-SA performs the search process in a quite generic way without special care about the structural property of the studied problem.Therefore, the superiority of TSA can also be explained from the perspective of the No Free Lunch Theorem (according to the No Free Lunch Theorem (NFLT) [27], all algorithms have identical performance when averaged across all possible problems; the NFLT implies that methods must incorporate problem-specific information to improve performance on a subset of problems).
To provide more information, we record the computational time consumed by TSA when solving the instances with normally distributed processing/setup times and  = 0.2.The time distribution among the three stages is also shown as percentage values in Figure 4.As the results show, the percentage of time consumed by Stage 1 decreases as

Sensitivity Analysis for Cost Coefficients.
The operational cost is directly affected by the following input parameters: the variable cost coefficients   and   (these reflect the operating cost to support the workings of the factory for a unit time, for example, the unit-time fuel cost, water and electricity fees, and the hourly wage rate), and the fixed cost coefficient related with each machine   (these are related with the cost to support the normal operation of machines for a period of time, for example, the investment on automatic status monitoring and early warning systems).These parameters are fixed at constant values in the short term (so, they have been treated as inputs for our problem), but they may be changed in the long run as the firm gradually increases the investment on the production equipment and manufacturing technology.For example, when new energysaving technology is introduced into the production line, the variable cost coefficients   and   (which measure the cost incurred when a machine is working in production or setup mode for one hour) will be reduced to some extent.However, the introduction of new technology needs money; so, the question is how much investment is rational and economical for the firm to improve these long-term variables?Sensitivity analysis can provide an answer for such questions.
As an example of sensitivity analysis, we will focus on the impact of   on the setting of   .The (400, 10) instance under normally distributed processing/setup times and  = 0.2 is used in this experiment.The value of   varies from 0.01(1 +   ) to 0.01(10 +   ) (10 levels), and under each value of   , we run the proposed TSA for 10 independent times to get 10 optimized solutions (in the process of switching   , all the other input parameters are kept at their original values).For each solution  that is output by the -th execution of TSA, we calculate the average value of  among all machines as   = (1/) ∑  =1   , and we record the corresponding total cost as TMC  .Finally, we calculate the averaged  in the 10 final solutions as (  ) = (1/10) ∑ 10 =1   and the averaged total cost as TMC(  ) = (1/10) ∑ 10 =1 TMC  .The results are displayed in Figure 5.According to Figure 5(a), there is a clear rising trend in the total cost as the cost coefficient   increases.This is no surprise because   is an indicator of cost per unit time.Moreover, the slope of the regression line is 319.95, which suggests that reducing the value of   by one unit will result in a saving of 319.95 units (on average) in the total cost.Therefore, the firm should be willing to invest at most 319.95 for reducing the cost coefficient   by one (e.g., by promoting the energy efficiency of the production lines).
By observing the impact of   on the optimized production rate  (Figure 5(b)), we can obtain similar information.For example, if the value of   has been decreased by one, the optimal setting of  for each machine should be decreased by 0.1093 (on average).The underlying reason is that when the unit-time production cost decreases, the production pace does not need to be hurried to the original extent, and meanwhile, reducing  reasonably can help cutting down the fixed operational cost.
Finally, we examine the impact of the functional form used to describe the fixed operational cost.Recall that currently the fixed operational cost is defined as FOC  (  ) =   ⋅  2   , and that the square on   is used to simulate the accelerated increase of the cost with the production rate.Now, we vary the power of   from 1 to 4 and run the optimization procedure in each case.The averaged  values (calculated as earlier) for the same instance are shown in Table 5.From the results, we see that as the power increases, the optimal settings for the production rates exhibit a downward trend.In practice, the detailed functional form for the fixed cost should be specified according to the historical production data.

Conclusions
In this paper, we consider a stochastic uniform parallel machine production system with the aim of minimizing total  manufacturing cost.The production rates of the machines are adjustable; so, they are treated as decision variables to be optimized together with the detailed production schedule.
In accordance with the principle of simulation optimization, we propose a three-stage solution framework based on stepwise refinement for solving the stochastic optimization problem.The proposed algorithm is verified by its superior performance compared with another metaheuristic designed for stochastic optimization.Also, the procedure for sensitivity analysis is discussed.The future research may extend the main ideas presented here to more complicated and realistic production environments like flow shops and job shops.

Figure 1 :
Figure 1: Splitting of the original job sequence in the cases  1 = 1.0 and  2 = 1.5.

Figure 2 :
Figure 2: The impact of Stage 2 parameters.

Figure 3 :
Figure 3: The impact of Stage 3 parameters.
The percentage of time consumed by each stage

Figure 4 :
Figure 4: The computational time of TSA.
The processing time    of job  on machine  is    =   /  , where   is the operating speed of machine .(iii) Unrelated Parallel Machines (R).The processing time    of job  on machine  is    =   / , , where  , is the job-dependent speed of machine .
The processing time    of job  on machine  is identical for each machine; that is,    =   .(ii)Uniform Parallel Machines (Q).

Table 1 :
Illustration of the decoding process in DE.
2  ) and    ∼ N(  /  ,  2  ), the standard deviation is controlled by   =  ×   and   =  ×   ( ∈ {0.1, 0.2, 0.3} describes the level of variability).In the case of uniform distributions, that is,    ∼ U(  /  −   ,   /  +   ) and    ∼ U(  /  −   ,   /  +   ), the width parameter is given by   =  ×   and   =  ×   .In the case of exponential distributions, that is,    ∼ Exp(   ) and    ∼ Exp(   ), the only parameter is given by    =   /  and    =   /  .The due dates are obtained by a series of simulations which apply dispatching rules (such as SPT and EDD

Table 2 :
The computational results under normal distributions.

Table 3 :
The computational results under uniform distributions.

Table 4 :
The computational results under exponential distributions.

Table 5 :
The impact of the functional form of FOC  .