Biobjective Scheduling for Joint Parallel Machines with Sequence-Dependent Setup by Taking Pareto-Based Approach

Modern factories have been moving toward a just-in-time manufacturing paradigm. Optimal resource scheduling is therefore essential to minimize manufacturing cost and product delivery delay. This paper focuses on scheduling multiple unrelated parallel machines via a Pareto approach. The proposed strategy addresses additional realistic concerns. In particular, contingencies regarding product dependencies, machine capacity, and machine eligibility are considered. Given a list of jobs, each with a distinct resource work hour capacity, this novel scheduling is aimed at minimizing manufacturing costs while maintaining the balance of machine utilization. To this end, different computational intelligence algorithms, i.e., adaptive nearest neighbour search and modified tabu search, are employed in turn and then benchmarked and validated against a combinatorial mathematical baseline, on both small and large problem sets. The experiments reported herein were carried out in MATLAB™. The resultant manufacturing plans obtained by these algorithms are thoroughly assessed and discussed.


Introduction
With recent advances in modern intelligent manufacturing, industry has increasingly been adopting the just-in-time (JIT) strategy [1,2]. With this strategy, manufacturing cost and delivery delay are optimized by means of meticulous production planning. Among the machining approaches presently taken by modern factories, the unrelated parallel machine (UPM) [3][4][5][6] system is investigated in this study. In a UPM system, a factory consists of several machines performing the same task but taking different time durations. Examples are found in lathe shops and sawmills. Upon commencing any task or restarting an alternate one, an operator often has to prepare the machine by making appropriate configurations and settings. These include fitting new moulds, replacing equipped tools, and cleaning contaminated parts. Such activities generally incur additional cost, also known as setup cost. It is expressed in terms of spent time (and/or money), whose value may be constant or may vary with the preceding task. More specifically, a sequence-independent setup (SIS) [7,8] remains constant regardless of the previous tasks operated on the same machine, whereas a sequence-dependent setup (SDS) does not [4,5,9,10]. In addition, while some tasks can well be performed on one machine, they may be prohibitive on others. For instance, coarse milling may be performed on all available machines, while fine milling is only possible on specific ones. Another concern faced in typical industrial practice is unplanned maintenance (UM) [11] due to faulty resources, especially after a schedule has been issued.
Given a list of required productions, i.e., jobs, each with an associated resource work hour capacity, this paper is aimed at devising an appropriate JIT manufacturing plan. Without loss of generality, the UPMs involved in this study were assumed to be of the SDS type, since the SIS case can be recovered by making all setups equal. The study took into account production variables typically found in practice, e.g., the time and money required to make an initial machine configuration given the preceding tasks, machine standard time, and storage costs for completed products held to meet the requested demand in each period (all of which were integers). The optimal machine schedule was determined such that the resultant plan incurred minimum manufacturing cost while maintaining machine utilization balance. It was further assumed that, once started, a task could not be overridden or cancelled until it was fully completed. To this end, state-of-the-art computational intelligence algorithms, that is, adaptive nearest neighbour search (ANNS) [12,13] and modified tabu search (MTS) [14][15][16][17], were employed and assessed in turn. The resultant schedules for small and large problems were subsequently validated against an exhaustive baseline model. This paper focuses on biobjective unrelated parallel machine scheduling. Its emphasis is placed on minimizing manufacturing costs while maintaining the balance of machine utilization, based on Pareto optimality. Its main contribution is to remedy the prohibitively high complexity of theoretical models by employing heuristic and metaheuristic approaches, namely, ANNS and MTS. To elucidate its merit, realistic instances of JIT manufacturing environments with practical conditions were explored in simulations.
The remainder of this paper is organized as follows. Section 2 reviews the literature and related works. It provides detailed accounts and critical discussion of machine scheduling problems and state-of-the-art solutions. Section 3 describes the proposed scheme by first outlining its key processes, followed by their description and the assumptions made herein. Section 4 describes the experiments on the abovementioned algorithms, including the characteristics of the data involved and the relevant assessments. This section also demonstrates the merits of the proposed scheme by objectively reporting and discussing the resultant scheduling, based on designated performance metrics. Finally, Section 5 makes concluding remarks on the contribution of this study and its prospects.

Literature Review
This section focuses on recent research on conditional and constrained scheduling. In order to devise an optimal manufacturing plan that minimizes its overall costs, including those incurred by configuring the machines during the production process and by inventory storage, various approaches have been taken in the literature. They can be categorized into those based on solving for an exact solution of some mathematical model and on approximating one by computational intelligent techniques. The following subsections start from background on scheduling theory. After that, the definitions of parallel machine scheduling and its mathematical model are described. On solving such a model, various optimization strategies, both with single and multiple objectives, are subsequently reviewed. Finally, the recent and closely related research works to the proposed scheme are discussed.

Background on Scheduling.
Scheduling is one of the management schemes that attempts to allocate limited resources for completing a mission within given timeframe, possibly under some constraints [18]. In an industrial context, a preferable solution to this problem is determined by optimizing resources' utilization (i.e., manpower and machines) with respect to predefined objectives [6,19,20]. This paper proposed a novel solution to a scheduling problem of a manufacturing system, which was characterized as follows.
The system consists of a set of n jobs and m machines, denoted by I = {i | i ∈ [1, n]}, J = {j | j ∈ [1, n]}, and M = {m | m ∈ [1, MC]}, where i and j are the indices of jobs and m is the index of machines. The notations of key parameters involved in the analyses are listed in Table 1.
Scheduling strategies can be divided into flow shop and job shop scheduling [21]. The former consists of consecutive machines or job stations, each performing a different operation. All the jobs are processed by each machine in exactly the same sequence, called a flow, whereas in the latter group, scheduling is made for each job, whose flow consists of a unique sequence of operations. Three techniques are normally adopted for solving a scheduling problem. First, a mathematical model can be used to find an optimal schedule by means of, for instance, integer, mixed integer, or dynamic programming [22][23][24]. Objectives typically considered in the optimization are flow time, makespan, lateness, number of tardy jobs, tardiness, etc. [25][26][27]. As the problem size increases, however, these models suffer from excessive complexity. Second, dispatching rules or heuristic techniques are especially designed to reduce complexity while offering acceptable results within reasonable time. Given a set of jobs, these techniques impose one or more criteria, e.g., first come first served (FCFS), shortest processing time (SPT), longest processing time (LPT), and earliest due date (EDD) [25,28]. Techniques in this group consider objectives similar to those in the former one. Third, neighbourhood search finds a solution more efficiently, especially for larger problems, by incrementally updating the current best solution, taking into account information from its neighbours. Such methods include tabu search, simulated annealing, and genetic algorithms [14,29,30].
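The dispatching rules named above can be illustrated by a minimal sketch on a single machine. The `Job` fields, the three sample jobs, and the use of total tardiness as the reported objective are illustrative assumptions, not data from this study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    name: str
    arrival: int      # order of arrival (used by FCFS)
    processing: int   # processing time
    due: int          # due date


def dispatch(jobs, rule):
    """Return the jobs ordered by a single-criterion dispatching rule."""
    keys = {
        "FCFS": lambda j: j.arrival,
        "SPT": lambda j: j.processing,
        "LPT": lambda j: -j.processing,
        "EDD": lambda j: j.due,
    }
    return sorted(jobs, key=keys[rule])


def total_tardiness(sequence):
    """Total tardiness of a sequence on one machine that starts at time 0."""
    clock, tardiness = 0, 0
    for job in sequence:
        clock += job.processing
        tardiness += max(0, clock - job.due)
    return tardiness


# Hypothetical jobs, for illustration only.
jobs = [
    Job("A", arrival=0, processing=4, due=10),
    Job("B", arrival=1, processing=2, due=3),
    Job("C", arrival=2, processing=6, due=8),
]

for rule in ("FCFS", "SPT", "LPT", "EDD"):
    order = dispatch(jobs, rule)
    print(rule, [j.name for j in order], total_tardiness(order))
```

Each rule is a plain sort on one attribute, which is why these techniques are cheap to run but give no optimality guarantee for objectives other than the one the rule happens to favour.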
Furthermore, scheduling also differs by machine layouts. Single machine scheduling is trivial and usually adopted in decomposition of a more complex manufacturing. On the other hand, for the parallel machines, single-process jobs are released to a pool of machines working in parallel. The scheduling problem is now reduced to making two decisions, i.e., job allocation and its sequence. This type of scheduling can be further divided by the functions of associated machines, which are identical (IPM), nonidentical (NPM), and unrelated parallel machine (UPM) [3][4][5][6]. This research focuses on UPM, where each machine can process a given job at a different speed. Since a machine can process different jobs, when starting a new one, a setting up is usually needed [7][8][9][10]. Therefore, time and cost used in this operation depend on involved jobs and their sequence.

Mathematical Models of Parallel Machine Scheduling.
Given a problem of scheduling a set of n jobs (I and J) onto m parallel machines (M), as described in Section 2.1, a mathematical model attempts to plan an optimal sequence of job assignments with respect to some criteria. In the present context, two conditions are imposed on the system. That is, some jobs are not assignable to some machines, and each job must be assignable to at least one machine. In addition, since a sequence-dependent setup was assumed, a setup time S(i, j) ≥ 0, used in preparing for a job j after having completed a job i, was considered. For the first job i on a machine, no setup is involved, that is, S(0, i) = 0. Then, an optimal scheduling that minimizes the overall makespan may be posed as an integer programming problem. Following a logistics model proposed by Tong et al. [31], we adopted an objective that minimizes the total processing, storage, and setup costs for all jobs and periods, as given in Equation (1), where g_jt and C_jt are the unit processing cost and counts of job j at period t, o_jt and H_jt are the unit inventory cost and stored quantity of job j at period t, s_ij is the cost of setting up a machine for job j after i, and X_ijmt is 1 when a job j is processed right after a job i on machine m at period t, or 0 otherwise. Detailed descriptions of problem-specific constraints will be given in Section 3.1.
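Under the definitions above, the cost objective can be sketched in the following form. This is a reconstruction from the stated terms only: the symbol T for the number of periods and the exact index ranges are assumptions.

```latex
\min Z \;=\; \sum_{j \in J} \sum_{t=1}^{T} g_{jt}\, C_{jt}
       \;+\; \sum_{j \in J} \sum_{t=1}^{T} o_{jt}\, H_{jt}
       \;+\; \sum_{i \in I} \sum_{j \in J} \sum_{m \in M} \sum_{t=1}^{T} s_{ij}\, X_{ijmt}
```

The three sums correspond, in order, to the processing, storage, and sequence-dependent setup costs.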

Optimization Strategies.
With the emergence of modern computer and information technology, optimization has been incorporated into an extensive range of applications, from engineering [32,33], medicine [34], and geographical science [35] to finance [36]. In industrial engineering (IE), an optimization strategy is normally employed in production and logistics planning [18]. Enumerative search, for example, is applied to find an exact solution to scheduling or assignment problems. The most prevalent methods in this group include dynamic, linear, and quadratic integer programming [22][23][24]. Thus far, these methods are only suitable for small problems, because the problems are NP-Hard [37] and the computational effort thus grows exponentially with the problem size. To remedy this complexity issue, an approximation-optimization approach, normally called metaheuristics, has been proposed. Methods in this group approximate the solution of an NP-Hard problem by imitating the adaptation of some entities found in nature. They include simulated annealing (SA) [29], tabu search (TS) [14][15][16][17], neural network (NN) [38], genetic algorithm (GA) [31,39], and ant-colony optimization (ACO) [40]. This approach, however, is considered a black box and thus fails to give any insights into the problem at hand and its behaviour. Alternatively, experience with the problem can be incorporated into the optimization. This approach, called heuristic optimization, can be divided into two groups. With incremental generation, partial solutions are gradually built up into a final solution, while with the neighbourhood search strategy, a solution that does not violate predefined conditions is created in turn and is retained if it gives better results than the previous one. Some methods in this group worth mentioning here are the nearest insertion, Petal, sweep, cluster first route second, and route first cluster second [41][42][43].
With heuristic optimization, a solution is obtained via a process of learning by investigation, assessing the feedback, and making a decision accordingly. However, when experience is imposed on one problem, the resultant strategy may not be suitable or even applicable to unseen ones. Although heuristics do not guarantee the best solution, their key advantages are much simpler simulation and a "good enough" solution to a problem that may not be well structured.
Thanks to those preferable characteristics, the heuristic approach has been taken in many multidisciplinary engineering applications. A widely employed method, called variable neighbourhood search (VNS), was first proposed by Hansen and Mladenović [44] and applied to several complex problems [45,46]. Its main strategy is avoiding local traps by adjusting neighbourhood structures (NS) if a newly found solution is worse than an existing one or trapped in local minima. It was shown therein that the adjustment increases the likelihood of finding a better solution than local updates without one. Moreover, its implementation is simple, and it requires only a few parameters. More specifically, given a starting point Q, VNS defines a searching domain of order n ∈ [1, n_max] within its neighbourhood, NS(Q). It then randomly draws a solution Q′ within that domain (shaking). Subsequently, it searches for a local optimum Q″ around Q′. This can be done by, for example, perturbation, basic local search (BLS), or iterated local search (ILS). If this gives a better result, then the optimal solution is updated, Q ← Q″. Otherwise, the domain order is incremented, n ← n + 1.
Since it was proposed, VNS has been applied to both linear and nonlinear integer and mixed-integer programming [47].
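The VNS loop described above (shaking, local search, then either updating Q or incrementing n) can be sketched as follows. The one-dimensional integer search space, the toy cost function, and the unit-step descent used as the local search are illustrative assumptions. Resetting n to 1 after an improvement follows common VNS practice and is not stated in the description above.

```python
import random


def descend(cost, x):
    """Basic local search: move by unit steps while the cost strictly improves."""
    while True:
        candidate = min((x - 1, x + 1), key=cost)
        if cost(candidate) >= cost(x):
            return x
        x = candidate


def vns(cost, start, n_max=3, max_iter=200, seed=0):
    """Variable neighbourhood search over the integers (illustrative only)."""
    rng = random.Random(seed)
    best = start
    for _ in range(max_iter):
        n = 1
        while n <= n_max:
            # Shaking: draw a random point within the order-n neighbourhood.
            shaken = best + rng.randint(-n, n)
            # Local search around the shaken point.
            local = descend(cost, shaken)
            if cost(local) < cost(best):
                best, n = local, 1   # improvement: update Q and restart at n = 1
            else:
                n += 1               # no improvement: enlarge the neighbourhood
    return best


def toy_cost(x):
    """Hypothetical cost with local minima at every multiple of 4."""
    return abs(x - 12) + 5 * (x % 4 != 0)


print(vns(toy_cost, start=0))
```

Starting from 0, a plain unit-step descent is stuck immediately, since both neighbours cost more. The enlarged neighbourhoods are what let the search jump over the penalty ridge to the next basin.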
Tabu search is a hybrid discrete optimization strategy that relies on neighbourhood search and a tabu list (TL), used to store previous solutions. The TL, together with aspiration criteria (AC), can prevent the search from getting locked in a local trap. Its search strategy is deterministic and specified by recency and frequency conditions. In addition, the performance can be enhanced by two mechanisms, i.e., intensification and diversification. Given a search space and radius (R), a counter, a TL, and termination criteria (TC), the tabu search algorithm starts by randomly picking an initial solution, S0, within the search space. It then randomly gathers N neighbours within radius R around S0 and stores them in a set S1(R). An objective function is evaluated in turn for each point in this set, and the one that gives the best (minimum cost) solution is marked as S1. If the cost of S1 is no greater than that of S0, the previous solution, S0, is stored in the TL and the solution is updated, S0 ← S1. Otherwise, S1 is stored in the TL. The process is repeated until the TC are met, and the optimal solution is the current S0. Basic TS is rather slow and can get trapped in local minima. Adaptive tabu search (ATS), which incorporates backtracking and adaptive radius mechanisms, was proposed to alleviate these issues [48]. With ATS, after a new solution is updated, backtracking is invoked when the search is locked by a local solution. The search radius is also adapted as the search reaches convergence.
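The basic tabu search steps described above can be sketched as follows. The integer search space, the quadratic toy cost, the fixed-length tabu list, and the iteration limit used as the termination criterion are illustrative assumptions. Aspiration criteria, intensification, diversification, and the ATS backtracking mechanism are deliberately left out.

```python
import random


def tabu_search(cost, s0, radius=3, n_neighbours=5, max_iter=100,
                tabu_size=10, seed=0):
    """Basic tabu search over the integers (illustrative only)."""
    rng = random.Random(seed)
    tabu = []                          # tabu list (TL)
    current = s0
    for _ in range(max_iter):          # termination criterion: iteration limit
        # Randomly gather neighbours within the radius, skipping tabu members.
        drawn = [current + rng.randint(-radius, radius)
                 for _ in range(n_neighbours)]
        neighbours = [s for s in drawn if s != current and s not in tabu]
        if not neighbours:
            continue
        s1 = min(neighbours, key=cost)  # best neighbour of this round
        if cost(s1) <= cost(current):
            tabu.append(current)        # store the previous solution, then move
            current = s1
        else:
            tabu.append(s1)             # reject the candidate and forbid it
        del tabu[:-tabu_size]           # recency: keep only the latest entries
    return current


def toy_cost(x):
    """Hypothetical convex cost with its minimum at x = 7."""
    return (x - 7) ** 2


print(tabu_search(toy_cost, s0=0))
```

Because this basic variant only accepts non-worsening moves, it cannot leave a local minimum on its own, which is the weakness that backtracking and an adaptive radius are meant to address.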
On solving biobjective scheduling problems, this paper used both heuristic and metaheuristic methods, respectively, called adaptive nearest neighbour search (ANNS) and modified tabu search (MTS). Pareto optimality of the final production plans and their performance metrics were then validated against enumerative search.

Single-versus Multiobjective Optimization.
Single-objective optimization is aimed at finding the best solution to a problem, under specified constraints. Generally, it consists of three components, i.e., vector of decision variables, constraints, and an objective function. Unlike its single counterpart, multiobjective optimization finds, within feasible regions, the best solutions with respect to more than one objective functions, whose values may be maximized and/or minimized. As described in Section 2.1, a scheduling is an NP-Hard problem, whose best solution is not always feasible with a typical algorithm. Much research in this area thus opts for its approximation set instead. This approximation usually involves 2 processes [49,50], i.e., fitness assignment and population diversification. This is to ensure that the approximated solution is close to the exact one and is uniformly distributed, from one end of the domains to another. Several approaches were taken to assign a fitness, e.g., goal programming, vector evaluation, Goldberg or nondominated sorting, Fonseca and Fleming sorting, and accumulate ranking density strategy (AARS) [51][52][53].
In the proposed biobjective strategy, for instance, our primary objective function was to minimize all costs incurred by production. To ensure the proper distribution of decision variables, a secondary objective that aims at balancing machine utilization was incorporated.

2.5. Related Works.
Scheduling on parallel machines has attracted much interest in the past decades. A range of strategies and objectives have been proposed in the literature. Ruiz-Torres et al. [54], for instance, scheduled different parallel unrelated resources, aiming at reducing the number of late jobs. In that work, both machines with different process times and varying numbers of line staff were allocated at a given period. They divided the problems into two scenarios, which were parallel machine flexible resource scheduling (PMFRS) and unspecified parallel machine flexible resource scheduling (UMFRS). The problems were solved by a computer program. On scheduling unrelated parallel machines, Kim et al. [55] employed different objectives, taking into account both setup time and total weighted tardiness. They took a heuristic approach with two objectives, which were earliest weighted due date (EWDD) and shortest weighted processing time (SWPT), and two optimizers, namely, two-level batch scheduling (TLBS) and simulated annealing (SA). Edis and Oguz [56] studied unrelated parallel flexible machines and proposed two mathematical models, called PMFRS and UPMFRS, but aimed at minimizing the completion time. More recently, the focus has been moving onto optimization strategies. Polyakovskiy and Hallah [57] studied multistage scheduling by considering earliest weighted late jobs of parallel machines. It was observed that each job required a different process time and was to be delivered at a different due date. Therein, a mixed integer programming (MIP) method, called MASH, was employed to solve the bottleneck problem in multistage scheduling. A just-in-time (JIT) scheduling approach was taken by Kayvanfar et al. [1,58] to minimize total tardiness and the number of early completed jobs. The overall cost thus depended on whether jobs were completed earlier or later than specified due dates. Similarly, an MIP method was employed. It was assumed that there was no job insertion and that the unrelated parallel machines had different processing rates. Nonetheless, their method scaled poorly with the problem size and, without hybrid integration, is suitable for only small problems. With a similar machine condition, Zhang et al. [59] minimized weighted average tardiness by means of reinforcement learning (RL). In their experiments, release times and due dates were randomly specified, and the resulting scheduling was found to outperform all the methods being benchmarked.
In addition, a number of recent studies focusing on parallel machines, JIT approaches, multiobjective strategies, and the combinations of these areas are worth exploring here. The majority of early works assumed a single-objective strategy [60][61][62][63][64][65]. In 2014, Kayvanfar and Teymourian [60] proposed an intelligent water drop (IWD) algorithm to schedule unrelated parallel machines. It was validated on five machines, with small and large numbers of jobs. However, it did not consider setup time. This work was later extended to account for not only identical [61] but also nonidentical [63] machines. The former still followed the previous optimization strategy, whereas the latter proposed a parallel net benefit compression-net benefit expansion (PNBC-NBE) algorithm. These methods differed from their predecessor in that the former considered JIT manufacturing with controllable process time, while the latter focused on sequence-related setup times. Another similar work was proposed by Lin and Ying [62]. They focused on a new optimization algorithm, called hybrid artificial bee colony (HABC), which was compared against TS, nature inspired, and RSA optimizations. It was validated on bigger problems. Since then, a number of works have explored various metaheuristic optimization strategies [64,65] on similar problems. Thus far, these methods did not consider cycle time. Biobjective scheduling for batch processing with no setup time was considered in subsequent attempts [66,67]. Similar to the previous works, they also used metaheuristic optimization. Taking into account sequence-dependent setup time, Yepes-Borrero et al. [68] proposed minimizing both makespan and number of resources by using a greedy algorithm. On solving multiobjective scheduling problems, Kayvanfar et al. [69,70] used SA and GA as optimizers, for unrelated and identical parallel machines, respectively. Neither setup time nor cycle time was considered in those works, but the latter considered the JIT approach. Each of the abovementioned works has at least one limitation that the present study does not share. These limitations include the omission of sequence-dependent setup time, process time, and cycle time, as well as smaller numbers of objectives.

Proposed Method
This paper focuses on scheduling and its analyses on unrelated parallel machines, whose setup times were sequence dependent. Furthermore, it was assumed that assigning a job to any machine is subject to its predefined eligibility and that deliveries were made at production intervals. This total cost minimization problem was solved by using both mathematical model and computational intelligence methods. Later, Pareto strategy [71] for biobjective problem was considered, by integrating balanced machine utilization into the cost functions. Likewise, the scheduling results obtained by the proposed computational intelligence methods were compared against those by the mathematical model baseline.

Mathematical Model for Scheduling Problems.
This section describes a mathematical model used as a baseline in benchmarking. With this model, it was assumed that (1) machines were unrelated and parallel, (2) their setup times depended on production sequence, and (3) their eligibility for given jobs was prespecified. The objective of their scheduling was to minimize the makespan by using an integer programming method. The input variables were the scheduling or assignment table (X) and the quantity units of processed jobs (N) at each period. Given these variables, the completion times (C) and the inventory units stored at the end of each period (H) would then be determined. Finally, the cost function (Z) would be evaluated, given the system parameters, from the numbers of assigned, produced, and stored jobs, as expressed in Equation (1). Specifically, the system parameters were the (sequence dependent) setup times (S), the unit production costs (G), and the unit storage costs (O).
To ensure realistic scheduling, some constraints are worth considering and are described as follows.
In each production round, the quantity units of processed job j must meet the delivery demand, while the total production time must fall within the specified working hours, as expressed in Equation (2). In addition, the quantity units of job j produced on those machines in total (over all periods, t) must be a positive integer and no less than the lower lot size. Finally, a job assigned to a machine must be within its capacity. These constraints are given in Equations (3) and (4), respectively.
With this scenario, given the problem parameters, i.e., jobs, working period and hours, and machines, as well as the resources' data, i.e., production and release time, demand, machine eligibility, production and storage unit costs, and setup cost, scheduling gives an optimal binary production table (X_ijmt), as well as the corresponding quantity units of each processed job on each machine (N_jm) and the total completion time (C_j), at each period. To maintain a valid production table, all elements were asserted by auxiliary variables (A_jmt, B_jmt), as defined in Equations (5) and (6). This ensures, for instance, that a machine must be assigned at least one job, as given in Equation (7).
During the integer programming, plausible scheduling was subject to specific constraints, as follows. The quantity units of stored job j at period t, H_jt, were equal to the quantity of job j previously stored (H_jt−1) plus that currently processed on all machines, minus the amount required. Provided that a machine was able to process both jobs i and j and that job j was processed right after job i, the time at which job j was completed (C_jt) was equal to the time at which the previous job was completed (C_it), plus the sequence-dependent setup time (S_ij) and the time used to process the specific amount of that job (P_jm · N_jmt). The resulting completion time (C_jt) must be within the working hours of that period. Furthermore, when the setup and process times were subtracted from it, the result should not be earlier than the release time (R_j). These constraints apply for all i, j, m, and t with i ≠ j and j ≠ 1.
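The inventory balance and completion-time relations described above can be written out as a small sketch. The function names, the sample numbers, and the treatment of the release-time condition as a non-strict inequality are assumptions made for illustration only.

```python
def inventory_balance(prev_stock, produced_per_machine, required):
    """H_jt = H_j(t-1) + sum over machines of N_jmt - amount required."""
    return prev_stock + sum(produced_per_machine) - required


def completion_time(prev_completion, setup, unit_time, quantity):
    """C_jt = C_it + S_ij + P_jm * N_jmt for job j processed right after job i."""
    return prev_completion + setup + unit_time * quantity


def is_time_feasible(completion, setup, unit_time, quantity,
                     release, work_hours):
    """Check the working-hour and release-time conditions for one job."""
    start = completion - setup - unit_time * quantity
    # Assumption: starting exactly at the release time is allowed.
    return completion <= work_hours and start >= release


# Hypothetical values, for illustration only.
stock = inventory_balance(prev_stock=5, produced_per_machine=[10, 8], required=20)
finish = completion_time(prev_completion=12, setup=3, unit_time=2, quantity=10)
print(stock, finish)
print(is_time_feasible(finish, 3, 2, 10, release=10, work_hours=40))
```

In an integer programme these relations appear as equality and inequality constraints over the decision variables. The functions here only evaluate them for one fixed assignment.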

Modelling and Simulation in Engineering
Regarding a job sequence, only one job j could succeed another job i, and it should be processed by only one machine, and vice versa. In other words, at a given period, a job might not be distributed to different machines. Furthermore, preceding and succeeding jobs must be different, or a job could not be processed if it had just been completed on that machine. These constraints are realized by Equations (12)-(15), respectively.
To ensure utilization constraints, a machine should have at least one first job and one last, maybe different, job, as expressed in Equations (16) and (17), respectively. Likewise, a job should be assigned first and last, each time to at least one, maybe different, machine, as expressed in Equations (18) and (19), respectively. Lastly, any given job must be assigned to only one machine, as expressed in Equation (20).
To maintain integer computability, values in scheduling tables were only binary, i.e., 1 or 0, depending on whether or not the assignment of a job after completion of another one was eligible for that machine at that period. Finally, the numbers of processed and stored jobs and the process time per job must be positive integers. Since these constraints are self-explanatory, their detailed expressions are omitted.
The outputs of the scheduling process were, for a given job i at period t, (1) the number of stored jobs (H_it), (2) the quantity units of processed jobs at a machine m (N_imt), and (3) the time spent on processing the job (C_it).

Computational Intelligence Methods.
This section describes the proposed heuristic and metaheuristic methods for solving the unrelated parallel machine scheduling problem. Herein, we employed adaptive nearest neighbour search (ANNS) and modified tabu search (MTS). Their processes and formulations in the present context are provided in the following subsections.

3.2.1. Adaptive Nearest Neighbour Search (ANNS).
The ANNS starts by specifying an initial solution and a best solution table. Upon entering the updating loop, neighbouring solutions with respect to job sequence (S) and processing demand (D) were created adaptively. More specifically, the neighbours were defined by offsetting the current solution by distances of 2^0 to 2^2, multiplied by an adaptive step size. Here, i_s and i_d were the indices to the job sequence pair and processing demand tables, respectively, whose members were prepopulated by all possible combinations of the respective variables. Specifically, for a problem with four jobs, there were 4! = 24 possible sequences. The processing demand was computed for each period and product from its actual demand, minus the quantity already processed and stored in the inventory. The values were lower bounded by economic order quantities (EOQ) and the lower lot size. In addition, S and D were the created neighbours, and W_s and W_d were the adaptive steps of the job sequence and the processing demand indices, respectively. Subsequently, the objective functions were evaluated for each neighbour in turn, and the best solution was appended to the best solution table. At each iteration, solutions in this table were sorted by their objective functions. If the size of this table was greater than a predefined value, the worst solutions so far were discarded. The remaining ones were then assessed, and if there were excessive numbers of duplicated solutions, the adaptive steps (W_s and W_d) were adjusted. The best solution was chosen as the starting point of the next iteration. This ANNS process was iterated until convergence or until the number of rounds reached a specified limit.
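The ANNS loop described above can be sketched as follows. This is a simplified illustration under several assumptions that are not taken from the paper: offsets are applied in both directions, the sequence index wraps around its table while the demand index is clamped, the adaptive steps are simply incremented when at least half of the best solution table consists of duplicates, and the run stops at an iteration limit. The setup matrix, demand levels, and cost function are hypothetical.

```python
from itertools import permutations, product


def anns(cost, n_jobs, demand_levels, table_size=5, max_iter=50):
    """Simplified adaptive nearest neighbour search over two index tables."""
    sequences = list(permutations(range(n_jobs)))   # 4! = 24 for four jobs
    i_s, i_d = 0, 0                                 # current table indices
    w_s, w_d = 1, 1                                 # adaptive steps
    best_table = [(cost(sequences[i_s], demand_levels[i_d]), i_s, i_d)]
    # Offsets of 2^0, 2^1, 2^2 in both directions (direction is an assumption).
    offsets = [sign * 2 ** k for k in range(3) for sign in (-1, 1)]
    for _ in range(max_iter):
        candidates = []
        for off_s, off_d in product(offsets, repeat=2):
            n_s = (i_s + off_s * w_s) % len(sequences)            # wrap around
            n_d = min(max(i_d + off_d * w_d, 0), len(demand_levels) - 1)
            candidates.append(
                (cost(sequences[n_s], demand_levels[n_d]), n_s, n_d))
        best_table.append(min(candidates))          # best neighbour this round
        best_table.sort()                           # sort by objective value
        del best_table[table_size:]                 # discard the worst entries
        if len(set(best_table)) <= len(best_table) // 2:
            w_s, w_d = w_s + 1, w_d + 1             # many duplicates: adapt steps
        _, i_s, i_d = best_table[0]                 # continue from the best
    return best_table[0]


# Hypothetical sequence-dependent setup costs and demand levels.
setup = [[0, 2, 9, 4], [3, 0, 1, 8], [7, 5, 0, 2], [6, 1, 4, 0]]
levels = [10, 12, 14, 16]


def toy_cost(sequence, level):
    """Setup cost along the sequence plus a penalty for missing a target of 12."""
    setups = sum(setup[a][b] for a, b in zip(sequence, sequence[1:]))
    return setups + 2 * abs(level - 12)


print(anns(toy_cost, n_jobs=4, demand_levels=levels))
```

The returned tuple holds the best objective value found and the two table indices that produced it. Indexing prepopulated tables keeps each neighbour a valid permutation, at the price of a neighbourhood whose meaning depends on the table ordering.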

Modified Tabu Search (MTS).
Similarly, MTS also started by specifying an initial solution and a best solution table, but it also created a tabu table. Adaptive table indexing of neighbours was evaluated following that of ANNS. Evaluation of the objective functions and updating of the best solution table were also the same as before. However, once the best solution table had been updated, the tabu list was modified. If the best solution in the current round was no better than the previous ones, it would be added to the tabu list. This effectively enabled backtracking to existing tabu members, should the search be locked in a local minimum. Likewise, this process was iterated until convergence or until the number of rounds reached a specified limit. The diagram of the ANNS and MTS processes is illustrated in Figure 1.

3.3. Pareto-Based Approach toward Biobjective Scheduling.
It will be later demonstrated that, with a single objective function, although the production cost (Z), expressed in Equation (1), was optimized, the machine utilization was not. The machine utilization was defined as the ratio between a machine's operational and total working hours. Therefore, in the present context, another objective was to maximize the minimum makespans (U), normalized over all machines involved. To balance between production cost and resource utilization, the main contributions of this paper are to devise a biobjective model of an unrelated parallel machine scheduling problem and then to obtain its Pareto solutions. Consider a 2-objective minimization problem over Z and U. A solution X is said to dominate a solution Y if Z(X) ≤ Z(Y) and U(X) ≤ U(Y), and X yields a strictly lower value than Y for at least one of these functions. Solution X is called Pareto optimal if it is not dominated by any other feasible solution. To this end, three Pareto strategies are proposed.
(a) Pseudo-Pareto Optimal. A set of optimal production plans with respect to the primary objective, i.e., production cost, was collected. Among these plans, the one with optimal utilization was selected.
(b) Parallel Pareto Optimal. Both the primary and the machine utilization objective functions were evaluated, and a set of optimal schedules with respect to each of these objectives was collected separately. Upon convergence, the optimal one from both groups was selected.
(c) Serial Pareto Optimal. The production objective function was first evaluated, but an optimal one would be admissible in the candidate list, only if its utilization was also within an acceptable range. Again, upon convergence, the optimal one was finally selected.
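The dominance relation defined in this section can be checked mechanically. The following Python sketch is illustrative only: the function names and the sample (Z, U) pairs are hypothetical, and both objectives are treated as quantities to be minimized, as in the definition above.

```python
def dominates(x, y):
    """True if objective vector x dominates y (all objectives minimised)."""
    no_worse = all(a <= b for a, b in zip(x, y))
    strictly_better = any(a < b for a, b in zip(x, y))
    return no_worse and strictly_better


def pareto_front(points):
    """Return the points that are not dominated by any other point."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (Z, U) pairs for five candidate plans.
plans = [(100, 0.9), (120, 0.5), (110, 0.7), (130, 0.8), (100, 0.95)]
front = pareto_front(plans)
print(front)
```

Here the plan (130, 0.8) is dominated by (120, 0.5), and (100, 0.95) is dominated by (100, 0.9), so three plans remain on the front.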

Experiments.
Herein, three types of scheduling problems, namely reference, small, and large ones, were considered. For each problem, the input parameters were the number of jobs (nJob), number of periods (nPeriod), number of machines (nMachine), working hours (WorkHour), and order or lot size (LotSize). The conditional parameters were production and release time, demand, machine eligibility, and the unit costs for production, storage, and sequence-dependent setup. Following the recommendations made by Afzalirad and Rezaeian [72], the parameters were specified as listed in Table 2.
For each problem, the overall cost and scheduling time were compared among integer programming of the mathematical model (3.1), ANNS (3.2.1), and MTS (3.2.2).
To obtain the results reported as follows, the mathematical models were solved by LINGO 11.0 software, while the ANNS and MTS were implemented on MATLAB™. Both programs were run on a personal computer (PC), installed

Results and Discussion
This section reports the experimental results and relevant discussion on three scheduling methods, namely, the mathematical model, ANNS, and MTS. They were evaluated on reference, small, and large problems. The results are listed in Table 3. Note that, in the 09J02M06P case, the processing steps and time taken were extremely long and consumed excessive resources on our system. Its details were hence not included in the table (case 4*). The values in the objectives column are the global optima found in each case.
It is evident from Table 3 that as the problems became slightly more complex, for instance, from 6 to 9 jobs, the solver steps and hence the scheduling time increased exponentially. In fact, with our setting, the integer programming process took about 4 days to complete. This method is therefore not suitable for larger scheduling problems, and alternatives would be required, especially for actual JIT manufacturing. Figure 2 depicts an example of scheduling for the 09J02M03P case, while Figure 3 provides, at each period, the number of jobs being ordered (demanded) and stored in the inventory and those being processed on each of the two machines. Note from both figures that only eight real jobs appear. This is because the first job was used as a dummy to satisfy the first and last job constraints, as described in Section 3.1.
In the subsequent experiments, ANNS and MTS were benchmarked against the mathematical model on the same problems. However, because both ANNS and MTS involved uniformly random initializations, they were each executed for six runs. The objective functions and scheduling time for all those runs, the respective averages, and the differences from those obtained by the mathematical model are listed in Table 4.
Evidently, with small problems (06J02M03P), ANNS and MTS took processing time similar to that of integer programming on the mathematical model. Even when the problems became much more complicated, so that the mathematical model failed to produce a schedule within a reasonable amount of time, the computing time required by ANNS and MTS remained roughly unchanged, while giving objective values similar to those of the mathematical model. Furthermore, the deviations of their objective values from those obtained by the mathematical model were inspected more closely. The results are plotted in Figure 4. It is observed that ANNS results were more consistent and closer to the baseline in cases of fewer jobs. The opposite is true, however, for bigger problems.

Small Problems.
Small problems are those that could be completed within 100 hours. Ten problems were created following the prescriptions in Table 2, for example, 05J02M06P, 07J02M06P, and 08J03M09P. For each problem, six production cases were generated, resulting in sixty test cases in total. Figure 5 compares the results obtained by the three methods. The graph on the left plots the objective values optimized by the mathematical model, ANNS, and MTS, respectively. On the right-hand side,

Figure 4: The solutions by ANNS and MTS algorithms compared to those by the mathematical models.

and 30J10M06P, were empirically chosen, as listed in Table 5.
Because the sizes of these problems were prohibitive for the mathematical model, it was excluded from these experiments. Similarly, for each problem, four different parameters were randomly specified as per Table 2, resulting in sixty cases in total. Unlike in Section 4.2, the objective functions and scheduling time averaged over each problem are plotted in Figure 6.
The above results indicate that, with a single objective consisting of periodical production, inventory storage, and sequence-dependent setup costs, ANNS and MTS required significantly less processing time than the mathematical model. Particularly for large problems, they took just about 60 seconds on average, instead of 100 hours. Meanwhile, ANNS and MTS could schedule these problems with objective values differing from the baseline by 3-5% and 10%, for reference and small problems, respectively. The objective was highest when many jobs were scheduled on few machines, while the running time was highest when scheduling over many periods. Although their results were almost identical, closer inspection revealed that ANNS slightly outperformed MTS in terms of objectives and processing time.

Pareto Optimal Solutions to Biobjective Scheduling Problems. It is notable from Figure 2 that, at the global optimum, the utilization of the first machine was greater than that of the other one. Moreover, their utilization in the first period was much greater than that in the third. To resolve the issue of balancing production cost against resource utilization, this section describes and compares the three Pareto-based approaches proposed in Section 3.3. The evaluations reported as follows were performed on 12 problems, i.e., 06J02M03P and 06J02M06P from reference problems, eight uniformly generated cases per 05J02M06P and 06J02M09P from small problems, and 07J03M12P and 08J03M09P with a case each, also from small problems. Due to its superior performance, ANNS was chosen as the optimizer.

Pseudo-Pareto Optimal.
With the pseudo-Pareto strategy, a set (Z-List) of 10 production plans, whose Z values were minimized, was collected. Their utilization metrics (U) were then evaluated and labelled as U-List. The statistical Z-score was computed separately from the mean and standard deviation of the respective list. The optimal plan was the one that gave the best-averaged Z-score between the Z and U lists. Scheduling based on optimal primary and secondary objectives as well as on their pseudo-Pareto optimal for two sample cases, i.e., 07J03M12P and 08J03M09P, is displayed in Figures 7(a) and 7(b).
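The averaged Z-score selection can be sketched in a few lines of Python. This is a hedged illustration, not the authors' code: the names `z_scores` and `select_plan`, the sample values, the use of the population standard deviation, and the convention that a lower score is better in both lists are all assumptions. A metric to be maximized could be negated before standardizing.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Standardise values to zero mean and unit population deviation."""
    mu, sigma = mean(values), pstdev(values)
    return [0.0 if sigma == 0 else (v - mu) / sigma for v in values]


def select_plan(cost_list, util_list):
    """Return the index of the plan with the lowest averaged Z-score."""
    combined = [(a + b) / 2
                for a, b in zip(z_scores(cost_list), z_scores(util_list))]
    return min(range(len(combined)), key=combined.__getitem__)


# Hypothetical Z-List and U-List for four plans (lower is better in both).
cost_list = [100, 105, 110, 120]
util_list = [0.40, 0.20, 0.35, 0.10]
chosen = select_plan(cost_list, util_list)
print(chosen)
```

Standardizing each list first keeps the cost, which is numerically large, from swamping the utilization metric when the two are averaged.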

Parallel Pareto Optimal.
With this strategy, both Z and U were evaluated in parallel, during which 2 sets of 10 best production plans were separately compiled for the Z-List and U-List. Once converged, the optimal one that gave the best-averaged Z-score between the Z and U lists was selected. The resultant scheduling on the same problem sets and parameters is plotted in Figures 7(c) and 7(d).

Serial Pareto Optimal.
This strategy was similar to the above strategies, except that after the primary objective was evaluated for each plan, the plan was then evaluated by the secondary one. The plan with the best primary score would be stored in a candidate list only if its secondary score was higher than an acceptable threshold. The resultant scheduling on the same problem sets and parameters is displayed in Figures 7(e) and 7(f).
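The admission rule of the serial strategy, in which a plan enters the candidate list only if its secondary score clears a threshold, can be sketched as follows. The function name, the threshold value, and the sample (cost, utilization) pairs are hypothetical and serve only as an illustration.

```python
def serial_candidates(plans, threshold, keep=10):
    """Keep the lowest-cost plans whose utilisation exceeds the threshold.

    plans is an iterable of (cost, utilisation) pairs.
    """
    admitted = [p for p in plans if p[1] > threshold]
    return sorted(admitted, key=lambda p: p[0])[:keep]


# Hypothetical evaluated plans as (production cost, utilisation) pairs.
evaluated = [(100, 0.35), (105, 0.62), (98, 0.20), (110, 0.71)]
candidates = serial_candidates(evaluated, threshold=0.5)
print(candidates)
```

In this example the two cheapest plans are rejected because their utilization is too low, which is the trade-off that the serial strategy makes explicit.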
In terms of efficiency for all twelve cases, the pseudo-Pareto strategy took the least computing time, 403 seconds on average (ranging between 79 and 1021 seconds), followed by the serial and parallel strategies with 536 seconds (ranging between 92 and 1820 seconds) and 887 seconds (ranging between 164 and 2384 seconds), respectively.
To assess the performance of the proposed strategies, Box-Whisker plots of the resultant production costs in all twelve cases, based on production (left), utilization (middle), and biobjective (right) metrics, are illustrated in Figure 8. It is clearly seen that, when aiming at minimizing production cost (Z), the primary objective was generally low, while maximizing machine utilization (U) raised it slightly and, particularly in the parallel cases, to a greater extent (e.g., at almost 100%). On the other hand, scheduling with the biobjective (Z and U) model resulted in balanced cost versus utilization, while maintaining relatively low overall costs. Nonetheless, a marginal decrease in this balance is noticeable with the parallel Pareto strategy.
Similar analyses can be made regarding machine utilization. In Figure 9, Box-Whisker plots of the resultant machine utilization in the same cases are plotted. It is clearly seen that, with utilization being optimized, the values tended to be a little higher than when focusing primarily on production cost. However, the biobjective model balanced them well, especially when taking the pseudo-Pareto-based approach.
Further analyses on individual cases are reported in Figure 10. They revealed that, for small problems, both metrics remained similar regardless of the approach taken. As problems got larger, however, the pseudo- and parallel Pareto-based approaches resulted in lower production cost and higher utilization, respectively. The graphs also suggest that the serial Pareto-based approach was the best compromise on these metrics.

Table 2, resulting in 60 instances being analyzed. Likewise, for large problems, a total of 15 cases, each with 6 production parameters, resulted in 90 instances being considered. Finally, on evaluating Pareto optimization, a total of 12 instances were drawn from reference and small problems. Accordingly, there were in total 153 different scheduling instances involved in the above experiments.
Nonetheless, to verify that the number of instances was sufficient to draw conclusions, an experiment on additional cases, i.e., 06J04M10P, 06J04M15P, 09J04M15P, and 09J06M10P, each with 4 parameter settings, was performed by using the ANNS and MTS algorithms in turn. Their resultant objectives were computed and then averaged over a given problem. The Box-Whisker plots comparing both algorithms and the four cases are illustrated in Figure 11.
It is evident that, regardless of manufacturing parameters and algorithms, each problem case exhibited similar objectives, i.e., with deviations as small as about 0.6-3.5%. This implies that the performance is dependent only on the problem sizes but not on variations over instances. The problem

Conclusions
In unrelated parallel machine (UPM) scheduling, the involved resources are prescribed with varying production conditions and constraints, e.g., processing and job sequence-dependent setup times and machine eligibility. Theoretically, such problems may be solved to optimality by integer programming, given a mathematical objective model, but with NP-hard complexity. This approach is thus not practical in actual JIT manufacturing environments, except for preliminary assessment on very small problems. To remedy the complexity issue, this paper proposed heuristic and metaheuristic approaches, called ANNS and MTS, for this scheduling problem. It was demonstrated in the experiments that both ANNS and MTS yielded production costs that were less optimal than those of integer programming on the mathematical model by merely 3-10% on average, in reference and small problems. Nonetheless, their computing performance was significantly better. They could complete all the problems well under the one-minute mark, whereas their counterpart would take days or, for larger problems, be unable to compute at all. Moreover, we also demonstrated that primarily optimizing production cost, which consisted of process time, setup time, and inventory cost, could inevitably result in unbalanced resource utilization. To resolve this issue, we formulated a biobjective UPM scheduling model in finite integer domains. These objectives involved minimizing production cost and machine utilization under various prescribed resource and production conditions. The final decision was the policy assigning a sequence of jobs to UPMs and the associated production quantity for a given period. To this end, we proposed the Pareto-based approaches, i.e., pseudo, parallel, and serial ones, where both production cost and utilization metrics were considered. Among these variations, the differences were indiscernible for small problems.
However, as problems got larger and more complex, the pseudo-Pareto-based approach appeared to be the best compromise between both metrics. In terms of complexity, the parallel Pareto strategy needed the greatest time to compute, followed by the serial and pseudo variants, respectively. Their final solutions were, however, indiscernible in the studied problems. In perspective, the developed metaheuristic schemes could be used to solve large problems. It was demonstrated in the experiments that optimizing small problems by ANNS and MTS was highly efficient. They gave solutions with deviations from the optimal ones as small as 3.69% and 4.51%, respectively, on average. As the problems became larger, both algorithms gave inferior solutions, but by no more than 5%, compared to those solved by a much more time-consuming mathematical model. Specifically, depending on the initial condition, the computation times were brought down from hundreds of hours to a matter of minutes, while both optimizers converged to similar outcomes. Furthermore, the proposed Pareto approaches to biobjective models allowed simultaneously minimizing both production cost and machine utilization. In practice, however, it may be preferable to maximize utilization while keeping the cost low. To this end, adjustment to the proposed framework is trivial. However, a further constraint on the utilization being lower than, say, 80% for all periods is suggested, to prevent overload.
Unlike a previous work [73], which posed the problems on a single machine as a traveling salesman problem (TSP), our work tackled them on multiple machines as a vehicle routing problem (VRP). Compared with [74], where no setup time and stochastic process time were assumed, our model relieved these constraints and considered deterministic process time with SDS. As such, our model could be extended to precedence job shop scheduling, where conditions on previous and next jobs are specified. The developed model is applicable to both single and parallel machine environments. In the case of a single machine, the setup time for a prohibited chain, e.g., from job j to k, on a machine, m, could be set to an extremely high cost so that it will be discarded during the optimization. However, this technique may fail if the optimizer inserts a job in between, i.e., j, m, then k, in which case multilevel SDS may be needed. For parallel machines, since our method computes the completion time for each job, the order of a chain can be directed by setting its cost based on the number of completed items for each job. For example, in a case where job k must succeed j, unless the number of items processed by job k is less than that by job j, its cost will be set to some high value, to avoid being selected.
Other future directions worth pursuing include investigation of soft-computing schemes such as the convolutional neural network (CNN) and validation under more realistic conditions, e.g., taking into account unscheduled maintenance and ad hoc job insertion.

Data Availability
The scheduling problem in Microsoft Excel (.xlsx), results obtained from the mathematical model in LINGO (.lgr), and simulation results in MATLAB data (.mat) formats used to support the findings of this study are available from the corresponding author upon request.

Figure 11: Box-Whisker plots comparing the objectives that resulted from ANNS and MTS in four different cases.

from the Institute of Engineering and Centre for Scientific and Technological Equipment (CSTE) for their contribution, critiques, advice, and support in preparing and conducting this study.