A Recovery Model for Production Scheduling: Combination of Disruption Management and Internet of Things

When an unanticipated disruption occurs after a subset of tasks has been finished in production scheduling, it is difficult to generate a new schedule that effectively minimizes the negative impact. In such cases, continuing with the original schedule may be neither optimal nor feasible. Based on disruption management and the Internet of Things (IoT), this study designs a real-time status analyzer to identify the disruption and proposes a recovery model to deal with it. The computational results indicate that our algorithm is competitive with existing heuristics. Furthermore, because it accounts for the tradeoff among all participators involved in production scheduling (mainly customers, managers of the production enterprise, and workers), our model is more effective than total rescheduling and right-shift rescheduling.


Introduction
Providing customers with satisfactory service is the central concern for production enterprises. However, given the complexity of processing environments, random or unanticipated events, including abnormal status of the processing environment and machine breakdown, constantly occur in production scheduling. As a result, the original schedule is likely to be suboptimal and may not even be feasible. In this study, we use the term "disruption" to describe the events that trigger a rescheduling process.
Research on rescheduling is extensive [1,2]. Our model differs from existing work in how the new schedule is evaluated. The objective of most rescheduling research is to find an optimal schedule in the new environment with respect to the original objective function. In our model, we consider both the original objective function and the deviation from the original schedule. The reason for considering the deviation is that once the original schedule is generated, many preparations have already been made, such as allocating resources, distributing raw materials, and fixing customer delivery dates. Hence, if a change to the original schedule would have a negative impact on these preparations, the new schedule has to minimize the deviation [3].
In this paper, we combine disruption management and IoT to deal with the disruption. IoT aims to share information about things globally from any place, which makes it possible to monitor the production scheduling process and identify the disruption quickly. Furthermore, disruption management aims to dynamically revise the original schedule and obtain a new schedule that reflects the constraints and objectives of the evolved environment [4].
The rest of the paper is organized as follows. Section 2 provides a brief review of the related work. Section 3 designs a real-time status analyzer based on IoT to identify the disruption. Section 4 constructs a recovery model that combines a quantitative model with prospect theory to measure the deviation, and Section 5 proposes an improved ant colony optimization (IACO) algorithm to obtain the new schedule quickly. Section 6 demonstrates the validity of the proposed model and IACO. Finally, Section 7 provides conclusions and directions for future research.

Related Work
Determining how to handle disruption is a significant issue in the decision-making process. Disruption management is a form of real-time rescheduling, which is necessary in some cases because a disruption is unanticipated and cannot be accounted for when the original schedule is made in advance. Through disruption management, a new schedule that minimizes the negative impact can be obtained.
Disruption management was first applied to the airline industry, in which flight disruptions often entail huge costs [5,6]. The successful implementation of the idea led to a growing interest in applying disruption management to other fields. In the study of production scheduling, Lee and Yu [7] studied the parallel-machine scheduling problem in a disruptive environment by minimizing the sum of weighted completion times and provided a pseudopolynomial time algorithm to solve the problem. Tang and Zhang [8] investigated the disruption caused by machine breakdown and proposed a Lagrangian relaxation approach to deal with the disruption. Wang et al. [9] discussed parallel-machine scheduling problems with a deteriorating maintenance activity and provided a polynomial time algorithm to solve the total completion time minimization problem. Liu et al. [10] studied the disrupted single-machine scheduling problem. The deviation was measured by the completion times of jobs, and quantum-inspired hybrid heuristics were presented. Khedlekar et al. [11] studied a flexible managerial decision policy for disruption in a production system. They solved the disruption problem analytically to determine the production period before and after disruptions. Paul et al. [12] developed a recovery model and a dynamic solution approach to deal with disruptions in a production-inventory system. Sarker et al. [13] focused on production scheduling under uncertain disruption and adopted an improved memetic algorithm to solve the problem.
Another stream of literature relevant to our research is the recently emerging subject of behavioral operations management [14]. The operational contexts primarily investigated were supply chain management, the newsvendor problem, forecasting, and decision-making, and the field is concerned with theories of individual decision-making biases, social preferences, and cultural elements [15]. Su [16] proposed a decision framework of bounded rationality applied to the classic newsvendor model. Gino and Pisano [17] explored the theoretical and practical implications of incorporating behavioral and cognitive factors into models of operations. Chen et al. [18] presented a model of bounded rationality based on the quantal response equilibrium, in which players were not perfect optimizers and faced uncertainty in the actions of their opponents. Katsikopoulos and Gigerenzer [19] briefly discussed some research on heuristics in behavioral operations management and showed how the study of quantitative models of heuristics can complement it. Chen and Zhao [20] developed a behavioral model with different mental weights on the underage and overage costs to characterize the perception bias of a retailer on a critical fractile.
In summary, identifying the disruption has largely been neglected in the previous literature because researchers focused on obtaining the revised schedule quickly. In addition, existing studies assumed that humans in production scheduling were perfectly rational and emphasized the optimization of material and financial resources. They disregarded the facts that people facing a new situation often have different perceptions and that the obtained solution may not be optimal or feasible. Accordingly, how to identify the disruption and measure the deviation still needs further study.

A Real-Time Status Analyzer Based on IoT
In production scheduling, the ambient conditions change subtly all the time. However, re-examining and adjusting the schedule as a whole is worthwhile only when the current schedule no longer works properly. Thus, a real-time status analyzer based on IoT is designed and developed to identify the disruption before adopting the recovery model.

System Architecture.
The status data are sent to computer terminals (CTs) on the Internet via wireless sensor networks (WSNs) and satellite. The data processing center analyzes the data and raises an alert when a disruption occurs. The system provides the following three functions.
(1) Data Acquisition. Jobs in the shop are frequently sensitive to temperature, humidity, and pressure. Thus, the status of the processing environment needs to be monitored in real time. All data on the basic properties and environmental status of jobs are sent to the radio frequency identification (RFID) reader via the WSNs. Finally, this information is summarized and sent to the corresponding CTs.
(2) Exception Analysis. Based on the initial schedule and the characteristics of the jobs, thresholds for equipment status, shop temperature, humidity, and pressure are preset. Thus, the massive time-varying data can be monitored and analyzed in real time to determine whether any exception occurs.
(3) Real-Time Alert. When the monitored data reveal an exception, a disruption is deemed to have occurred and the system issues an alert immediately, so that the original schedule can be revised and a new schedule that minimizes the negative impact can be obtained.
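The exception analysis and alert functions above can be sketched as a simple threshold check. The snippet below is only an illustration under assumed names: the `Threshold` type, the sensor keys, and the numeric ranges are hypothetical and are not taken from the system described here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Threshold:
    """Admissible range for one monitored quantity (hypothetical structure)."""
    low: float
    high: float


# Hypothetical preset thresholds; real values would depend on the jobs in the shop.
THRESHOLDS = {
    "temperature_c": Threshold(15.0, 30.0),
    "humidity_pct": Threshold(30.0, 70.0),
    "pressure_kpa": Threshold(95.0, 105.0),
}


def find_exceptions(reading, thresholds=THRESHOLDS):
    """Return the names of monitored quantities whose value leaves its preset range."""
    return [
        name
        for name, limit in thresholds.items()
        if name in reading and not (limit.low <= reading[name] <= limit.high)
    ]


def is_disruption(reading, thresholds=THRESHOLDS):
    """A reading signals a disruption as soon as at least one exception is found."""
    return bool(find_exceptions(reading, thresholds))
```

In such a sketch, a reading that triggers `is_disruption` would be the point at which the recovery model is invoked.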

Redundant Data Processing.
RFID signals are prone to being absorbed and disturbed by the ambient conditions, which results in omissions or deficiencies in the reading process of the reader. To avoid such problems, this study proposes increasing the number of readers or the read-write frequency, thereby ensuring data integrity. However, this inevitably produces a significant amount of redundant data, which increases the burden on RFID data transmission and storage and greatly reduces system efficiency. Therefore, an effective way of processing redundant data is needed to screen all data and reduce the burden on the system.
The basic sorted-neighborhood method (SNM) is an efficient algorithm for coping with redundant data. However, it is difficult to control the size of its smoothing window. This study presents a modified SNM algorithm with the following procedure.
Step 1. The data are sorted by timestamp.
Step 2. In the data comparison process, a smoothing window with adaptive adjustment is used, and the window is resized as needed.
Step 3. The first data entry in the current window is compared with the next, and whether it is repeated is determined. If it is repeated, Step 4 follows; otherwise, Step 5 is next.
Step 4. When a repeated entry is found, the entry with the older timestamp is deleted and the one with the latest timestamp is retained.
Step 5. Whether the first entry in the window is the last entry is determined. If it is the last, Step 6 follows; otherwise, Step 3 is next.
Step 6. Whether any unprocessed data exist is determined. If so, Step 1 follows; otherwise, the entire workflow ends.
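The steps above can be sketched in a few lines of code. The sketch rests on two assumptions that the procedure does not state: two entries count as repeated when they carry the same tag identifier, and the adaptive window is modeled as a time span (`max_gap`), so that the number of entries it covers varies with the read rate. The `Reading` type and all field names are hypothetical.

```python
from typing import NamedTuple


class Reading(NamedTuple):
    """One RFID read event (hypothetical record layout)."""
    tag_id: str
    timestamp: float
    value: float


def deduplicate(readings, max_gap=2.0):
    """Drop repeated reads, keeping the latest entry of each run of duplicates."""
    ordered = sorted(readings, key=lambda r: r.timestamp)  # Step 1
    kept = []
    for i, first in enumerate(ordered):
        superseded = False
        j = i + 1
        # Steps 2-3: the window covers every later entry within max_gap of `first`,
        # so its size adapts to how densely the readings arrive.
        while j < len(ordered) and ordered[j].timestamp - first.timestamp <= max_gap:
            if ordered[j].tag_id == first.tag_id:
                superseded = True  # Step 4: the older entry is deleted
                break
            j += 1
        if not superseded:
            kept.append(first)
    return kept
```

Calling `deduplicate` once per incoming batch would play the role of Step 6, which restarts the procedure while unprocessed data remain.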

Recovery Model for Production Scheduling
Considering its wide application in production and manufacturing industries, the Job-Shop Scheduling Problem (JSP) is chosen as an example to construct the recovery model.

Problem Definition.
The problem is confined to the following conditions. A set of jobs is given; each job consists of a sequence of operations and must be processed on a set of machines. The objective is to find a schedule that completes all jobs in minimal time, where (1) each machine can process only one job at a time, (2) every job is available for processing at time 0, and (3) once processing is initiated, an operation must be completed on the machine without interruption.

Notations
The model uses the following quantities:
the number of jobs;
the number of machines;
the completion time of each job on each machine;
the processing time of each job on each machine;
the due date of each job;
a large positive number.
In this model, the objective function (2) aims to minimize the makespan, that is, the time needed to complete all jobs. Formulas (3) and (4) are the technological and processing constraints. Formula (5) ensures that all jobs are completed before their due dates.
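As a hedged illustration, a standard disjunctive formulation that matches this description can be written as follows. All symbols are notation assumed here for readability and may differ from the authors' own: $n$ jobs, $m$ machines, completion time $c_{ik}$ and processing time $p_{ik}$ of job $i$ on machine $k$, due date $d_i$, a large positive number $M$, a given indicator $a_{ihk}$ that equals 1 when job $i$ must visit machine $h$ before machine $k$, and a decision variable $x_{ijk}$ that equals 1 when job $i$ precedes job $j$ on machine $k$.

```latex
\begin{align*}
\min\quad & C_{\max} = \max_{1 \le i \le n}\ \max_{1 \le k \le m} c_{ik} \\
\text{s.t.}\quad
& c_{ik} - p_{ik} + M\,(1 - a_{ihk}) \ge c_{ih},
  && i = 1,\dots,n;\ h, k = 1,\dots,m, \\
& c_{jk} - c_{ik} + M\,(1 - x_{ijk}) \ge p_{jk},
  && i \ne j;\ i, j = 1,\dots,n;\ k = 1,\dots,m, \\
& c_{ik} \le d_i,
  && i = 1,\dots,n;\ k = 1,\dots,m, \\
& c_{ik} \ge p_{ik}, \quad a_{ihk},\ x_{ijk} \in \{0, 1\}.
\end{align*}
```

In this sketch the first constraint plays the role of the technological constraint (the operation order within a job), the second that of the processing constraint (one job at a time per machine, together with $x_{ijk} + x_{jik} = 1$), and the third that of the due-date requirement.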

Construction of the Recovery Model.
While the optimal schedule is being executed, a disruption occurs and is identified by the real-time status analyzer. Consequently, we need to construct the recovery model and obtain a new schedule.
Machine breakdown is the typical disruption in production scheduling. In this paper, the recovery model is illustrated by the example of machine breakdown.

Problem Hypothesis
(1) The original schedule is known.
(2) The time when disruption ends is regarded as 0.
(3) The job being processed when the disruption occurs must be reprocessed after the disruption ends. In other words, jobs are uninterruptible.

Notations
The recovery model uses the following additional quantities:
the number of unprocessed jobs;
a subset of unprocessed jobs;
the set of operations processed before a given operation in the original schedule for a given machine;
the set of operations processed after a given operation in the recovery schedule for a given machine;
the cardinality of a subset;
the completion time of each job in the original schedule and in the recovery schedule;
the makespan in the original schedule and in the recovery schedule;
the sequence deviation between the original schedule and the recovery schedule;
the unsatisfaction degree of each customer for completion time deviation, of managers for operational cost deviation, and of workers for sequence deviation;
the parameters related to gains and losses for customers, managers, and workers, respectively;
the loss aversion coefficients for customers, managers, and workers, respectively.
The other notations are the same as the ones in previous sections.

Impact of Disruption.
The general process of one schedule is as follows: (i) managers of production enterprises generate an optimal schedule according to the requirements of customers, (ii) workers execute the schedule to process jobs, and then (iii) customers receive their products within the required time. The impact of disruption on these three participators (i.e., managers, workers, and customers) is analyzed below.
(1) Managers. After a disruption occurs, the processing sequence of unprocessed jobs changes, which increases the makespan. The makespan determines the operational cost, which is the central concern of managers. Accordingly, minimizing the makespan of unprocessed jobs is the key objective of managers when generating the recovery schedule.
(2) Workers. The processing sequence of unprocessed jobs changes after a disruption occurs. Because many preparations have been made in advance based on the original schedule, workers have to reallocate raw materials and tools, which creates extra workload. Therefore, the recovery schedule should minimize the sequence deviation to reduce the negative impact on workers.
(3) Customers. Once the environment is interrupted by a disruption, a chain reaction affects the completion times of unprocessed jobs, and a number of customers will not be served within the required time. Thus, the recovery schedule should reduce the deviation of completion time to improve the satisfaction and loyalty of customers.

Function of Measuring Deviation.
Prospect theory [21] is one of the most important decision-making theories. It describes decision-making under bounded rationality and uncertainty well. Hence, on the basis of prospect theory, this paper measures the deviation of participators [22].
(1) Measuring the Deviation of Customers. The value function of each customer takes as its reference point the completion time of that customer's job in the original schedule, that is, the maximum of the job's completion times over all machines. A completion time in the recovery schedule later than this reference point means losses (a negative value) for the customer; otherwise, it means gains (a nonnegative value). The unsatisfied membership function of the customer for completion time deviation is defined on the basis of this value function.
(2) Measuring the Deviation of Managers. The value function of managers takes the makespan in the original schedule as its reference point. A makespan in the recovery schedule larger than this reference point means losses (a negative value) for managers; otherwise, it means gains (a nonnegative value). The unsatisfied membership function of managers for operational cost deviation is defined on the basis of this value function.
(3) Measuring the Deviation of Workers. The value function of workers takes 0 as its reference point because no sequence deviation exists in the original schedule. In the recovery schedule, the sequence deviation is obtained by a double summation, and a positive sequence deviation means losses (a negative value) for workers. The unsatisfied membership function of workers for sequence deviation is defined on the basis of this value function, with a threshold computed from the workers' parameters.
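To make the role of the reference point concrete, the following sketch implements the standard Tversky-Kahneman value function for a quantity where smaller is better (completion time, makespan, or sequence deviation). The parameter names and the default values 0.88 and 2.25 follow the settings used later in the numerical example. The mapping from a negative value to an unsatisfaction degree in [0, 1] by clipping is an assumption made purely for illustration; the membership functions of this paper may differ.

```python
def prospect_value(outcome, reference, alpha=0.88, beta=0.88, loss_aversion=2.25):
    """Prospect-theory value of an outcome where exceeding the reference is a loss."""
    gain = reference - outcome  # positive when the outcome beats the reference point
    if gain >= 0:
        return gain ** alpha
    return -loss_aversion * (-gain) ** beta


def unsatisfaction_degree(outcome, reference, **params):
    """Assumed membership: 0 for gains, magnitude of the loss value capped at 1."""
    value = prospect_value(outcome, reference, **params)
    return min(1.0, max(0.0, -value))
```

Under this assumed clipping, the degree reaches 1 once the loss exceeds `(1 / loss_aversion) ** (1 / beta)`, because that is the loss at which the value function equals -1.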

Recovery Model.
On the basis of the above functions of measuring deviation, the objective function of the recovery model is constructed through lexicographic multiple goal programming. Formula (12) aims to minimize the sum of the unsatisfaction degrees of customers, managers, and workers. The three priority factors (indexed 1, 2, and 3) represent the different priorities, which can be flexibly adjusted in practical applications.
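The effect of lexicographic priorities can be illustrated by comparing candidate schedules level by level: a lower-priority goal only breaks ties among schedules that are equal on every higher-priority goal. The sketch below assumes a priority order of customers, then managers, then workers; this order, the dictionary layout, and the function name are hypothetical choices, not part of the model definition.

```python
def select_schedule(candidates, priority_order=("customers", "managers", "workers")):
    """Pick the candidate whose unsatisfaction degrees are lexicographically smallest.

    `candidates` maps a schedule name to a dict with the keys "customers"
    (one degree per customer), "managers", and "workers".
    """
    def key(item):
        _, degrees = item
        totals = {
            "customers": sum(degrees["customers"]),
            "managers": degrees["managers"],
            "workers": degrees["workers"],
        }
        # Tuples compare element by element, which is exactly a lexicographic order.
        return tuple(totals[group] for group in priority_order)

    return min(candidates.items(), key=key)[0]
```

Passing a different `priority_order` reflects the remark that the priorities can be adjusted in practice.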

IACO for the Recovery Model
The proposed model is NP-hard, so an optimal solution is difficult to obtain. Ant colony optimization (ACO) is an evolutionary computation technique developed by Colorni et al. [23]. Compared with other heuristics, ACO has the characteristics of positive feedback, distributed computation, and a constructive greedy heuristic. In particular, ACO has been shown to be an efficient algorithm for solving NP-hard problems [24][25][26]. However, the algorithm still tends to become trapped in local optima and searches slowly. In this study, we present IACO to solve the recovery model.

Introduction of Adjusting the Pheromone Trail
(1) In ACO, the pheromone trails left by ants do not always indicate the evolutionary direction, and pheromone on paths that deviate from the optimal solution may increase, which leads to premature convergence. This study enhances the global search capability of ACO by adjusting the pheromone trail adaptively during the evolutionary process.
(2) The updated pheromone trail may reach its maximum or minimum value after a search pass is completed. A trail at the maximum value leads to premature convergence, and a trail at the minimum value increases the computational time. Thus, IACO limits each pheromone trail to the interval between the minimum and maximum values. After the pheromone trails are updated, a trail is replaced by the maximum value when it exceeds the maximum, or by the average of the minimum and maximum values when it falls below the minimum.
(3) ACO has difficulty obtaining the optimal solution because the trail persistence is fixed. The smaller the trail persistence is, the better the global search capability will be; however, the computational time increases gradually. Conversely, increasing the trail persistence reduces the computational time, but the algorithm can easily be trapped in a local optimum. Therefore, IACO adopts a dynamic trail persistence rather than a fixed value.
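The bounding rule in (2) and the dynamic trail persistence in (3) can be sketched as follows. The names `tau_min`, `tau_max`, and `persistence` are assumed here, and the linear schedule for the persistence is only one possible choice: the text states that the value is dynamic but does not say how it changes.

```python
def update_pheromone(trails, deposits, persistence, tau_min, tau_max):
    """Evaporate, deposit, and then bound every trail as described in rule (2)."""
    midpoint = (tau_min + tau_max) / 2.0
    updated = {}
    for edge, tau in trails.items():
        tau = persistence * tau + deposits.get(edge, 0.0)
        if tau > tau_max:
            tau = tau_max   # cap trails that grow too large
        elif tau < tau_min:
            tau = midpoint  # reset weak trails to the middle of the interval
        updated[edge] = tau
    return updated


def persistence_schedule(iteration, total_iterations, start=0.5, end=0.9):
    """Hypothetical linear schedule: small persistence early, larger persistence late."""
    if total_iterations <= 1:
        return end
    fraction = iteration / (total_iterations - 1)
    return start + (end - start) * fraction
```

Starting with a small persistence favors global search in early iterations, and raising it later speeds up convergence, which mirrors the tradeoff described in (3).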

Approach of the Crossover and Mutation.
In genetic algorithms (GAs), the crossover and mutation operations increase the diversity of the population and help search the solution space thoroughly. Thus, these two operations are adopted in IACO to avoid premature convergence.
(1) Crossover Operation. In this paper, a schedule is represented by an encoding structure in which each component represents the sequence of operations of one job. Each entry solu in a component is an unrepeated integer in the interval [1, w].
When the search becomes trapped in the local optimal solution, crossover operation is conducted as follows: (iii) Update the encoding of optimal solution.
(2) Mutation Operation. Mutation plays a significant part in improving the diversity of the population. As in GA, mutation in IACO is designed to avoid premature convergence and obtain a better solution, here by significantly decreasing the pheromone trails on some paths of the locally optimal route. These paths are selected with small random probabilities because too many mutation operations would destroy the distribution of the pheromone trails.

Combination with Other Heuristics.
ACO has the characteristic of strong coupling, which makes it easy to combine with other heuristics. The exchange mechanism was originally applied to the vehicle routing problem with time windows. It obtains a new solution from the current solution by exchanging nodes. We adopt a neighborhood exchange mechanism to improve the convergence speed of ACO significantly.
(1) Operators of the Neighborhood Exchange Mechanism. The following two neighborhood operators are used in this study.
(i) Random Swaps of Subsequences. This operator randomly selects two distinct subsequences of the solution sequence and swaps them. Figure 1 shows an example of this operator.
(ii) Random Insertions of Subsequences. This operator randomly selects two distinct subsequences and puts the first in front of the second. An example is shown in Figure 2.
(2) Rules of the Neighborhood Exchange Mechanism.
(i) To avoid generating too many infeasible solutions, only operations on the same machine can be exchanged.
(ii) To avoid destroying the distribution of the previous pheromone trails, one neighborhood operator is chosen randomly from the above operators and applied once to the current solution.
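The two operators and the rule of applying one randomly chosen operator once can be sketched as below. For simplicity the sketch treats each subsequence as a single element of the operation sequence of one machine, which is an assumption; the function names are hypothetical.

```python
import random


def random_swap(sequence, rng):
    """Swap the elements at two distinct random positions."""
    i, j = rng.sample(range(len(sequence)), 2)
    neighbor = list(sequence)
    neighbor[i], neighbor[j] = neighbor[j], neighbor[i]
    return neighbor


def random_insertion(sequence, rng):
    """Remove the element at position i and reinsert it directly in front of element j."""
    i, j = rng.sample(range(len(sequence)), 2)
    neighbor = list(sequence)
    item = neighbor.pop(i)
    # After the pop, the element originally at j sits one slot earlier when i < j.
    target = j - 1 if i < j else j
    neighbor.insert(target, item)
    return neighbor


def neighborhood_move(sequence, rng=None):
    """Apply exactly one randomly chosen operator to the sequence of one machine."""
    rng = rng or random.Random()
    operator = rng.choice([random_swap, random_insertion])
    return operator(sequence, rng)
```

Both operators return a new list and leave the input untouched, so a move can be discarded if the resulting schedule turns out to be worse.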

Numerical Examples
In this section, we report computational experiments to validate the effectiveness of IACO and the recovery model presented in the previous sections.

Computational Experiment for IACO
(1) Computational Results. IACO is tested on classical JSP benchmark sets, namely TA [27] and DMU [28]. For each instance, the proposed algorithm is independently executed 10 times to compute the average value. We then compare the performance of IACO with that of other heuristics, including i-TSAB [29], GES [30], and AlgFix [31]. The results are shown in Table 1.
(2) Comparison among Different Heuristics. The comparison obtained from the above results can be summarized as follows: (i) Compared with i-TSAB, IACO obtains better or close solutions in 8 out of 10 problems (80%).
The comparison indicates that IACO is competitive with the existing heuristics. Moreover, it improves the best known solutions for a number of instances, especially large ones such as DMU61 and DMU77. Note that the parameter settings affect the overall performance of IACO, so the solutions obtained by our algorithm could be improved further with more careful parameter tuning.

Computational Experiment for the Recovery Model
(1) Case Description. The given job shop has 6 jobs and 6 machines. The parameters of each job are shown in Table 2. Figure 3 illustrates the original schedule, whose makespan is 55 h.
(2) Computational Results. After 10 h, a disruption suddenly occurs and lasts for 3 h. The remaining jobs, including the job being processed when the disruption occurs, must be reprocessed after the disruption ends.
Following Tversky and Kahneman [32], we set the gain and loss parameters to 0.88 and the loss aversion coefficient to 2.25. By adopting our model, total rescheduling, and right-shift rescheduling [33], we obtain the results shown in Figures 4, 5, and 6, respectively. Table 3 shows the results from the different approaches.
(3) Comparison among Different Approaches. The results obtained from Table 3 can be summarized as follows:
(i) From the aspect of the deviation of customers, the result of our model is much better than the results of the other two approaches. In other words, our model plays an obvious role in reducing the unsatisfaction degree of customers.
(ii) From the aspect of the deviation of managers, our model, total rescheduling, and right-shift rescheduling obtain the same result. Therefore, our model is not worse than the other two approaches in reducing operational cost.
(iii) From the aspect of the deviation of workers, right-shift rescheduling obtains the best result based on the original schedule. The results of our model and total rescheduling are relatively poor.
In summary, compared with right-shift rescheduling, our model increases the deviation of workers. By contrast, our model decreases the deviation of customers and thus significantly improves customer satisfaction. Furthermore, a production enterprise depends on improving customer loyalty to expand its influence and attract new customers. Such improvement helps enhance the potential profit and promotes the development of the enterprise in the long run. Therefore, our model is more reasonable and valid than the other two approaches.

Conclusions
This study designs a real-time status analyzer to identify the disruption and provides a recovery model to measure the deviation in production scheduling by combining disruption management with IoT. It aims to identify the disruption effectively and obtain a new schedule that minimizes the negative impact on the participators (customers, managers, and workers). IACO is presented to solve this model.
To validate the effectiveness of our model in handling the disruption, we use an example to test different approaches. The computational results indicate that our model is competitive with existing rescheduling approaches and can obtain a relatively satisfactory schedule.

Figure 4: Recovery schedule of our model.

Table 2: Parameters of each job. Note: M:T means that the processing time of the job's operation is T hours on machine M.

Table 3: Results from different approaches.