A Hybrid Genetic Algorithm for Nurse Scheduling Problem considering the Fatigue Factor

Nowadays and due to the pandemic of COVID-19, nurses are working under the highest pressure benevolently all over the world. This urgent situation can cause more fatigue for nurses who are responsible for taking care of COVID-19 patients 24 hours a day. Therefore, nurse scheduling should be modified with respect to this new situation. The purpose of the present research is to propose a new mathematical model for Nurse Scheduling Problem (NSP) considering the fatigue factor. To solve the proposed model, a hybrid Genetic Algorithm (GA) has been developed to provide a nurse schedule for all three shifts of a day. To validate the proposed approach, a randomly generated problem has been solved. In addition, to show the applicability of the proposed approach in real situations, the model has been solved for a real case study, a department in one of the hospitals in Esfahan, Iran, where COVID-19 patients are hospitalized. Consequently, a nurse schedule for May has been provided applying the proposed model, and the results approve its superiority in comparison with the manual schedule that is currently used in the department. To the best of our knowledge, it is the first study in which the proposed model takes the fatigue of nurses into account and provides a schedule based on it.


Introduction
In working places such as hospitals where the services must be provided continuously, distribution of workforces within different shifts is required. Accordingly, different studies could be found within the literature which formulated the problem of scheduling nurses, known as the Nurse Scheduling Problem (NSP), from different aspects to provide a timetable for nurses in a hospital [1][2][3].
In NSP, nurses are assigned different shifts according to a set of constraints and requirements which are determined by the hospital. On one hand, optimal nurse scheduling has effects on reducing hospital costs, increasing nurse's job satisfaction, quality of care, and increasing the input budget of hospitals [4]. On the other hand, hospital work shifts could impose negative effects on involved nurses such as understaffing, heavy workload, and irregular workscheduling conditions. It adversely affects the service quality of healthcare operations and leads to less patient-nurse interaction and patient safety issues. Meanwhile, other negative impacts of shift works on nurses are important such as fatigue, obesity, sleep disorder, and a wide range of chronic diseases [5][6][7][8]. It seems that human factors should be combined with NSP if a more productive hospital is needed. In other words, studying NSP considering human factors could improve both the performance of nurses and productivity of hospitals significantly [9].
In addition to former issues in nurses' work shifts, the pandemic of the COVID-19 has aggravated the problem and put double pressure on nurses across the world. In this new situation, work shifts should be determined regarding response to patients' care requirements effectively and immediately and reduction of nurses' fatigue simultaneously.
Motivated by the aforementioned issues, and since, to the best of our knowledge, formulating an NSP considering fatigue factor has not studied yet, in this study, we have suggested a framework to distribute nurses in work shifts with the least fatigue alongside an attention to the additional pressure due to COVID-19 pandemic. To do so, a mathematical model is formulated for NSP in which fatigue of nurses has been taken into account. To solve the proposed model, a hybrid Genetic Algorithm (GA) has been developed, and real data from a department in one of the hospitals in Esfahan, Iran, where COVID-19 patients are hospitalized, has been used to provide a timetable for May. e remaining parts of the current paper are as follows. Firstly, related previous work on the area of NSP is investigated. After that, the problem is described and formulated, and then, the proposed GA is introduced. Following this, test problems are solved to approve the accuracy of the proposed model. Next, a real case study is considered, and a timetable for the department of COVID-19 patients is provided. Also, results are compared with the current timetable which is scheduled manually. Finally, the most important results along with suggestions for future research are presented in the conclusion section.

Literature Review
Taking preferences into account has been studied in NSP from different points of view. For instance, in the NSP study of Tsai and Li [10], Topaloglu and Selim [11], Abdollahi and Ansari [12], Wong et al. [13], Jafari et al. [14], and Legrain et al. [15], shift preferences were considered while in Vanhoucke and Maenhout [16] both shift preferences and day preferences were integrated simultaneously. Also, Liu et al. [17] considered NSP in hemodialysis service with a preference on roles and shifts. Further, Hamid et al. [9] proposed a multiobjective model for NSP considering the skill, preference, and compatibility of nurses.
In addition to preferences, some papers have added social/human factors into the NSPs. For example, in Farasat and Nikolaev [18], social structure effects into Nurse Scheduling Problem (NSP) were considered, and it was shown how the social structure of the working environment can affect the performance of nurses. In Zanda et al. [19], the maternity of nurses, type of their contracts (full time or part time), or nurses that benefit from special reductions of the workload for several reasons (e.g., because they have a close relative who suffers a serious illness) have been taken into account in providing a nurse schedule. Nahand et al. [20] proposed a multiobjective model for NSP considering human errors of nurses to determine optimal shift scheduling of nurses. Particularly, nurses' preference score, allocation costs, penalty cost of violating soft constraints, and human errors were all considered as objectives to be optimized, and a weighted-sum method was employed to solve the model. Wolbeck et al. [21] proposed a model for the NSP considering the timing and distribution of shift changes among nurses. In fact, the model incorporates a fair shift change penalization scheme, in which the type, timing, and distribution of shift changes among nurses are taken into account. In addition, information from previous periods was considered, using an individual penalty score to distribute the shift changes fairly among the nurses. e model was solved within the Gurobi software.
Assigning nurses to the operating room is the subject of another group of papers in the NSP literature. Lim et al. [22] presented two models for NSP. e first one assigns nurses to upcoming surgery cases while the second one assigns lunch breaks for nurses. A column generation algorithm and a two-phase swapping heuristic were developed to find feasible assignments in a fast manner. In another study, Di Martinelly and Meskens [23] proposed a biobjective model for NSP to assign nurses to operating rooms. ey have used the e-constraint method to solve the problem. Besides, Chiang et al. [24] presented a mathematical model which can provide an optimized schedule for nurses and operating rooms simultaneously. GAMS software was deployed to solve the proposed model. Also, categorizing patients into different groups and then assigning nurses to each group of them have been studied in NSPs as well. In particular, Heshmat et al. [25] studied a twostage approach where, at the first stage, patients are clustered, and then, at the second stage, nurses are assigned to each cluster. e model was solved by using CPLEX. Moreover, Sarkar et al. [26] proposed a framework to assign nurses to home patients and patients in the hospital according to different demands with respect to the patients' health status. A hybrid GA was applied to solve the problem.
Meanwhile, the possibility of exchanging nurses within different departments has also been investigated. Fügener et al. [27] proposed a mid-term model for NSP considering cross-training effects. Firstly, a framework to define and visualize cross-training policies was proposed. Next, a new cross-training policy was introduced where each unit trained one dedicated nurse for each other unit, and finally, by using g CPLEX for solving the model, a schedule was provided. He et al. [28] proposed a two-stage stochastic model for NSP considering understaffing risk control. In the proposed model, an initial base number of nurses within the department's budget were assigned to the department. However, a central pool of nurses was maintained from where extra nurse shifts can be transferred from the pool to cover the shortage in certain departments, or redundant nurse shifts can be transferred into the pool. e model was solved using CPLEX. Schoenfelder et al. [29] presented a model that incorporates two classes of quick-response decisions in hospitals' nurse scheduling, namely, adjustments to the unit assignments of cross-trained float nurses and transfers of patients between units and off-unit admissions. ey have examined the proposed model within different hospitals.
However, the main purpose of some papers is to develop heuristic/metaheuristic/simulation-based approaches for solving NSPs with the aim of improving the solution procedures from both quality and CPU time. In this case, hybrid Artificial Bee Colony (Awadallah et al. [30]), Particle Swarm Optimization (Wu et al. [31]), Sample Average Approximation method (Bagheri et al. [32]), Variable Neighborhood Search algorithm (Zheng et al. [33], Rahimian et al. [34]), Hybrid Harmony Search algorithm (Awadallah et al. [35]), a heuristic approach based on Simulated Annealing (Knust and Xie [36]), a combination of three algorithms, namely, Fix-and-Relax, Fix-and-Optimize, and Simulated Annealing (Turhan and Bilgen [37]), and a heuristic procedure based on column generation (Strandmark et al. [38]) could be found within the literature of NSP.
According to a comprehensive literature review, we did not find any research which formulated NSP considering fatigue factor. To fill this research gap and to decrease double pressure which has been imposed on nurses due to the spreading of COVID-19, in the current paper, we formulated the NSP by considering the fatigue factor for a real case study, a department in one of the hospitals in Esfahan, Iran, where COVID-19 patients are hospitalized, to provide a schedule for nurses of this department for May in all shifts.

Problem Description
Health organizations such as hospitals are responsible for providing required services in both normal and crisis situations [39][40][41]. In recent years, health management has received special attention among researchers since appropriate allocation and usage of resources are vital to provide high-quality services to patients [42]. Accordingly, researchers in conjunction with decision-makers have worked on health management (as a pivotal issue in many political and social debates) in different regions and countries across the world to select an optimal form for the usage of healthcare resources [43]. To do so, they have applied novel decision-making tools to allocate and productive usage of resources [44] since the quality of the provided services to patients is highly dependent on the available resources. Among required resources, human resources are the most important factor in providing services [45,46]. Because staff costs are a heavy part of each organization's cost, increasing the productivity and efficiency of human resources is very significant. One of the most practical ways to increase the productivity of this valuable resource is to combine the right distribution of human resources [47,48]. In other words, it is very important to determine the number of required nurses during each shift so that, depending on the volume of arrivals, the provision of services always remains at the desired level. ere are different types of shifts such as night shifts, last week shifts, two-part shifts, and on-call shifts. However, work shifts should be scheduled in such a way that imposes the lowest fatigue on nurses because long shifts have a negative efficacy on employee physiology. To do so, we propose a multiobjective model for the Nurse Scheduling Problem to optimize the work shifts of nurses, which includes three shifts in the morning, evening, and night. e first objective function is minimizing the total costs while the second objective function is minimizing the expected value of total fatigue for all nurses in all shifts. In fact, in this model, since fatigue is one of the main human factors which affects the quality of doing jobs and health circumstances of employees, the nurse fatigue factor has been investigated as a function with binary values according to break times which makes a virtual life for nurses. In simple words, to reduce nurses' fatigue during the shifts, some certain times are allocated for their break.
Some of the assumptions of the problem are as follows: (i) All nurses have identical skills (ii) e break times are discrete values (iii) Demand behavior is the random variable based on a specific distribution function (iv) Each nurse is only assigned one shift (v) e break time is more important compared with shift time

Model Formulation
Equations (1) and (2) are objective functions of the proposed model. e first objective function minimizes the total cost of nurses. e first sentence of this function relates to nurses' fatigue reduction function at period i. e second and third sentences of the first objective function calculate the costs related to decisions about the increase or reduction in the number of nurses in each shift. Finally, the fourth and fifth sentences of the first objective function determine the number of shortages and surplus of nurses at period i. Equation (2) is the second objective function and minimizes the expected value of total fatigue for all nurses in all shifts. Appendix A provides some complementary relations and explanations about these objective functions.
Also, constraints of the proposed model have been formulated as equations (3)-(7). Constraint (3) shows nursing attendance virtual time at period i. Constraint (4) determines a lower bound and upper bound for nurse level changes at period i. Constraint (5) assures that in each period solely either nurse employment or nurse dismissal could occur. Finally, constraints (6) and (7) define the domains of the decision variables.

Linearization.
e above-proposed model is nonlinear because of constraint (5). erefore, before proceeding to the next stage, constraint (5) should be substituted with identical linear equations. To do so, equations (8)-(10) should be used: x

Solution Approach
Genetic Algorithm (GA) is one of the first Evolutionary Algorithms in which selection, crossover, and mutation are its main operators. e idea behind the GA originates from the Darwinian eory of Evolutionary and, therefore, GA is categorized as a population-based algorithm. It can be said that the GA is a programming technique that uses genetic evolution as a pattern for solving problems. e GA algorithm starts with a random population. is population contains a set of solutions, which represent chromosomes of individuals. e next step is creating the second generation of the population based on selection processes, i.e., generation based on the selected characteristics by genetic operators. For each individual, a pair of parents is selected. In the selection stage, the most appropriate elements will be selected so that even the weakest elements have the chance to choose. is process prevents from nearing the local answer.
Generally, GA has a connectivity probability that is between 0.6 and 1, which indicates the probability of birth of a child, and the organisms combine with this possibility. e connection of two chromosomes creates the child, which is added to the next generation. is process is done to find the right candidates for the answer in the next generation. is process creates a new generation of chromosomes that are different compared with the previous generation. e whole process is repeated until the last stage [49].
In comparison with other optimization algorithms, GA has certain advantages. e most important one is that GA has a powerful ability to tackle complex problems and could be used for solving a wide variety of optimization problems with different objective functions (e.g., stationary or nonstationary objective functions, linear or nonlinear objective functions, etc.). Moreover, since "multiple off-springs in a population act like independent agents, the population (or any subgroup) can explore the search space in many directions simultaneously. is feature makes it ideal to parallelize the algorithms for implementation while different parameters and even different groups of encoded strings can be manipulated at the same time" [50].
Due to the aforementioned advantages, in this study, a hybrid GA algorithm is applied to solve the proposed NSP model. Figure 1 shows the steps of the GA algorithm. Also, Appendix B provides more detail about the applied GA algorithm.

Model Validation.
Prior to solving the model by using real data, to validate the accuracy of the proposed model, a test problem is generated randomly and a schedule for one week is provided. Table 1 includes parameters for the test problem. All computations have been done within MATLAB software on a laptop with Intel Core i5, 2.5 GHz, and 4 GB of RAM.
e number of nurses is assumed to be five, and there are three shifts per day. Each nurse's salary per hour is also assumed to be 10 units per hour. Table 2 shows the obtained schedule for a week for the test problem.
In Table 2, columns x + i and x − i show the required changes in the number of nurses so that the value in column D is affected by these changes in each row. Also, column F1 contains the imposed cots to the system by implementing the schedule whereas column F2 shows the fatigue of nurses.

Case Study.
After validating the proposed model, we performed the model in a real case, a specific department of a hospital in Esfahan, Iran, where COVID-19 patients are hospitalized. Our purpose was to compare the scheduling obtained using the proposed model with those currently used in the considered department which have been computed manually by a person in the human resource department of the case study. Due to the nature of constraints in the NSP of the case study, the person who is responsible for providing the schedule, firstly, arranges an initial schedule that distributes all nurses within the days of a month. After that, she/he tries to rearrange the scheduling in order to also meet the fatigue of nurses as well. Accordingly, administrators of the department noted that the manual approach could be very time-consuming. In addition, some nurses were enjoying generous rest hours while most nurses were under high pressure. ereby, administrators decided to automate the scheduling of this department. It led to the distribution of nurses within the shifts in a fair manner while the required time for providing a schedule was reduced significantly as well. e department has 14 nurses, and working days are partitioned into three shifts: 8: 00 am-4: 00 pm, 4: 00 pm-12: 00 pm (0: 00 am), and 12: 00 pm-8: 00 am. Each nurse's salary per hour is 160000 IRR. It is worth mentioning that 1 EUR is equal to almost 46,292 IRR. Based on the salary, costs of employment, dismissal, shortage and, surplus are as follows:`c 1 � 7 * 8 * nurse's salary per hour, c 2 � 22 * 8 * nurse's salary per hour,  e patient's arrival rate is based on the Poisson distribution and is λ 1 � 2, λ 2 � 3, and λ 3 � 3 in the first, second, and third shifts. e fatigue factor follows the Weibull distribution with parameters ß � 2 and θ � 500. Also, Generating initial answer • An initial answer will be generated randomly.

Calculating fitness function and forming Pareto set • By using equations 11 and 12 (Appendix B)
Generating new answer based on neighborhood structures • ere are four neighborhood structures

Parameters
Values c 1 , c 2 , c 3 , c 4 Uniform (1, 10) λ 1 Β 1 θ 20 Table 2: Schedule for the test problem. e schedule for May which has been prepared manually and the schedule that has been prepared by hybrid GA have been compared in Table 3. In Table 3, column D depicts the order of each shift in each day. For instance, 6 3 3 on the 1st day of May states that, on the first day of May, the number of required nurses for the first, second, and third shifts is six, three, and three, respectively. In addition, columns x + i and x − i depict the level of changes required for each day regarding the last day. For example, on the 25th of May, one nurse should be added for a second shift whereas one nurse should be eliminated in the first shift compared with the 24 th of May. erefore, the order of shifts on the 25th of May is 5 3 1 while on the 24 th of May the order was 6 2 1.
Column m presents the rest level at each shift. e value of 0 means that nurses will not be allowed to have further rest during their shift. In fact, all nurses have a break time generally in their shifts. If this usual break is the only break during the shift, m will give a value of 0. However, values of 1 and 2 for m state that nurses have one or two further breaks during their shift. Undoubtedly, the surplus break times reduce the fatigue of nurses dramatically.
Finally, columns F1 and F2 contain the value of costs and fatigue function in each day. It should be mentioned that F1 values have been divided into 1,000 for more convenience. Figure 2 compares costs of manual scheduling and scheduling calculated by hybrid GA algorithm for all 31 days in May.
As Figure 2 illustrates, the costs of the department during one month according to manual scheduling are lower than those of the hybrid GA algorithm. However, this difference is not very significant, and Figure 3 clearly approves this matter. As it can be seen in Figure 3, the total costs of manual scheduling are 2,212,461,000 IRR while the total costs of scheduling which have been prepared by the proposed model and using the hybrid GA algorithm are 2,214,349,000 IRR. erefore, the difference between costs is not higher than 2,000,000 IRR, and actually, this is not a considerable amount of money. Figure 4 compares fatigue function in manual scheduling and hybrid GA scheduling. Figure 4 clearly reveals that although the values of fatigue function in the second decade of May are almost identical in both hybrid GA scheduling and manual scheduling, the  In addition, the required total time for providing a timetable by hybrid GA is considerably lower than the required total time for providing a timetable manually. Figure 6 depicts this difference. Journal of Healthcare Engineering Figure 6 illustrates that scheduling of the department manually takes 66.67 minutes whereas the hybrid GA algorithm reduces the required time for scheduling dramatically so that it needs just 1.28 minutes for solving the proposed model and providing a timetable for the department.

Conclusion
Scheduling work shifts in many industry and service occupations directly affects the mental and physical health of employees. is matter can be more obvious in occupations like nursing where employees are usually working under high pressure and stress. erefore, it would be better to consider employees' satisfaction and convenience in providing scheduling so that work hours impose lower pressure on them.
During the last months and due to the pandemic of COVID-19, physicians, nurses, and other persons in hospitals across the world are working under the highest pressure benevolently to suppress this insidious disease. is urgent situation can cause more fatigue for these persons, especially for nurses who are caring for COVID-19 patients 24 hours a day. erefore, nurse scheduling should be modified with respect to this new situation.
In this paper, a mathematical model has been proposed to optimize the work shift of nurses considering human factors. To do so, according to experts' points of view, fatigue as the most important factor among human factors has been selected, and minimizing fatigue by creating different time intervals of rest during shift works has been addressed.
To solve the proposed model, a hybrid GA algorithm has been performed. In addition, to validate the efficiency of the proposed model, real data from the department of COVID-19 patients in one of the hospitals in Esfahan have been used where scheduling is currently determined manually. In fact, applying the proposed procedure of this paper, a timetable for all shifts of this department has been provided for May, and results have been compared with those currently used in the considered department and provided manually. Results approve that while the proposed approach imposes a little more cost to the department, it outperforms the current manual scheduling in both fatigue factor and time of providing a time table.
Apart from the lower required time for providing a timetable compared with the former manual approach, one of the main advantages of the proposed approach is eliminating personal tastes. It has been done because in the proposed approach, no one is involved in providing schedules, and the level of available resources is the only effective factor on the final schedule. To sum up, the proposed approach distributes nurses within the shifts in a fair manner considering the least fatigue while the required time for providing a schedule is significantly lower than the manual approach as well. However, another strength of our proposed approach is its generalizability. In other words, due to the structure of formulation, the proposed model could be used in either other healthcare departments or other continuous systems where a schedule for different shifts is required. In all of these settings, the model could distribute nurses/workforces within different shifts fairly considering least fatigue.
In future research, our model can be developed by considering other human factors such as rewards, motivation, and loyalty. Also, developing other heuristic and metaheuristic techniques would be an interesting direction for future research.

A. Complementary Relations
As mentioned in Section 3.3, the first objective function minimizes the total cost of nurses. e first sentence of this function relates to nurses' fatigue reduction function at period i which is calculated based on equation (A.1):  Journal of Healthcare Engineering On the other hand, when demand is less than the number of nurses (D ≤ x i ), the cost of the surplus could be computed as follows: It should be noted that x i is computed based on equation (A.4) and (A.5): Moreover, as mentioned in Section 3.3, the proposed model has a second objective function which minimizes the expected value of total fatigue for all nurses in all shifts. In the current objective function, there are some values that will be calculated using the following equations: Steps of the Hybrid GA Algorithm 1) Generating Initial Answer. Since there are four variables including x + i , x − i , m , D and three shifts, all feasible solutions will be revealed in the form of a 4 × 3 matrix. A feasible answer can be found as follows: (B.1) An initial answer will be generated randomly. For this purpose, one of the first or second rows is selected randomly. To fill the first column, a random number between [x min , x max ] is selected. If the selected number is greater than zero, we will write a zero in the next row while if the selected number is zero, a random number in the interval [x min , x max ] will be selected for another row.
e same procedure will be employed for the second and third columns as well. In the third row, a random number in the interval [x min , x max ] opts for the first to third columns respectively.
2) Fitness Function. Suppose f 1 � c 1 T i�1 x + i and f 2 � c 2 T i�1 x − i . e fitness function is calculated according to these functions and will put in the Pareto set.

3) Generating New Answer Based on Different Neighborhood
Structures.
e first neighborhood structure is as follows: Between the first and second rows, one of them is chosen randomly. Suppose that i represents the selected row. Now a random column like j is chosen randomly. If x ij + 1 ≤ x max , then x ij + 1 ⟶ x ij and the value of column j in the next row is zero. e second neighborhood structure: Between the first and second rows, one of them is chosen randomly. Suppose that i represents the selected row. Now, a random column like j is chosen randomly. If x ij − 1 ≥ x min , then x ij − 1 ⟶ x ij . e third neighborhood structure: In the third row, a column j is chosen randomly. If m j + 1 ≤ m max , then, m j + 1 ⟶ m j . e fourth neighborhood structure: In the third row, a column j is chosen randomly. If m j − 1 ≥ m min , then, m j − 1 ⟶ m j . e algorithm starts with the first neighborhood structure.
e answer is added to the Pareto set; therefore, the current neighborhood structure remained. Otherwise, go to the next neighborhoods' structure. It should be mentioned that once the newly generated answer is entered into the Pareto set, the algorithm starts with the first neighborhood structure again. In this situation, the algorithm replicates the search process for m times.

4) Evaluating the New Generated Answer.
In the evaluation stage of the new generated answer, one of the following states should occur: First state If f 1 , f 2 for the new answers, in all answers of the dominant set, are worse, or one of them is worse and another is equal, it is not allowed to enter the dominant answer set. Second state e new answer is better than k answers of the dominant answer set. So, the new answer is added to the dominant set, and k nondominant answers will be eliminated.
ird state e new answer is not better than any answers of the dominant answer set. Meanwhile, if one of the functions f 1 or f 2 is superior to all answers of the dominant answer set, it will enter into the Pareto set. However, other answers will remain in the dominant answer as well.

5) Replicating Algorithm
Steps 1 to 3 are replicated for m times.

Data Availability
e data used to support the findings of this study are available from the corresponding author upon request.