Assembly Line Balancing Based on Beam Ant Colony Optimisation

We use a hybrid approach that combines the ant colony algorithm with beam search (ACO-BS) to solve the Simple Assembly Line Balancing Problem (SALBP), with the aim of improving solution quality and speeding up the search. The objective is to minimise the number of workstations for a given fixed cycle time. Results on 269 benchmark instances show that 95.54% of the problems reach their optimal solutions within 360 CPU time seconds. In addition, we choose order strength and time variability as indicators of the complexity of SALBP instances and randomly generate 27 instances of 400 tasks (a problem size much larger than that of the largest benchmark instance), with the order strength at three levels (0.2, 0.6, and 0.9) and the time variability at three levels (5-15, 65-75, and 135-145). The processing times are generated following a unimodal or a bimodal distribution. A comparison with the solutions obtained by the priority rule shows that ACO-BS significantly improves the quality of the best solutions.


Introduction
An assembly line is a continuous production line consisting of materials and workstations combined with conveyor belts, and it can link men and machines closely and efficiently [1]. Assembly lines are flow-oriented systems that are indispensable for both the production of high quantity standardized products and low volume production of customized products [2]. Effective design and redesign of assembly lines require high investment and running costs and are essential in manufacturing industries [3]. In addition, assembly planning and control play important roles in managing expanding product ranges, reducing delivery time and costs, and increasing profitability [4].
The Assembly Line Balancing Problem (ALBP) is a well-studied classic problem [5] and can be seen as a generalization of the Bin Packing problem in which precedence constraints are added [6]. It focuses on assigning tasks to workstations so that the precedence relationships among the tasks and the workload limitation of the workstations are satisfied while performance measures are optimized [7]. According to Becker and Scholl [8], there are four types of SALBP: SALBP-I minimises the number of workstations for a given fixed cycle time; SALBP-II minimises the cycle time for a given number of workstations; SALBP-E minimises the cycle time and the number of workstations at the same time by considering their relation with the total idle time or the inefficiency of the line; SALBP-F determines whether a feasible solution exists for a given number of workstations and a given cycle time.
ALBP is a well-known NP-hard problem that has been researched for more than sixty years. It was first studied by Salveson [5], who constructed a mathematical model of ALBP and suggested a solution procedure. For decades, the core problem has been extended to robotic, machining, and disassembly contexts, but even the simple version is still challenging [3].
Mathematical Problems in Engineering
Both exact and approximate methods have been used to solve ALBP. According to Baybars [9], if $n$ denotes the total number of tasks, there are $n!$ possible sequences of tasks in SALBP; if there are $r$ precedence constraints, then there are approximately $n!/2^r$ distinct feasible sequences. Consequently, the computational time required to obtain an optimal solution with an exact method increases exponentially with the instance size for most ALBP variants [3]. This limits the performance of exact methodologies, especially when the problem size is extremely large. Therefore, efficient heuristic methodologies that cope with large scale ALBP within an acceptable time period are clearly necessary.
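To give a feeling for how quickly this estimate grows, the following minimal sketch evaluates $n!/2^r$ for a few instance sizes. The function name and the sizes are hypothetical and chosen only for illustration; the estimate itself is the one attributed to Baybars [9] above.

```python
import math


def approx_feasible_sequences(n_tasks: int, n_precedences: int) -> float:
    """Estimate n! / 2**r of the number of distinct feasible task sequences."""
    return math.factorial(n_tasks) / 2 ** n_precedences


# Hypothetical instance sizes (n tasks, r precedence constraints).
for n, r in [(5, 4), (10, 12), (20, 30)]:
    print(f"n={n:2d} r={r:2d} estimate={approx_feasible_sequences(n, r):.3e}")
```

Even for 20 tasks the estimate is far beyond what can be enumerated, which is the practical argument for heuristic search.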
Recently, meta-heuristic algorithms such as the genetic algorithm, particle swarm optimisation and the ant colony optimisation algorithm have been used to deal with ALBP due to these algorithms' good performance on optimisation [1].
Swarm intelligence algorithms are based on collective behavior in decentralized, self-organized systems and consist of agents interacting with each other and with the environment. There is no centralized control structure. Such algorithms are scalable, since agents can easily be added or removed. Besides, each agent is simple to design, and reliance on any individual agent is small. Although no single agent is sophisticated, complex tasks can be solved in cooperation. The main novel idea of the ant colony algorithm is the synergistic use of cooperation among many relatively simple agents that communicate through a distributed memory implemented as pheromone deposited on the edges of a graph [10]. The colony as a whole coordinates its activities without direct communication between individual ants, whereas an isolated ant basically moves at random [11]. Each ant builds a solution step by step, using information left by other ants during the solution generation process. Good solutions can thus be obtained eventually without direct communication.
ACO performs well on combinatorial optimisation problems. To address the assembly line balancing problem with complicating factors such as parallel workstations, stochastic task durations, and mixed models, McMullen and Tarasewich [12] proposed an approach based on ant techniques and showed that it is competitive with other heuristic methods in terms of the performance measures used in their study. Bautista and Pereira [13] used an ant algorithm incorporating some ideas that have offered good results with SALBP to solve the time and space constrained ALBP and obtained much better results than those of Tabu search. Kucukkoc and Zhang [14] and Zhong and Ai [1] also explored ALBP with ant colony based approaches. Thus, ACO-based methodologies show promising performance in coping with ALBP, and ACO is sufficiently flexible to be combined with other algorithms to achieve better performance.
As products develop, both the problem size and the complexity of the assembly process are increasing considerably. Although researchers have undertaken many explorations, methods that suit this increasingly complex assembly context are urgently needed. In this study, we focus on the performance of the algorithm on large scale ALBP. To deal with large problem sizes, we hybridize ACO with Beam Search (ACO-BS) to improve the efficiency and the computational performance of the algorithm, so that satisfactory results can be obtained within an acceptable computation time. The paper is organized as follows: Section 2 reviews works that solve ALBP by ant techniques; Section 3 gives the problem description and the mathematical model; Section 4 develops the ACO-BS algorithm; Section 5 gives the computational results, in which the proposed algorithm is first tested on benchmark instances and then on larger generated instances to further explore its performance; and Section 6 summarizes the contributions of the work and future directions.

Literature Review
The ant colony algorithm has been applied to solve ALBP, and the traditional ant colony algorithms have been adapted to deal with complex models of ALBP. Furthermore, researchers have validated the effectiveness of the ant colony heuristic in solving ALBP. Baykasoglu and Dereli [15] integrated the computer method of sequencing operations for assembly lines (COMSOAL), the ranked positional weight heuristic, and the ant colony heuristic to deal with the simple and U-shaped ALBPs. Additionally, Fattahi et al. [16] developed a heuristic approach based on ant colony optimisation to solve medium- and large-size instances of the problem they studied, since the problem is NP-hard. Their experimental results validate the effectiveness and the efficiency of the proposed algorithm.
SALBP, which belongs to a class of intensively studied combinatorial optimisation problems known to be NP-hard, has attracted the attention of researchers and practitioners of operations research for almost half a century [2]. With the development of ALBP, the core problem has been extended from a manual assembly background to robotic, machining, and disassembly contexts; thus there are various industrial environments and line configurations [3]. Researchers try to narrow the gap between academic research and the reality faced by practitioners by considering more constraints in order to explore more realistic problems. To address the assembly line balancing problem with complicating factors such as parallel workstations, stochastic task durations, and mixed models, McMullen and Tarasewich [12] proposed an approach based on ant techniques, and a comparison showed that it is competitive with other heuristic methods in terms of the performance measures used in their study. Simaria and Vilarinho [11] presented a method to solve the two-sided mixed-model ALBP with an ant colony optimisation algorithm, in which two ants "work" simultaneously to build a balancing solution that satisfies the precedence, zoning, capacity, side, and synchronism constraints. The superior performance of the approach was demonstrated by the results of a computational experiment. Akpınar et al. [17] presented a hybrid algorithm combining an ant colony algorithm with a genetic algorithm for type I mixed-model ALBP with features such as parallel workstations, zoning constraints, and sequence-dependent setup times between tasks. To carry out assembly sequence planning and assembly line balancing simultaneously, Lu and Yang [18] proposed an ant colony algorithm based on the searching mechanism and the pheromone updating mechanism. In addition, the assembly task time, the time for changing assembly directions, the time for changing assembly tools, and the time for moving heavy parts in the workstation were also considered.
There are many situations in which multiple, sometimes conflicting, objectives are taken into account. Therefore, methods for solving multi-objective ALBP are valuable to guide practitioners. McMullen and Tarasewich [19] simultaneously addressed the objectives of crew size, system utilization, the probability of jobs being completed within a certain time frame, and system design costs, and comparative results showed the superiority of the modified ant colony optimisation technique. Zha and Yu [20] presented a new hybrid algorithm of ant colony optimisation and filtered beam search to solve the U-line rebalancing problem with two objectives. In the process of constructing a path, each ant explores several nodes for one step and chooses the best one by global and local evaluation at a given probability. The proposed algorithm was shown to be good at solving the U-line rebalancing problem. Kucukkoc and Zhang [14] introduced a type-E parallel two-sided ALBP, proposed a new ant colony optimisation method with optimised parameters for solving it, and found promising ways to simultaneously minimise two conflicting objectives, namely, cycle time and number of workstations.
Some researchers used one colony of ants to update the pheromone values and guide the search, while others used multiple colonies of ants to make the search more efficient. Multiple colonies can also be applied to the search in multiple-objective problems. Agrawal and Tiwari [21] used collaborative ant colony optimisation to solve a balancing problem in mixed-model disassembly: two colonies of ants independently identified the two sequences but used the information obtained through their collaboration to guide the future path, and the effectiveness and robustness of the proposed approach were well demonstrated. Ozbakir et al. [22] studied parallel assembly lines with a novel multiple-colony ant algorithm, which was examined on benchmark instances and compared with other algorithms.
In conclusion, there are many approaches related to ACO and ALBP, and developing approaches that can solve ALBP within an acceptable time is critical in real industrial applications, since ALBP is an NP-hard problem and the ALBP of complex products brings new challenges. More advances in methods for solving ALBP are necessary to suit the dynamic and changing industrial environment.

Problem Description.
ALBP is about assigning tasks to workstations while optimizing a certain criterion and without violating a number of possible restrictions; it can be divided into the Simple Assembly Line Balancing Problem (SALBP) and the General Assembly Line Balancing Problem (GALBP) [9].
For SALBP, the cumulative constraints associated with the available work time at the workstations and the precedence constraints established by the order in which the tasks must be executed need to be taken into account [23]. In contrast, GALBP contains additional considerations, such as the restricted assignment of tasks [24] or the assignment of certain tasks in a block [25]. Large scale SALBP is considered in this study.
According to Baybars [9], there are five main assumptions in SALBP:
(A-1) a task cannot be split among two or more stations, and all tasks must be processed.
(A-2) tasks cannot be processed in arbitrary sequences due to technological precedence requirements.
(A-3) all stations under consideration are equipped and manned to process any one of the tasks, and any task can be processed at any station.
(A-4) the task process times are independent of the station at which they are performed and of the preceding or following tasks.
(A-5) the assembly system is assumed to be designed for a unique model of a single product.
The mathematical model of SALBP-I is as follows:

Mathematical Model
The objective function (1) minimises the total number of workstations; constraint (2) suggests that every task is assigned to one and only one workstation; workload constraints (3) and (4) imply that the total processing time of each workstation does not exceed the cycle time; constraint (5) ensures that all the precedence relations are satisfied.
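The equations referred to above are not reproduced in this text. As a hedged sketch only, one standard integer programming formulation of SALBP-I that is consistent with the description is given below. All symbols are assumptions introduced for illustration: $x_{ij} = 1$ if task $i$ is assigned to workstation $j$, $y_j = 1$ if workstation $j$ is used, $t_i$ is the processing time of task $i$, $c$ is the cycle time, $n$ is the number of tasks, $m$ is an upper bound on the number of workstations, and $P$ is the set of precedence pairs. The sketch merges the two workload constraints into one, so its numbering does not correspond to (1)-(5).

```latex
\begin{align}
\min \quad & \sum_{j=1}^{m} y_j \\
\text{s.t.} \quad & \sum_{j=1}^{m} x_{ij} = 1, && i = 1, \ldots, n, \\
& \sum_{i=1}^{n} t_i \, x_{ij} \le c \, y_j, && j = 1, \ldots, m, \\
& \sum_{j=1}^{m} j \, x_{ij} \le \sum_{j=1}^{m} j \, x_{kj}, && (i, k) \in P, \\
& x_{ij} \in \{0, 1\}, \; y_j \in \{0, 1\}.
\end{align}
```

The precedence constraint states that the workstation index of a task is never larger than that of any of its successors.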

Reversibility of ALBP.
An ALBP instance is transformed into its reverse version by reversing all the precedence relationships. If $V = \{S_1, \ldots, S_m\}$ is a solution for the reverse problem, where $S_k$ is the task set of workstation $k$ and $m$ is the number of workstations, then a solution for the original problem is obtained by inverting the workstation order of $V$, that is, $V' = \{S_m, \ldots, S_1\}$. Following Bautista and Pereira [13], we solve the original problem and the reverse problem separately and then choose the better of the solutions obtained. Before the solutions of the two versions are compared, solutions of the reverse problem are converted to solutions of the original problem. There are two criteria for selecting the best solution: (1) the number of workstations and (2) the idle time in the last workstation. The second criterion is added due to the fact that there are always large plateaus when only the first criterion is used, and more idle time in the last workstation means better resource utilization in the previous workstations. When several solutions have the same number of workstations, preference goes to the solution with more idle time in the last workstation; if ties remain after these two steps, the solution is chosen randomly.
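The reversal and the two selection criteria can be sketched as follows. This is a minimal illustration, not the authors' implementation: a solution is assumed to be a list of workstations, each a list of task identifiers, and all function names and the small data set are hypothetical.

```python
import random


def reverse_precedences(precedences):
    """Reverse every arc (i, k) into (k, i) to obtain the reverse instance."""
    return [(k, i) for (i, k) in precedences]


def to_original(reverse_solution):
    """Invert the workstation order of a solution of the reverse instance."""
    return list(reversed(reverse_solution))


def last_station_idle(solution, times, cycle_time):
    """Idle time left in the last workstation of a solution."""
    return cycle_time - sum(times[i] for i in solution[-1])


def better_solution(sol_a, sol_b, times, cycle_time, rng=random):
    """Criterion 1: fewer workstations. Criterion 2: more idle time in the
    last workstation. Remaining ties are broken randomly."""
    key_a = (len(sol_a), -last_station_idle(sol_a, times, cycle_time))
    key_b = (len(sol_b), -last_station_idle(sol_b, times, cycle_time))
    if key_a != key_b:
        return sol_a if key_a < key_b else sol_b
    return rng.choice([sol_a, sol_b])


# Hypothetical data: processing times and two candidate solutions.
times = {1: 4, 2: 3, 3: 5, 4: 2}
sol_p = [[1, 2], [3, 4]]          # last workstation load 7
sol_r = to_original([[1, 4], [3, 2]])  # becomes [[3, 2], [1, 4]], last load 6
print(better_solution(sol_p, sol_r, times, cycle_time=8))
```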

The Algorithm of ACO-BS
For the classical ant colony algorithm, many ants search for solutions separately in one iteration. According to Dorigo et al. [28], there are three kinds of classical ant colony algorithm: ant system, ant-density, and ant-quantity, and for the latter two models, each ant lays pheromone at each step, while for ant system, ants lay pheromone after the end of the tour. Thus, pheromone values are updated by global information in the ant system model, while local information is used to update the pheromone values in the other two models. Not surprisingly, the results of ant system model are better since global information rather than local information is used to guide the solution searching process.
However, with the ant system model, although ants may start from different starting points, there is still a large amount of repetition as the algorithm progresses step by step through the search. For SALBP-I, even if different ants choose different tasks to assign to the current workstation, there is still a high probability that the task sets generated by some ants for one workstation are the same. Without a rule to prevent this kind of repetition, the search is less effective than other methods. This motivates us to add rules that prevent the repetition of assignments for each workstation.
Beam search is an adaptation of the branch and bound method in which only certain nodes are evaluated: only promising nodes are kept for further branching, and the remaining nodes are pruned permanently [29]. The beam width and the number of extensions are two important parameters in beam search, which progresses level by level and moves downward from the most promising nodes at each level. The number of nodes kept at each level is defined to be the beam width, denoted here by $k_{bw}$ [29]. Although the running time of beam search is polynomial in the problem size, an efficient search benefits from further restrictions: the number of extensions allowed for each node, denoted here by $k_{ext}$, can be limited to speed up the search.
Since the beam search algorithm progresses level by level and ALBP can be seen as task assignment workstation by workstation, the searching framework of the beam search can be easily applied to ALBP. Of course, repetition of task assignment for every workstation can be controlled when using the beam search structure. On the other hand, for each workstation, selection rule in the ant colony algorithm can be used for task selection, and the quality of solution in one iteration can be used in the next iteration to guide the searching direction.
The general logic of the ACO-BS algorithm is as follows. First, exploiting the reversibility of SALBP-I, the original problem and the reverse problem are each solved once by the priority rule, and the better solution is chosen. The chosen solution is used to initialize the best-so-far solution, denoted here by $S_{bs}$. In addition, the results obtained by the priority rule are compared with those obtained by ACO-BS to show the extent to which ACO-BS improves the quality of solutions. Next, within a given time limit, solutions to the original problem and the reverse problem are obtained by ACO-BS in each iteration. The pheromone values are updated according to the solutions obtained so as to guide the search in the next iteration. If a solution better than $S_{bs}$ is found (as determined by the criteria in Section 3.3), $S_{bs}$ is replaced by the better one.

Priority Rule.
Before assigning tasks, the priority values of the tasks are computed by (6), in which $|F_i|$ is the number of successors of task $i$. Starting from the first workstation, put all the tasks in the set $U$ of unassigned tasks and set the idle time to the cycle time $c$. Tasks in $U$ whose predecessors have all been assigned are put into the candidate set $A$. Figure 1 shows the flowchart for the priority rule. The assignment is implemented by the following steps:
Step 1. Determine the available tasks. Examine the tasks in $A$ and put all tasks with processing time equal to the idle time into the available task set $V$, since saturating the time resource of a workstation is preferable in order to improve the utilization of resources. If $V$ is empty, tasks with no unassigned predecessor and processing time less than the idle time are put into $V$.
Step 2. If $V$ is not empty, choose a task with the highest priority value from $V$ (if more than one task has the highest priority value, choose one of them randomly), and go to Step 3. If $V$ is empty, go to Step 4.
Step 3. Delete the chosen task from $U$, reset $A$ and $V$ to $\emptyset$, and decrease the idle time by the processing time of the assigned task. Go to Step 1.
Step 4. Close the current workstation. If $U$ is not empty, open a new workstation, reset the idle time to $c$, and go to Step 1; if $U$ is empty, end the procedure.
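Steps 1 to 4 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' MATLAB code. In particular, the priority equation (6) is not reproduced in this text, so the sketch assumes a priority of the form processing time over cycle time plus the normalised number of successors; the function name, argument layout, and the small instance are likewise hypothetical.

```python
import random


def priority_rule(times, successors, predecessors, cycle_time, rng=random):
    """Station-oriented assignment following Steps 1-4.

    times:        task -> processing time (integers keep the equality test exact)
    successors:   task -> set of all successors (used only for the priority)
    predecessors: task -> set of direct predecessors
    """
    max_succ = max((len(s) for s in successors.values()), default=0) or 1
    # ASSUMPTION: stand-in for the paper's equation (6), which is not available here.
    priority = {i: times[i] / cycle_time + len(successors[i]) / max_succ
                for i in times}

    unassigned, assigned = set(times), set()
    stations, current, idle = [], [], cycle_time
    while unassigned:
        # Step 1: tasks whose predecessors are all assigned, preferring exact fits.
        candidates = [i for i in unassigned if predecessors[i] <= assigned]
        available = [i for i in candidates if times[i] == idle]
        if not available:
            available = [i for i in candidates if times[i] < idle]
        if available:
            # Step 2: highest priority, ties broken randomly.
            best = max(priority[i] for i in available)
            task = rng.choice([i for i in available if priority[i] == best])
            # Step 3: assign the task and reduce the idle time.
            unassigned.remove(task)
            assigned.add(task)
            current.append(task)
            idle -= times[task]
        else:
            # Step 4: close the workstation and open a new one.
            if not current:
                raise ValueError("no task fits into an empty workstation")
            stations.append(current)
            current, idle = [], cycle_time
    if current:
        stations.append(current)
    return stations


# Hypothetical five-task instance with cycle time 8.
times = {1: 4, 2: 3, 3: 5, 4: 2, 5: 6}
preds = {1: set(), 2: {1}, 3: {1}, 4: {2, 3}, 5: {4}}
succs = {1: {2, 3, 4, 5}, 2: {4, 5}, 3: {4, 5}, 4: {5}, 5: set()}
print(priority_rule(times, succs, preds, cycle_time=8))
```

On this instance no ties occur, so the result does not depend on the random tie-breaking.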

ACO-BS Algorithm.
There are four steps in the ACO-BS algorithm. The first step initializes the parameters.
Step 1 (initialization). Generate one solution for the original problem and one for its reverse version by using the priority rule described in Section 4.1. The two criteria introduced in Section 3.3 are then used to choose the better solution, which initializes the best-so-far solution.
Additionally, the pheromone value is an important concept in the ant colony algorithm and is used to guide the search towards good solutions. Let $\tau_{ij}$ ($i = 1, \ldots, n$; $j = 1, \ldots, m$) be the pheromone value between task $i$ and workstation $j$, where $n$ is the number of tasks and $m$ the number of workstations considered; all the pheromone values are initialized to 0.5.
Step 2 (generate solutions for the original problem and the reverse one, respectively, by BS). Unlike the priority rule, the task selection rule here makes use of both the pheromone values and the priority values of the tasks. Moreover, after the priority values $p_i$ are computed using (6), they are further processed as in [30], with $p_{\min} = \min_{1 \le i \le n} p_i$ and $p_{\max} = \max_{1 \le i \le n} p_i$ denoting the smallest and largest priority values.
When choosing tasks from the set $V$, the probability that task $i$ is chosen for workstation $j$ is used. Whether a task is chosen by maximizing this probability or by the roulette-wheel method is decided randomly, with equal probability for the two options. The probability is calculated by the summation rule [29].
Partial solutions are extended by a set of tasks assigned to one workstation. To illustrate the procedure, we first describe the situation for the first workstation and then give the subsequent steps. Task assignment for the first workstation is explored $k_{ext}$ times, where $k_{ext}$ is the number of extensions; the procedure is similar to that of the priority rule, but the task selection rule here combines the pheromone values and the priority values of the tasks. Let $B_{par}$ be the initially empty set of partial solutions and $L$ the set that stores the task sets of the last workstation of all the partial solutions. For the first workstation, the two sets are the same. After each exploration, the task assignment for the first workstation is put into $B_{par}$ and $L$ if it differs from those already in $L$ and its lower bound (described later) is less than $|S_{bs}|$, the number of workstations needed in the best-so-far solution $S_{bs}$; the task assignment for the first workstation is itself a partial solution.
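The task selection rule described above can be sketched as follows. The exact probability formula is not reproduced in this text, so the sketch assumes a summation-rule form in which the weight of task $i$ for workstation $j$ is the sum of its pheromone values for workstations 1 to $j$ multiplied by its processed priority value; this form, the function names, and the data layout (one pheromone list per task, workstations indexed from 0) are assumptions made for illustration.

```python
import random


def selection_probabilities(available, station, tau, eta):
    """Probability of each available task for the given workstation index.

    ASSUMPTION: summation-rule weight = (sum of tau[i][0..station]) * eta[i].
    """
    weights = {i: sum(tau[i][: station + 1]) * eta[i] for i in available}
    total = sum(weights.values())
    return {i: w / total for i, w in weights.items()}


def choose_task(available, station, tau, eta, rng=random):
    """Pick a task greedily or by roulette wheel, each with probability 0.5."""
    probs = selection_probabilities(available, station, tau, eta)
    if rng.random() < 0.5:
        return max(probs, key=probs.get)          # maximise the probability
    tasks = list(probs)
    return rng.choices(tasks, weights=[probs[i] for i in tasks], k=1)[0]


# Hypothetical data: two available tasks, pheromone values initialised to 0.5.
tau = {1: [0.5, 0.5, 0.5], 2: [0.5, 0.5, 0.5]}
eta = {1: 1.0, 2: 3.0}
print(selection_probabilities([1, 2], station=0, tau=tau, eta=eta))
print(choose_task([1, 2], station=0, tau=tau, eta=eta, rng=random.Random(0)))
```

Mixing the greedy choice with the roulette wheel balances exploitation of the current pheromone information against exploration of less favoured tasks.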
After the assignment of the first workstation, there are at most $k_{ext}$ partial solutions in the partial solution set $B_{par}$, where $k_{ext}$ is the number of extensions. One partial solution is picked at a time, and the following steps are repeated until the partial solution has been extended $k_{ext}$ times (let $k$ denote the workstation currently considered):
(S1) set the extension counter to 1; the set $W$ stores the task set for workstation $k$.
(S2) implement the task assignment for workstation $k$ and obtain the task set $W$ for the workstation.
(S3) extend the partial solution considered by the task set $W$ for workstation $k$. If the extended solution is a complete solution, go to (S4); otherwise go to (S5).
(S4) put the extended solution into the set that stores the complete solutions.
(S5) if the lower bound (described below) on the workstations needed after the assignment for the partial solution is less than $|S_{bs}|$, the number of workstations of the best-so-far solution $S_{bs}$, and $W$ is different from all the elements in $L_{ext}$, the set of task sets already generated for this workstation, the partial solution is put into $B_{par}$.
Criteria are used to select among the partial solutions generated, and only partial solutions that will not lead to solutions worse than the best-so-far solution $S_{bs}$ can be extended in the next round. When choosing extensions after filling one workstation, two criteria are used. First, let $U_s$ be the set of tasks not yet assigned in a partial solution $s$; a lower bound on the number of workstations needed is computed from $U_s$ as in [2]. Partial solutions are ranked in increasing order of this lower bound. If there are ties after ranking by the first criterion, preference goes to partial solutions with less idle time in the last workstation (further ties are broken randomly). Finally, for each workstation considered, min{$k_{bw}$, $|B_{par}|$} partial solutions are kept, where $k_{bw}$ denotes the beam width and $|B_{par}|$ the number of partial solutions obtained.
This step ends when no partial solution can be extended. The partial solution set is empty when a new workstation is about to be opened, and each extended partial solution that is a complete solution is put into the set of complete solutions.
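The filtering and ranking of partial solutions at one beam level can be sketched as follows. The lower bound used in the paper is not reproduced in this text, so the sketch assumes the simple bound "workstations already filled plus the total remaining processing time divided by the cycle time, rounded up"; the function names and the data are hypothetical.

```python
import math
import random


def lower_bound(partial, times, cycle_time):
    """ASSUMED bound: stations used so far + ceil(remaining time / cycle time)."""
    assigned = {i for station in partial for i in station}
    remaining = sum(t for i, t in times.items() if i not in assigned)
    return len(partial) + math.ceil(remaining / cycle_time)


def prune(partials, times, cycle_time, best_size, beam_width, rng=random):
    """Keep at most beam_width promising partial solutions for the next level."""
    kept, seen_last = [], set()
    for p in partials:
        last = frozenset(p[-1])
        lb = lower_bound(p, times, cycle_time)
        # Discard repeated last-workstation task sets and hopeless partial solutions.
        if last in seen_last or lb >= best_size:
            continue
        seen_last.add(last)
        idle = cycle_time - sum(times[i] for i in p[-1])
        # Rank by lower bound, then by idle time, then randomly.
        kept.append(((lb, idle, rng.random()), p))
    kept.sort(key=lambda item: item[0])
    return [p for _, p in kept[:beam_width]]


# Hypothetical data: three partial solutions for the first workstation.
times = {1: 4, 2: 3, 3: 5, 4: 2, 5: 6}
partials = [[[1]], [[1, 2]], [[1, 4]]]
print(prune(partials, times, cycle_time=8, best_size=4, beam_width=2))
```

With a best-so-far solution of three workstations, every partial solution in this example has a lower bound of three and would be discarded, which is the situation handled at the start of Step 3.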
Step 3 (choose the iteration-best solution and update the pheromone values). In Step 2, a partial solution is discarded if its lower bound is no less than $|S_{bs}|$, the number of workstations of the best-so-far solution $S_{bs}$. Thus, it may happen that no solution is obtained in Step 2. If so, the best-so-far solution is used to update the pheromone values.
If solutions are obtained in Step 2, an iteration-best solution $S_{ib}$ is chosen with the two criteria introduced in Section 3.3. $S_{ib}$ is then used to update the pheromone value $\tau_{ij}$ between every task $i$ and workstation $j$. There are two updating processes: (1) pheromone evaporation: each $\tau_{ij}$ is updated, and $(1 - \rho) \cdot \tau_{ij}$ is left after evaporation, where $\rho \in (0, 1]$ is the evaporation rate, set to 0.1 in this study; (2) pheromone deposit: $\tau_{ij}$ increases when task $i$ is assigned to workstation $j$ in $S_{ib}$. When a pheromone value is too small, task $i$ tends never to be assigned to workstation $j$; when the value is too large, task $i$ tends always to be assigned to workstation $j$. Consequently, the explored solution space becomes small, which may lead to poor-quality solutions. Thus, the pheromone values are restricted to the interval $[\tau_{\min}, \tau_{\max}]$ to prevent stagnation [32], with $\tau_{\min} = 0.01$ and $\tau_{\max} = 0.99$. If a pheromone value is larger than $\tau_{\max}$ after updating, it is replaced by $\tau_{\max}$; if it is smaller than $\tau_{\min}$ after updating, it is replaced by $\tau_{\min}$.
If $S_{ib}$ is better than $S_{bs}$ (by the criteria in Section 3.3), $S_{bs}$ is replaced by $S_{ib}$.
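The pheromone update with evaporation, deposit, and clamping can be sketched as follows. The evaporation rate 0.1 and the bounds 0.01 and 0.99 are taken from the text; the size of the deposit is not stated here, so the sketch assumes a deposit equal to the evaporation rate, which is only one plausible choice. The function name and data layout (one pheromone list per task, workstations indexed from 0) are hypothetical.

```python
def update_pheromones(tau, solution, rho=0.1, tau_min=0.01, tau_max=0.99):
    """Update tau in place using the solution (a list of task lists).

    tau: task -> list of pheromone values, one per workstation index.
    """
    station_of = {i: j for j, station in enumerate(solution) for i in station}
    for i, row in tau.items():
        for j in range(len(row)):
            value = (1.0 - rho) * row[j]        # (1) evaporation
            if station_of.get(i) == j:
                value += rho                    # (2) deposit, ASSUMED amount
            row[j] = min(tau_max, max(tau_min, value))  # keep within bounds
    return tau


# Hypothetical example: two tasks, two workstations, all values at 0.5.
tau = {1: [0.5, 0.5], 2: [0.5, 0.5]}
update_pheromones(tau, solution=[[1], [2]])
print(tau)
```

With this choice of deposit, values on the assignments of the guiding solution drift towards the upper bound while all others decay towards the lower bound, and the clamping keeps every assignment reachable.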
Step 4 (calculating the convergence value). After the initialization of the pheromone values in Step 1, the convergence value is 1. All the pheromone values are reinitialized to 0.5 when the convergence value falls below 0.05. According to Kong et al. [33], pheromone reinitialization is an important strategy for avoiding premature convergence, because it prevents the algorithm from searching continuously around a local optimum with low effectiveness. The convergence value is calculated as follows [30]:

Computational Results
The ACO-BS algorithm was implemented in MATLAB and was run on all the instances using an Intel Core i7-6700 (3.40 gigahertz) processor, with 32 gigabytes of available memory. The computation times spent on obtaining the best solutions and the standard deviations are reported, and all running time reported is given in CPU time seconds.

Results by ACO-BS.
In order to exhibit the performance of the algorithm developed in this paper, we tested it on the benchmark instances (SALBP-I) published at https://assembly-line-balancing.de/. There are 269 benchmark instances of SALBP-I. Optimal solutions are obtained for 170 instances by using the priority rule only. After ten runs of ACO-BS (360 CPU time seconds for each run), optimal solutions are obtained for 87 more instances. There are 12 instances whose optimal solutions cannot be found by ACO-BS (number of extensions 10, beam width 20); however, the results improve when the time limit increases. For example, for the instance Warnecke (58 tasks, cycle time 60), the optimal result can then be found: the average solution is 27.7 (standard deviation 0.483), and the optimal solution is found in three of the ten runs, with running times of 2931.790, 2687.077, and 3570.333 CPU time seconds. Of course, due to the enlarged search space, the results can also improve when the beam width and the number of extensions increase; however, this increases the running time. Table 1 shows the results for the benchmark instances. For each instance, the given cycle time, the best solution ever found, the solution found by the priority rule, the best solution found by ACO-BS, and the difference between the last two are reported. Besides, the average and standard deviation of the solutions found in ten runs and the running times are also reported. The running time here is the computational time needed by ACO-BS to find its best solution for the first time. We can see from Table 1 that the algorithm performs well on most instances, but some instances are a little tricky (tricky instances here are those whose optimal results cannot be found or cannot be found in every run). Such tricky instances are marked in Table 1. We explore their characteristics in order to better understand the complex instances and to lay the foundation for generating such tricky instances.

Comparative Results with ACO, Genetic Algorithm, and Particle Swarm Algorithm.
According to the results shown in Table 1, ACO-BS can achieve significantly better results than those obtained by the priority rule. However, further comparative experiments are needed to show the superiority of ACO-BS. Ant Colony Optimization (ACO), which has a framework similar to that of ACO-BS except for the beam search part, the Genetic Algorithm (GA) in Leu et al. [26], and Particle Swarm Optimization (PSO) in Dou et al. [27] are compared with ACO-BS.
Since ACO-BS begins with a solution obtained by the priority rule, the other algorithms will also use the priority rule to get the initial solutions. Specifically, ACO will initialize the best-so-far solution by the priority rule, and the number of ants is set to be 20 to compare with ACO-BS which has the beam width of 20; GA with population size of 50 has initial solutions obtained by the priority rule and four heuristic methods in Leu et al. [26] (except the third one in [26]), and 10 initial solutions are obtained with these five methods applied to the original problem and the reverse one; PSO has 30 initial solutions, with 10 obtained the same way as the 10 initial solutions obtained in GA, and the other initial solutions are randomly generated. Thus, the best solutions found by ACO, GA, and PSO will not be worse than those found by the priority rule as well. The other parameters in GA and PSO are the same as those in the corresponding two published papers.
The numbers of instances that can reach optimal solutions by ACO, GA, and PSO are 33, 24, and 23, respectively. Figure 4 shows the comparative results between ACO, GA, and PSO on the 99 benchmark instances whose optimal solutions cannot be reached by the priority rule. On the x axis are the 99 instances, and on the y axis are the differences between the best solutions found in 10 runs within 360 CPU time seconds and the corresponding optimal solutions. For GA and PSO, the difference between the best solution found and the corresponding optimal one ranges from 0 to 2. However, the largest difference between the best solution found by ACO and the corresponding optimal solution is 1. What is more, there are significantly more points related to ACO falling on the x axis, which means that among the three algorithms considered, ACO has comparatively better ability in searching solutions for ALBP.
In order to highlight the effect of integrating beam search into ACO, ACO is compared with ACO-BS. The beam width in ACO-BS is increased from 20 to 100, the number of ants used in ACO is correspondingly increased to 100, and the running time limit is set to 720 CPU time seconds. The numbers of instances whose optimal solutions can be reached by ACO and ACO-BS are 33 and 87, respectively. The comparative result is shown in Figure 5. From the points falling on the x axis, we can see that there are many instances whose optimal solutions cannot be reached by ACO but can be reached by ACO-BS. Thus, with the integration of beam search, ACO-BS achieves significantly better solutions than ACO in solving ALBP.
Therefore, ACO shows superiority over the GA in Leu et al. [26] and the PSO in Dou et al. [27], and ACO-BS improves on ACO by integrating beam search.
Note. * indicates that an optimal solution is found; T beside the average of solution indicates trickiness.
Note. For instances of Scholl, only the statistical information of tricky instances with the smallest and largest cycle time is reported in order to show the tendency character of the problem; "Sum" in the second column is the sum of processing times; and $t_{\min}$ and $t_{\max}$ denote the minimum and maximum task time, respectively.

Results of Randomly Generated Instances.
According to Scholl [34], the following three indicators can be used to measure the complexity of the ALBP instances:
Order Strength (OS): OS is defined as the number of arcs in the transitive closure of the precedence graph divided by $n(n-1)/2$, that is, the maximal number of arcs in an acyclic graph with $n$ nodes. Instances with middle values of OS seem to be harder than those with low or high values [35]. At the extremes, when OS is 1 there is only one feasible task sequence; when OS is 0, SALBP-I becomes the bin packing problem, which is also NP-hard [34].
Time Variability (TV): TV is measured by $t_{\max}/t_{\min}$, which reflects the time structure of an instance; $t_{\max}$ and $t_{\min}$ denote the longest and shortest processing times, respectively. A smaller TV suggests a higher complexity.
Time Interval (TI): The interval is defined as $[t_{\min}/C,\ t_{\max}/C]$, where $C$ is the cycle time; it indicates the relation between the cycle time and the processing times. Instances with a time interval that is small and near the right border of [0, 1] are expected to be relatively complicated.
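The three indicators can be computed directly from a precedence graph and a list of task times. The following Python sketch is only an illustration under stated assumptions: tasks are numbered 0 to n-1, arcs are given as (predecessor, successor) pairs, and the function names are hypothetical rather than taken from the original implementation.

```python
def transitive_closure(n, arcs):
    """Return all arcs (i, j) of the transitive closure of a graph on nodes 0..n-1."""
    reach = [[False] * n for _ in range(n)]
    for i, j in arcs:
        reach[i][j] = True
    # Warshall's algorithm: k is the intermediate node on a path i -> k -> j.
    for k in range(n):
        for i in range(n):
            if reach[i][k]:
                for j in range(n):
                    if reach[k][j]:
                        reach[i][j] = True
    return {(i, j) for i in range(n) for j in range(n) if reach[i][j]}


def order_strength(n, arcs):
    """OS: closure arcs divided by n(n-1)/2, the maximal arc count of an acyclic graph."""
    return len(transitive_closure(n, arcs)) / (n * (n - 1) / 2)


def time_variability(times):
    """TV: longest processing time divided by the shortest one."""
    return max(times) / min(times)


def time_interval(times, cycle_time):
    """TI: the interval [t_min / C, t_max / C]."""
    return (min(times) / cycle_time, max(times) / cycle_time)
```

For a chain of tasks the closure contains every ordered pair, so `order_strength` returns 1, and for a graph without arcs it returns 0, matching the two extreme cases described above. The cubic closure computation is chosen for readability and is adequate for small graphs only.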
Thus, we consider OS, TV, and TI to measure the complexity of instances. We can see from Table 2 that the minimum processing time tends to be quite small (usually 1, with 5 and 7 as larger values). The OS values cluster around 0.2, 0.6, and 0.8. The smallest TV is 7.571, while the largest is 277.2. Additionally, the ratio of the minimum processing time to cycle time tends to be less than 0.1, and the ratio of the maximum processing time to cycle time ranges from 0.5 to a value close to 1. The ratio of mean processing time to cycle time varies from 0.157 to 0.445. The minimum variation of processing time is 8.205, whereas the largest is approximately 5000 times the minimum one.
Finally, we choose OS and TV as the main measurements for the tricky level of instances, and set three levels of OS (0.2, 0.6, and 0.9) and three levels of TV (5-15, 65-75, and 135-145). We pay attention to the minimum processing time, and give priority to instances with a smaller minimum processing time. As we want to explore larger scale instances, we choose the problem size to be 400 tasks. Because both the tricky level and the problem size are high, we raise the time limit of one run from 360 to 720 CPU time seconds, the beam width to 100, and the number of extensions to 30.

Generation of Random Instances.
The generation of random instances consists of two parts: arc generation and task times generation.
Arc generation: According to Otto et al. [36], the concept of stages allows for a direct manipulation of stage characteristics. Following Otto et al. [36] and Kolisch et al. [37], we use three steps to generate precedence arcs. Firstly, the average number of tasks per stage is selected, and then the number of tasks per stage is generated following a truncated normal distribution (that is, the number is drawn repeatedly from a normal distribution until it is no less than 1). Next, each beginning node (a node with no predecessor) is assigned one successor, and each other node is assigned one predecessor. After these assignments have been made for all the nodes, one successor is chosen randomly for each node that still has no successor. Finally, the second step is repeated until the expected complexity is reached.
During the abovementioned procedure, the following aspects should be taken into account. First, there should be redundant arcs. According to Kolisch et al. [37], let $G = (N, A)$ be a network with node set $N$ and arc set $A$; an arc $(i_0, i_s)$ is called redundant if there are arcs $(i_0, i_1), \ldots, (i_{s-1}, i_s) \in A$ and $s \geq 2$. Second, predecessors and successors of nodes can only be chosen from the previous stage and the next stage, respectively. Last, tasks are always considered in increasing order, and the added precedence relationships follow the topological rule.
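The three arc generation steps can be sketched in Python as follows. This is a simplified illustration, not the generator used in the study: the function names, the standard deviation of the stage size, the rule of filling stages until all tasks are used, and the `max_tries` safeguard are all assumptions. Tasks are numbered stage by stage, so every arc runs from a lower to a higher task number and the numbering is a topological order.

```python
import random


def split_into_stages(n_tasks, avg_per_stage, sd, rng):
    """Step 1: draw stage sizes from a normal distribution, resampling values below 1."""
    stages, start = [], 0
    while start < n_tasks:
        size = 0
        while size < 1:
            size = round(rng.gauss(avg_per_stage, sd))
        end = min(start + size, n_tasks)   # the last stage takes the remaining tasks
        stages.append(list(range(start, end)))
        start = end
    return stages


def generate_arcs(stages, rng):
    """Step 2: give every node a predecessor or successor in an adjacent stage."""
    arcs = set()
    if len(stages) < 2:
        return arcs
    for node in stages[0]:                       # beginning nodes: one successor each
        arcs.add((node, rng.choice(stages[1])))
    for prev, cur in zip(stages, stages[1:]):    # other nodes: one predecessor each
        for node in cur:
            arcs.add((rng.choice(prev), node))
    has_successor = {i for i, _ in arcs}
    for cur, nxt in zip(stages, stages[1:]):     # nodes still lacking a successor
        for node in cur:
            if node not in has_successor:
                arcs.add((node, rng.choice(nxt)))
    return arcs


def order_strength(n, arcs):
    """OS via reachability bitmasks; valid because every arc (i, j) has i < j."""
    succ = [[] for _ in range(n)]
    for i, j in arcs:
        succ[i].append(j)
    reach = [0] * n
    for i in range(n - 1, -1, -1):
        for j in succ[i]:
            reach[i] |= (1 << j) | reach[j]
    return sum(bin(r).count("1") for r in reach) / (n * (n - 1) / 2)


def add_arcs_until(stages, arcs, target_os, rng, max_tries=100_000):
    """Step 3: add random arcs between adjacent stages until OS reaches the target."""
    if len(stages) < 2:
        return arcs
    n = stages[-1][-1] + 1
    for _ in range(max_tries):
        if order_strength(n, arcs) >= target_os:
            break
        s = rng.randrange(len(stages) - 1)
        arcs.add((rng.choice(stages[s]), rng.choice(stages[s + 1])))
    return arcs
```

A small instance could be produced with `rng = random.Random(0)`, `stages = split_into_stages(30, 3, 1.0, rng)` and `arcs = add_arcs_until(stages, generate_arcs(stages, rng), 0.6, rng)`. Because arcs only connect adjacent stages, the reachable OS is bounded by the share of task pairs lying in different stages, which is one reason the number of stages has to be tuned for each OS level.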
Task times generation: Kilbridge and Wester [38] found that task times usually follow a unimodal or a bimodal distribution. Following Morrison et al. [35], processing times of tasks are generated randomly according to prespecified normal distributions. Three kinds of task times are used: peak at the bottom, where task times are drawn from a normal distribution with the mean centered around small times; peak in the middle, where task times are drawn from a normal distribution with the mean of C/2; and bimodal, where task times are drawn from a combination of two normal distributions with means centered around small and large times.
Besides, task times are rounded to the next integer, and possible rounding effects are compensated by setting the default cycle time to 1000, which is large enough to allow flexible time structures [36]. Tables 3(a) and 3(b) show the statistical description of the precedence graph and processing times of randomly generated instances, respectively. For the OS levels of 0.2 and 0.6, the number of stages is 40, while for the OS level of 0.9, the number of stages is 50. The number of stages is tuned by hand, and we find that when the number of stages is large, fewer iterations of adding arcs are needed to increase OS.
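The three kinds of task times can be sketched in Python as follows. The means, the standard deviation, the equal mixing weight of the bimodal case, and the reading of "rounded to the next integer" as rounding up are all hypothetical choices made for illustration; the text only fixes the cycle time of 1000 and the mean of C/2 for the peak-in-the-middle case. The sketch also does not control the TV level of the resulting instance.

```python
import math
import random

CYCLE_TIME = 1000  # default cycle time stated in the text


def draw_time(mean, sd, rng, cycle_time=CYCLE_TIME):
    """Draw one integer task time in [1, cycle_time], resampling out-of-range values."""
    while True:
        t = math.ceil(rng.gauss(mean, sd))  # assumption: round up to the next integer
        if 1 <= t <= cycle_time:
            return t


def task_times(n, kind, rng, cycle_time=CYCLE_TIME):
    """Generate n task times for one of the three distribution types."""
    low = 0.1 * cycle_time    # hypothetical mean for small times
    mid = 0.5 * cycle_time    # mean of C/2, as in the text
    high = 0.9 * cycle_time   # hypothetical mean for large times
    sd = 0.05 * cycle_time    # hypothetical standard deviation
    if kind == "bottom":
        return [draw_time(low, sd, rng, cycle_time) for _ in range(n)]
    if kind == "middle":
        return [draw_time(mid, sd, rng, cycle_time) for _ in range(n)]
    if kind == "bimodal":
        # Each task picks one of the two modes with equal probability (assumption).
        return [
            draw_time(low if rng.random() < 0.5 else high, sd, rng, cycle_time)
            for _ in range(n)
        ]
    raise ValueError(f"unknown distribution type: {kind}")
```

For example, `task_times(400, "bimodal", random.Random(1))` yields 400 integer times clustered around the two assumed means.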

Results of Randomly Generated Instances.
As to processing times, three types of distribution are used to generate the processing times at each TV level. The three TV values within one TV level are kept close to one another so that the impact of TV is controlled when the impacts of OS or of the distribution type of processing times are examined.
Surprisingly, the solutions found by the priority rule (PR) are the same as those found by ACO. The column of difference refers to the differences between the solutions obtained by the priority rule and those obtained by ACO-BS, which indicates the extent to which ACO-BS improves the quality of solutions. Figure 6 shows the comparative results of ACO and ACO-BS. For 11 instances, the solutions found by ACO are the same as those found by ACO-BS, while for the other 16 instances they differ.
We can see from Table 4 that the most significant tendency is that instances whose processing times follow the bimodal distribution are more difficult, consistent with Morrison et al. [35]: across all OS levels, there are 11 instances where there is no improvement after using ACO-BS with the time limit of 720 CPU time seconds, and the solution quality of only one instance of this kind is improved (OS level 0.2, TV level 135-145). Besides, there are two instances without improvement in solution quality, with OS levels of 0.6 and 0.9 and TV levels of 5-15 and 65-75, respectively. What they have in common is that their processing times are generated following the peak-in-the-middle distribution.
As to the standard deviation of the solutions obtained and improved by ACO-BS, the standard deviations for instances with processing times generated following the normal distribution peaking at the bottom tend to be 0. This implies that instances of this kind are the easiest to solve.
Although the time limit for running ACO-BS is not very large, some tendencies are already visible. This can be useful for exploring the characteristics of large scale instances.

Conclusions
A method based on the priority rule is used first to generate the initial best-so-far solution. After using this method once on the original problem and once on the reverse problem, 63.20% of the total benchmark instances reach the optimal results. Starting from the best-so-far solution obtained by the priority rule, ACO-BS searches a larger solution space in order to reach more optimal results. After ten runs (360 CPU time seconds for each run), 95.54% of the total benchmark instances reach the optimal results. Moreover, these results improve further when the beam width or the number of extensions is increased, or when the time limit for one run is raised. We can conclude that ACO-BS performs well in solving SALBP-I.
Premature convergence is a challenging problem for ACO, and several strategies are used to deal with it. First, the pheromone values are restricted to the interval [0.01, 0.99], so that no pheromone value becomes so large that some tasks tend always to be assigned to the same workstation, and none becomes so small that some tasks tend never to be assigned to a workstation. In this way, stagnation can be prevented [32]. In addition, the evaporation applied when the pheromone values are updated discourages assigning a task to the same position repeatedly. Second, when a task is chosen from the available task set, the selection probability of each task in the set is calculated. Then, with equal probability, the task is chosen either by maximizing this probability or by the roulette-wheel method. In this way, tasks with a higher probability have a larger chance of being selected, but tasks with a lower probability can still be selected. Third, the convergence value is calculated in every iteration. Since the pheromone values are initialized to 0.5, the convergence value is 1 at the beginning. When all the pheromone values are close to 0.99 or 0.01, the convergence value is close to zero. The pheromone values are reinitialized to 0.5 when the convergence value falls below 0.05, which prevents stagnation [33]. Therefore, the strategies above keep the search looking for alternative solutions rather than staying in a stagnation state. The good results of ACO-BS compared with those of ACO, GA, and PSO also demonstrate that the algorithm is able to jump out of local optima and prevent premature convergence.
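The pheromone bounds, the restart test, and the mixed selection rule can be sketched in Python as follows. The formula in `convergence_value` is an assumption: it is one simple measure that equals 1 when every pheromone value is 0.5 and 0 when every value sits on a bound, as described above, and it is not necessarily the definition used in the algorithm. The pheromone values are stored here as a flat list, and all function names are hypothetical.

```python
import random

TAU_MIN, TAU_MAX, TAU_INIT = 0.01, 0.99, 0.5
RESTART_THRESHOLD = 0.05


def clamp(tau):
    """Keep a pheromone value inside [TAU_MIN, TAU_MAX]."""
    return min(TAU_MAX, max(TAU_MIN, tau))


def convergence_value(pheromones):
    """Assumed measure: mean distance to the nearest bound, scaled to [0, 1]."""
    half_range = (TAU_MAX - TAU_MIN) / 2
    distances = [min(TAU_MAX - t, t - TAU_MIN) for t in pheromones]
    return sum(distances) / (len(distances) * half_range)


def maybe_restart(pheromones):
    """Reinitialize all pheromone values once the search has nearly converged."""
    if convergence_value(pheromones) < RESTART_THRESHOLD:
        return [TAU_INIT] * len(pheromones)
    return pheromones


def choose_task(candidates, probabilities, rng):
    """With equal probability pick the most probable task or use roulette-wheel selection."""
    if rng.random() < 0.5:
        best = max(range(len(candidates)), key=lambda k: probabilities[k])
        return candidates[best]
    return rng.choices(candidates, weights=probabilities, k=1)[0]
```

The greedy branch of `choose_task` intensifies the search around the currently preferred task, while the roulette-wheel branch keeps low-probability tasks reachable, which is the balance the second strategy above aims for.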
With the development of the manufacturing industry and the transformation from mass production to customization, assembling complex products within an acceptable period has become urgent. Thus, we are more concerned with large scale ALBP in this study. In order to further examine the performance of the algorithm on more complicated instances, we generate large scale SALBP-I instances randomly and explore solutions for them with ACO-BS. OS and TV are chosen to measure the complexity of the random instances. Compared with solutions obtained by the priority rule, there are significant improvements in the quality of the best solutions after applying ACO-BS, which shows that ACO-BS is efficient for small scale instances as well as large scale instances. Therefore, ACO-BS is a promising tool for solving SALBP-I, especially for complex products. In future work, we will explore further improvements to the current ACO-BS: since a number of extensions are evaluated for the workstation under consideration, the algorithm can be greatly improved by parallel computing. When taking advantage of this parallel feature, the algorithm will be able to perform better.

Data Availability
All the benchmark instances (SALBP-I) used in this study can be found on https://assembly-line-balancing.de/.

Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this paper.