A Balanced Heuristic Mechanism for Multirobot Task Allocation of Intelligent Warehouses

This paper presents a new mechanism for the multirobot task allocation problem in intelligent warehouses, where a team of mobile robots is expected to efficiently transport a number of given objects. We model the system with unknown task costs, and the objective is twofold: allocating the workload equally and minimizing the travel cost. A balanced heuristic mechanism (BHM) is proposed to achieve this goal. We propose two improved task allocation methods by applying this mechanism to the auction and clustering strategies, respectively. The results of simulated experiments demonstrate that the proposed approach increases the utilization of the robots as well as the efficiency of the whole warehouse system (by 5∼15%). In addition, the influence of the coefficient α in the BHM is studied. Typically, this coefficient is set between 0.7 and 0.9 to achieve good system performance.


Introduction
Autonomous guided vehicles (AGVs) have been used to perform tasks in warehouses for more than fifty years [1,2]. In the past, they have mostly been used to transport large or heavy objects such as rolls of uncut paper or engine blocks. Recently, as autonomous robots [3-5] have become cheaper, smaller, and more capable, they have been widely used in the logistics industry, for example, as a substitute for manual labor in the typical pick-pack-and-ship process in warehouses [6]. Moreover, it is pointed out in [7,8] that multirobot coordination, if really efficient and robust, will have a significant impact on logistical efficiency. Multirobot coordination involves the allocation and execution of individual tasks with an efficient mechanism. In this paper, we mainly focus on the task allocation process, in which a team of autonomous robots must fulfill a set of orders along optimal routes that meet certain criteria. This problem is also known as the multirobot task allocation (MRTA) problem [9].
There are already some methods to deal with MRTA. In 1998, Yamauchi presented a greedy strategy [10] to coordinate robots, with which each robot chooses the closest task as soon as it finishes its previous task. Gerkey and Matarić introduced a dynamic task allocation method [11] for groups of autonomous robots, with which the robots bid on all unallocated targets, and the bids depend on the distance between their last targets and those unallocated targets. The algorithm proposed by Sandholm [12], known as the combinatorial auction method, is considered more efficient because it divides tasks into different clusters according to correlations between tasks; then robots bid on clusters of tasks [13]. In 2004, Lagoudakis et al. used Prim Allocation [14] to generate the MSF (minimum spanning forest) of tasks for robots and then transformed the forest into paths by depth-first traversal. Later, some researchers [15,16] improved the depth-first traversal process in the Prim allocation method and obtained better results.
As a matter of fact, most current attempts focus on minimizing the travel cost of the robot group without considering the travel time, that is, without balancing the individual travel cost of each robot. Few research works put emphasis on improving the utilization of multiple robots, even though this matters in many time-sensitive applications such as the logistics industry, rescue, and exploration. Puig et al. [17] improved Yamauchi's greedy algorithm to solve the balancing-target allocation problem in planetary exploration. With this method, after finishing its current task, every robot moves to the frontier cell until all the tasks have been explored. However, since every robot only acts for itself with no coordination with other robots, this method may lead to a rather suboptimal solution in travel distance, which is not acceptable in our logistics application. In 2011, Elango et al. [18] proposed the k-means clustering and auction based mechanism (referred to as the CAB mechanism) to balance the task allocation in multirobot systems. This approach considers the dual goal of both minimizing the travel distance and sharing the workload efficiently. Nevertheless, the complexity of this algorithm is relatively high, and we will compare this method with our proposed approaches later.
Considering the specific characteristics of our problem, namely, (i) balancing the task allocation as well as minimizing the travel distance, (ii) unknown task costs, and (iii) a large-scale and dynamic environment, a balanced heuristic mechanism (BHM) is proposed to coordinate the autonomous robots and is applied to several traditional approaches, such as the auction allocation method and the k-means clustering method. To evaluate the performance of different strategies, three criteria are provided: the travel time (TT), the coefficient of variation (CV) between different robots, and the total travel cost (TTC). Simulation results show that the BHM has a better balancing performance than traditional methods.
The rest of this paper is organized as follows. In Section 2, we give the formal description and model of the order fulfillment process in the warehouse. In Section 3, we describe the overall control structure of the system and present the balanced heuristic mechanism (BHM). Then simulated experimental results are provided and compared with several traditional methods in Section 4.

Problem Statement.
In this paper, we focus on the dynamic task allocation problem for multiple robots in an intelligent warehouse. To characterize the system, some assumptions are listed as follows: (i) the system is composed of a large number of physically embedded robots; (ii) the robots and the tasks are homogeneous; (iii) the environment of the system is known to each robot; (iv) each robot in the system is self-interested and fulfills its own tasks independently; (v) the robots may conflict with each other when fulfilling their tasks, and they communicate with each other when they are close to other vehicles (for the purpose of obstacle avoidance); (vi) the time spent at the station is the same for all robots.
Usually, multirobot task allocation is described as follows: given a sequence of tasks and a set of robots, the goal is to assign tasks to robots in an efficient manner based on what the system is trying to optimize. In our approach, we optimize the system with a dual objective: keeping the total travel cost as low as possible and keeping the individual travel times as equal as possible. These two objectives correspond to a balance between time expenses and robot expenses, both of which deserve consideration in a concrete physical system. As shown in Figure 1, each storage shelf consists of several inventory pods and each pod consists of several resources. For each order, a robot lifts and carries one pod at a time along a preplanned path, delivers it to the specific station appointed in the order, and finally returns the pod. The main consideration is how to allocate the tasks to the robots efficiently and equally while avoiding conflicts.
Another point that must be taken into account is that traditional task allocation methods are based on the assumption that the cost of each task is already known.However, in this problem, the input to the warehouse system is a sequence of orders, from which we can only know the location of the task and the specific station to which the inventory pod is supposed to be delivered.The cost of each task as well as the travel cost from one location to another is totally unknown.

Modelling.
The model formulated to solve the balanced multirobot task allocation problem mentioned above in the warehouse system is as follows (as shown in Figure 2).
Let a set of robots $R = \{r_1, r_2, \ldots, r_m\}$ complete a set of tasks $T = \{t_1, t_2, \ldots, t_n\}$. The cost of task $t_j$ is $c_j$, which refers to the travel cost of carrying the inventory pod to the specific station and then delivering it back, ignoring the time cost of obstacle avoidance, and $c_{a,b}$ ($a, b \in R \cup T$) is the travel cost between every two locations (usually from the robot to the inventory pod assigned in the order). Suppose $r_i$ has $k$ tasks $T_i = \{t_{i1}, t_{i2}, \ldots, t_{ik}\}$; then the total cost of all the assigned tasks in $r_i$ is expressed as $C(r_i) = c_{i1} + c_{i2} + \cdots + c_{ik}$. More details about the basic parameters and indexes used in the model are shown in Figure 2.
Rewrite $T$ as $\Gamma = \{T_1, T_2, \ldots, T_m\}$, which means a partition of the set of tasks where task set $T_i$ is allocated to robot $r_i$. Then, define the individual travel cost as $\mathrm{ITC}(r_i, T_i)$, which represents the total cost for $r_i$ to fulfill its task set $T_i$ and can be calculated by
$$\mathrm{ITC}(r_i, T_i) = c_{r_i, t_{i1}} + \sum_{j=1}^{k-1} c_{t_{ij}, t_{i(j+1)}} + C(r_i),$$
where $c_{r_i, t_{i1}}$ represents the travel cost for $r_i$ to come to its first task $t_{i1}$, and $\sum_{j=1}^{k-1} c_{t_{ij}, t_{i(j+1)}}$ represents the total travel cost for the remaining $k - 1$ tasks. Also, $C(r_i)$ represents the total task cost of all the assigned tasks in $T_i$. Our dual goal can be expressed as
$$\min_{\Gamma} \sum_{i=1}^{m} \mathrm{ITC}(r_i, T_i), \qquad \min_{\Gamma} \max_{1 \le i \le m} \mathrm{ITC}(r_i, T_i),$$
which aims to minimize the travel cost as well as balance the individual travel cost of each robot. Furthermore, we provide three criteria to evaluate the performance of different strategies: travel time (TT), which equals the maximum of all the individual robots' travel costs; total travel cost (TTC), which is the sum of the travel costs of all the robots and evaluates the power consumption and mechanical loss of robots; and coefficient of variation (CV), which is a statistical measurement commonly used for comparing the diversity in work groups [19]:
$$\mathrm{CV} = \frac{\sigma}{\mu},$$
where $\sigma$ represents the standard deviation of $\mathrm{ITC}(r_i, T_i)$ and $\mu$ is the mean of $\mathrm{ITC}(r_i, T_i)$, $i = 1, 2, \ldots, m$.
It is worth pointing out that we use TTC to evaluate the power consumption of the system. On one hand, compared to the travel energy cost of the robots, other energy costs such as the communication cost and central controller cost are relatively small, so they can be neglected without affecting the overall power consumption. On the other hand, $\sum_{i=1}^{m} C(r_i)$ (the distance that robots travel with pods) is constant for a given system, which means the power consumption of this part is unchanged. So we need to minimize $\sum_{i=1}^{m} \left( c_{r_i, t_{i1}} + \sum_{j=1}^{k-1} c_{t_{ij}, t_{i(j+1)}} \right)$ (the distance that robots travel without pods), which can be achieved by minimizing TTC. Thus, TTC can be used to evaluate the power consumption of the system.
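The three evaluation criteria can be illustrated with a minimal Python sketch. The function names and the sample numbers below are hypothetical, and the sketch assumes the population standard deviation for CV, which the text does not specify.

```python
from statistics import mean, pstdev


def individual_travel_cost(robot_to_first, between_tasks, task_costs):
    """ITC: travel to the first task, travel between consecutive tasks, plus task costs."""
    return robot_to_first + sum(between_tasks) + sum(task_costs)


def evaluate(itcs):
    """Return (TT, TTC, CV) for a list of individual travel costs."""
    tt = max(itcs)                  # travel time: the slowest robot
    ttc = sum(itcs)                 # total travel cost of the team
    mu = mean(itcs)
    # Assumption: population standard deviation; a sample deviation is equally plausible.
    cv = pstdev(itcs) / mu if mu > 0 else 0.0
    return tt, ttc, cv


# Hypothetical costs for three robots.
itcs = [
    individual_travel_cost(4, [3, 5], [10, 12, 8]),
    individual_travel_cost(6, [2], [14, 16]),
    individual_travel_cost(5, [4, 1], [9, 11, 10]),
]
tt, ttc, cv = evaluate(itcs)
print(tt, ttc, round(cv, 4))  # 42 120 0.0408
```

A lower CV means the individual costs are closer to each other, while TT and TTC capture the time and energy sides of the dual objective.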

Method
In this section, we develop the BHM-based approach for multirobot task allocation in the intelligent warehouse. We first provide an overview of the whole control system. Then the BHM principles are described in detail. By introducing this mechanism into the traditional auction allocation method and the k-means clustering method, two improved methods based on BHM are presented and analyzed for the task allocation problem.

Overall System.
The work space of the warehouse can be divided into several grids, with inventory storage zones in the middle surrounded by the stations and workers. Autonomous robots transport movable inventory pods from storage locations to stations, where workers pick items off and pack them up; then those packages are sent to the customers.
Technically, the task allocation process is regulated by the central controller, while path planning and motion planning (for example, obstacle avoidance) are carried out by the robots. In the task allocation process, many current allocation mechanisms tend to use the dynamic method [20,21]; however, it may result in suboptimal performance. Thus, an approach is proposed that combines both dynamic and static methods by means of a task pool. The continuously arriving orders accumulate in the task pool. The controller requests a certain number of tasks from the task pool after the previous batch of tasks has been completed and uses a certain approach to allocate those tasks to robots. Then, the controller sends sets of tasks to the corresponding robots by wireless communication such as ZigBee, and the robots complete their tasks sequentially using a standard A* algorithm. Usually, we use the A* algorithm to find the shortest path between storage locations and stations in a static environment; in an uncertain environment, learning methods (e.g., reinforcement learning [22,23]) may be necessary to find the shortest path, which will be our future work and is beyond the scope of this paper.
In addition, as for the obstacle avoidance process, robots first use infrared detectors to detect other robots within a certain distance (usually several grids in front of them). Then, they communicate with each other by wireless devices so that they can verify their relative priorities. Generally speaking, the robot with more tasks left to be completed has priority to occupy the front grids. Therefore, the robot with fewer tasks left is likely to give way to the robot with more tasks to be completed. So the actual balancing performance in the real world could be further enhanced compared with the balancing performance in the simulated experiment. This mechanism is referred to as the more-task-prior mechanism in this paper. The whole process is shown in Figure 3.
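The more-task-prior rule can be sketched in a few lines of Python. This is only an illustration: the function name is hypothetical, and the tie-breaking rule (lowest robot identifier wins) is an assumption, since the text does not say how equal task counts are resolved.

```python
def robot_with_right_of_way(remaining_tasks):
    """Pick the robot that keeps moving when several robots meet.

    remaining_tasks maps a robot id to its number of unfinished tasks.
    The robot with more tasks left has priority; ties go to the lowest id
    (an assumed rule, not stated in the paper).
    """
    return min(remaining_tasks, key=lambda rid: (-remaining_tasks[rid], rid))


print(robot_with_right_of_way({"r1": 3, "r2": 7}))  # r2
```

The other robots in the conflict would then wait or yield the contested grid cells.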

Balanced Heuristic Mechanism (BHM).
Considering the goal of minimizing the travel cost as well as balancing the individual travel cost of each robot, a heuristic function is given in the following for guiding the task allocation process:
$$C(r_i, t_j) = \alpha \times \mathrm{dist}(r_i, t_j) + (1 - \alpha) \times C(r_i), \tag{4}$$
where $C(r_i, t_j)$ describes the cost (considering both the time and distance) for robot $r_i$ to finish task $t_j$, $\mathrm{dist}(r_i, t_j)$ is the travel cost from robot $r_i$ to task $t_j$, and $C(r_i)$ is the current value of the total cost of the tasks assigned to $r_i$ (refer to Section 2). More specifically, as both the task cost and the travel cost from one location to another are unknown, we estimate $c_j$ (the cost of each task) and $c_{a,b}$ (the travel cost from one location to another) before the task allocation by using the standard A* algorithm, a simple but effective path planning algorithm. When a robot fulfills a task or moves from one location to another in the real environment, more complex motions, such as obstacle avoidance, have to be taken into account. The more accurate measured value is then shared with the central controller, and the original estimated value stored in the controller is replaced with the newest value, which helps the controller make a better task allocation. Thus the subsequent allocation process uses the new information and can be more accurate and predictable.
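The estimate-then-correct scheme described above can be sketched as follows. This is a minimal illustration under stated assumptions: the warehouse is a 4-connected grid with unit step cost, `astar_cost` is a textbook A* with a Manhattan heuristic, and the `CostTable` class with its method names is hypothetical rather than the controller used in the paper.

```python
import heapq


def astar_cost(grid, start, goal):
    """Length of the shortest 4-connected path on a grid (1 = blocked), or None."""
    rows, cols = len(grid), len(grid[0])

    def h(cell):
        # Manhattan distance: admissible and consistent on a unit-cost grid.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    open_heap = [(h(start), 0, start)]
    best = {start: 0}
    while open_heap:
        _, g, cell = heapq.heappop(open_heap)
        if cell == goal:
            return g
        if g > best[cell]:
            continue  # stale heap entry
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                ng = g + 1
                if ng < best.get(nxt, float("inf")):
                    best[nxt] = ng
                    heapq.heappush(open_heap, (ng + h(nxt), ng, nxt))
    return None


class CostTable:
    """Controller-side cache: A* estimates first, measured values once reported."""

    def __init__(self, grid):
        self.grid = grid
        self._cost = {}

    def estimate(self, a, b):
        if (a, b) not in self._cost:
            self._cost[(a, b)] = astar_cost(self.grid, a, b)
        return self._cost[(a, b)]

    def report_measured(self, a, b, measured):
        # A robot reports the real cost (e.g. including detours) after moving.
        self._cost[(a, b)] = measured


grid = [
    [0, 0, 0, 0],
    [1, 1, 0, 1],
    [0, 0, 0, 0],
]
table = CostTable(grid)
print(table.estimate((0, 0), (2, 0)))   # 6
table.report_measured((0, 0), (2, 0), 8)
print(table.estimate((0, 0), (2, 0)))   # 8
```

Later allocation rounds then read the corrected value instead of the optimistic A* estimate.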
BHM assigns each task to the robot that has the lowest $C(r_i, t_j)$. By introducing the parameter $\alpha$, targets are more inclined to be assigned to a nearby robot with fewer allocated tasks. In practice, an effective value of $\alpha$ can be set between 0.7 and 0.9 according to the numbers of robots and tasks. The detailed relationship between $\alpha$ and the system performance will be given in Section 4.
At the beginning stage, as the value of $C(r_i)$ is quite small, this mechanism has limited effect. Thus the performance of the system is not significantly superior to that of traditional methods that merely focus on minimizing the total travel distance. However, as the allocation proceeds, the term $(1 - \alpha) \times C(r_i)$ comes into effect by shifting the workload to those robots that have fewer assigned tasks. Overall, the method can obtain an approximately optimal result by considering both the total travel cost and the workload balance of robots. A sample illustration of this heuristic mechanism is given in Figure 4.
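A small numeric example may make the effect of the load term concrete. The sketch below evaluates the heuristic in (4) for two robots; the function names and the numbers are hypothetical.

```python
def bhm_cost(dist, load, alpha=0.8):
    """Heuristic of (4): alpha * dist(r_i, t_j) + (1 - alpha) * C(r_i)."""
    return alpha * dist + (1.0 - alpha) * load


def pick_robot(dists, loads, alpha=0.8):
    """Index of the robot with the lowest heuristic cost for one task."""
    costs = [bhm_cost(d, c, alpha) for d, c in zip(dists, loads)]
    return min(range(len(costs)), key=costs.__getitem__)


# Robot 0 is closer (10 vs 20) but already carries a load of 100 vs 20.
dists, loads = [10, 20], [100, 20]
print(pick_robot(dists, loads, alpha=1.0))  # 0: distance only, the closer robot wins
print(pick_robot(dists, loads, alpha=0.8))  # 1: costs are about 28 vs 20
```

With alpha equal to 1 the rule reduces to the purely distance-based choice, which is why the load term only starts to matter once some robots have accumulated work.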

Application of BHM.
Recently the auction-based task allocation method and the cluster-based task allocation method have been widely used to solve MRTA problems [18,24]. So we introduce our BHM approach into the classic single-item auction method and the k-means clustering method, respectively, and propose two BHM-based methods to solve the balancing MRTA problem.

An Improved Auction Based Task Allocation Method Using BHM.
A typical sealed-bid single-round single-item auction process generally consists of four steps: (i) task announcement, (ii) matrix evaluation, (iii) bid submission, (iv) close of auction (to determine the winning bids and notify the winning robot).
At the second stage, the matrix is usually defined as a function of the optimistic travel cost [11], as shown in the following matrix:
$$\begin{bmatrix} C(r_1, t_1) & C(r_1, t_2) & \cdots & C(r_1, t_n) \\ C(r_2, t_1) & C(r_2, t_2) & \cdots & C(r_2, t_n) \\ \vdots & \vdots & \ddots & \vdots \\ C(r_m, t_1) & C(r_m, t_2) & \cdots & C(r_m, t_n) \end{bmatrix}$$
where $\{t_1, t_2, \ldots, t_n\}$ are all the open tasks which have not been carried out by robots. In our method, the BHM is introduced into the matrix evaluation process. In order to strengthen the balancing performance of the auction, each matrix entry is defined by the heuristic cost $C(r_i, t_j)$ of (4) rather than the optimistic travel distance between $r_i$ and $t_j$.
In each round, after all of the robots submit their bids, the single-round single-item auction is closed. The controller decides the winning bidder with the smallest bid value and notifies the winner. Then the bid matrix is refreshed with the following steps (as shown in Figure 5).
Step 1. Delete the task $t_j$ from the open-task set.
As described in (4), the value of $C(r_i, t_j)$ is related both to the distance between $r_i$ and $t_j$ and to the tasks already assigned to $r_i$. Firstly, the current position of $r_i$ is replaced by the location of $t_j$, since $r_i$ wins the bid and has to move from its former position to $t_j$ in order to carry out the order. Secondly, $C(r_i)$ increases by the cost of the newly allocated task, which can be mathematically expressed as $C(r_i) = C(r_i) + c_j$. After the matrix is refreshed, the next single-item auction proceeds until all the tasks have been allocated.
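The auction loop described above can be sketched in Python as follows. This is an illustrative reading rather than the authors' implementation: it assumes that the controller picks the smallest entry of the whole bid matrix in each round, that distances are Manhattan distances on the grid, and that ties are broken by robot and task identifier. The function and variable names are hypothetical.

```python
def manhattan(a, b):
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def bhm_auction(robot_pos, tasks, alpha=0.8, dist=manhattan):
    """Allocate tasks by repeated single-item auctions with BHM bids.

    robot_pos: robot id -> (x, y); tasks: task id -> ((x, y), task cost).
    Returns robot id -> ordered list of task ids.
    """
    pos = dict(robot_pos)
    load = {r: 0.0 for r in pos}            # C(r_i), starts at zero
    plan = {r: [] for r in pos}
    open_tasks = dict(tasks)
    while open_tasks:
        # Matrix evaluation and winner determination: smallest bid wins.
        _, r_win, t_win = min(
            (alpha * dist(pos[r], loc) + (1 - alpha) * load[r], r, t)
            for r in pos
            for t, (loc, _) in open_tasks.items()
        )
        loc, cost = open_tasks.pop(t_win)   # delete the task from the open set
        plan[r_win].append(t_win)
        pos[r_win] = loc                    # the winner now starts from the task location
        load[r_win] += cost                 # C(r_i) = C(r_i) + c_j
    return plan


robots = {"r1": (0, 0), "r2": (7, 0)}
tasks = {"a": ((1, 0), 10), "b": ((2, 0), 10), "c": ((4, 0), 10)}
print(bhm_auction(robots, tasks, alpha=1.0))  # {'r1': ['a', 'b', 'c'], 'r2': []}
print(bhm_auction(robots, tasks, alpha=0.8))  # {'r1': ['a', 'b'], 'r2': ['c']}
```

In this toy case the distance-only auction leaves one robot idle, while the BHM bids move one task to it.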

A Balanced K-Means Clustering Method Using BHM.
A clustering method can generally be divided into two stages, that is, first finding the cluster centers and then assigning the sample data to the proper cluster. The k-means clustering is a commonly used partitioning method [25], the main idea of which is to find $k$ cluster centers $(\mu_1, \mu_2, \ldots, \mu_k)$ which minimize $J$ in the following by iteration:
$$J = \sum_{i=1}^{k} \sum_{p_j \in S_i} d(p_j, \mu_i). \tag{6}$$
In (6), $d(p_j, \mu_i)$ represents the distance between the point $p_j$ and $\mu_i$, known as the Euclidean distance, and $S_i$ is the set of points currently assigned to $\mu_i$. When $J$ converges to a stable value, the resulting $(\mu_1, \mu_2, \ldots, \mu_k)$ are the cluster centers for those $k$ subgroups. At the second stage, the sample data $p_j$ is assigned to the cluster with the smallest $d(p_j, \mu_i)$, $i = 1, 2, \ldots, k$.
However, this method only guarantees an optimal allocation scheme minimizing the travel cost, while the size differences between each cluster are ignored.
In order to solve the balancing MRTA problem, we modify both of the two stages. As for the first stage, given that robots can only move along the grid lines, we substitute the Euclidean distance with the Manhattan distance in (6). In a rectangular coordinate system, the Manhattan distance between point $p_1 = (x_1, y_1)$ and point $p_2 = (x_2, y_2)$ can be expressed as
$$d_{manhattan}(p_1, p_2) = |x_1 - x_2| + |y_1 - y_2|.$$
Thus, the main purpose is to find $k$ cluster centers $(\mu_1, \mu_2, \ldots, \mu_k)$ which minimize $J_{manhattan}$ in the following by iteration:
$$J_{manhattan} = \sum_{i=1}^{k} \sum_{p_j \in S_i} d_{manhattan}(p_j, \mu_i),$$
where $S_i$ is the set of points currently assigned to $\mu_i$. Then, to reduce the value of $J_{manhattan}$ iteratively, we find the Manhattan center rather than the center of the collected tasks. Finally, a partial optimal solution is formed in a finite number of iterations. Generally speaking, there are three steps in the first stage.
(1) Initial step: give an initial set of $k$ cluster centers.
(2) Assignment step: assign each point to the cluster center which has the shortest Manhattan distance.
(3) Update step: calculate the Manhattan center of each cluster, which reduces the overall value of $J_{manhattan}$. The Manhattan center is defined as follows.
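Suppose that there are $n$ points in a rectangular coordinate system, $p_1, p_2, \ldots, p_n$, with coordinates $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$. The sequence $x_1, x_2, \ldots, x_n$ sorted in descending order is denoted as $x'_1, x'_2, \ldots, x'_n$, and the sequence $y_1, y_2, \ldots, y_n$ sorted in descending order is denoted as $y'_1, y'_2, \ldots, y'_n$. Our purpose is to find the point $p_{mc} = (x_{mc}, y_{mc})$ which minimizes (8):
$$\sum_{i=1}^{n} \left( |x_i - x_{mc}| + |y_i - y_{mc}| \right). \tag{8}$$
Since the values of $x_{mc}$ and $y_{mc}$ do not affect each other, they can be chosen separately: $\sum_{i=1}^{n} |x_i - x_{mc}|$ is minimum when $x_{mc}$ is a median of the sorted sequence $x'_1, x'_2, \ldots, x'_n$. We can get $y_{mc}$ in the same way. Therefore, the Manhattan center $p_{mc}$ can be calculated by the following equations:
$$x_{mc} = \begin{cases} x'_{(n+1)/2}, & n \text{ is odd}, \\ \left( x'_{n/2} + x'_{n/2+1} \right)/2, & n \text{ is even}, \end{cases} \qquad y_{mc} = \begin{cases} y'_{(n+1)/2}, & n \text{ is odd}, \\ \left( y'_{n/2} + y'_{n/2+1} \right)/2, & n \text{ is even}. \end{cases}$$
As mentioned before, the Manhattan center has the minimum overall Manhattan distance to these $n$ points, by which the value of $J_{manhattan}$ is reduced. An example is shown in Figure 6, where the original center is $p_{oc} = (5, 6)$ and the Manhattan center is $p_{mc} = (4, 4)$.
Now, we concisely prove the convergence of the algorithm. Given that the $J_{manhattan}$ generated by the algorithm is strictly decreasing and there only exist a finite number of such partitions, the algorithm will reach a partial optimal solution in a finite number of iterations.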
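The Manhattan center and the three-step first stage can be sketched in Python as follows. The sample points are hypothetical (they are not taken from Figure 6), the function names are illustrative, and keeping the old center for an empty cluster is an assumption.

```python
from statistics import median


def manhattan_center(points):
    """Coordinate-wise median, which minimizes the summed Manhattan distance."""
    return (median(x for x, _ in points), median(y for _, y in points))


def total_manhattan(points, center):
    return sum(abs(x - center[0]) + abs(y - center[1]) for x, y in points)


def manhattan_kmeans(points, centers, max_iter=100):
    """Assignment and update steps repeated until the centers stop changing."""
    centers = list(centers)
    clusters = [[] for _ in centers]
    for _ in range(max_iter):
        clusters = [[] for _ in centers]
        for p in points:  # assignment step: nearest center in Manhattan distance
            i = min(range(len(centers)),
                    key=lambda i: abs(p[0] - centers[i][0]) + abs(p[1] - centers[i][1]))
            clusters[i].append(p)
        # Update step; an empty cluster keeps its old center (assumption).
        new_centers = [manhattan_center(c) if c else centers[i]
                       for i, c in enumerate(clusters)]
        if new_centers == centers:
            break
        centers = new_centers
    return centers, clusters


pts = [(1, 1), (2, 6), (4, 4), (5, 3), (13, 16)]
print(manhattan_center(pts))          # (4, 4)
print(total_manhattan(pts, (4, 4)))   # 33
print(total_manhattan(pts, (5, 6)))   # 36, the arithmetic mean is worse here
```

The outlier at (13, 16) pulls the arithmetic mean away from the bulk of the points, whereas the median-based center stays among them.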

After obtaining the cluster centers, the balanced heuristic mechanism is introduced into the second stage of clustering. In traditional methods, when the $k$ cluster centers $(\mu_1, \mu_2, \ldots, \mu_k)$ are found in the first stage of the k-means algorithm, the remaining active tasks are allocated to the cluster with the smallest $d(t_j, \mu_i)$, $i = 1, 2, \ldots, k$. However, in our approach, the remaining active tasks are allocated to the cluster with the smallest $C(r_i, t_j)$ defined in (4).
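The balanced second stage can be sketched as below. This is a hedged illustration: it assumes one cluster per robot, interprets the distance term of (4) as the Manhattan distance from the cluster center to the task, and processes tasks in input order, none of which is fixed by the text. All names and numbers are hypothetical.

```python
def manhattan(a, b):
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def bhm_assign(tasks, centers, alpha=0.8):
    """Assign tasks to clusters with the BHM score instead of pure distance.

    tasks: list of ((x, y), task cost); centers: list of (x, y), one per robot.
    Returns a list of clusters, each a list of task indexes.
    """
    load = [0.0] * len(centers)             # accumulated task cost per cluster
    clusters = [[] for _ in centers]
    for idx, (loc, cost) in enumerate(tasks):
        scores = [alpha * manhattan(c, loc) + (1 - alpha) * load[i]
                  for i, c in enumerate(centers)]
        i = min(range(len(centers)), key=scores.__getitem__)
        clusters[i].append(idx)
        load[i] += cost                     # refresh the load of the chosen cluster
    return clusters


centers = [(0, 0), (10, 0)]
tasks = [((1, 0), 10), ((2, 0), 10), ((3, 0), 10), ((4, 0), 10), ((9, 0), 10)]
print(bhm_assign(tasks, centers, alpha=1.0))  # [[0, 1, 2, 3], [4]]
print(bhm_assign(tasks, centers, alpha=0.8))  # [[0, 1], [2, 3, 4]]
```

With alpha equal to 1 the result is the usual nearest-center assignment; lowering alpha trades some distance for more even cluster sizes.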
The main idea is that, instead of merely considering the travel cost, the balance of the cluster sizes is taken into account. A task tends to be allocated to a nearby cluster with fewer tasks. When a new task is added to cluster $i$, $C(r_i)$ is refreshed. After all the active tasks have been allocated, the genetic-based TSP algorithm [26] is used to plan the route for the tasks in each cluster. One example of the task allocation is demonstrated in Figure 7 ($m = 4$, $n = 50$, $\alpha = 0.8$), where different colors represent tasks allocated to different robots. The marked blocks are the clustering centers.
Step 1. Initialization.
Step 2. Creating orders in the task pool, where the arrival of orders follows a Poisson distribution.
Step 3. Sending a specific number (e.g., 200) of orders from the task pool to the controller.
Step 4. Using different methods to allocate the orders (tasks). More specifically, there are four approaches, namely, k-means, BHM k-means, auction, and BHM auction, and their performances are compared.
Step 5. Sending the allocated tasks from the controller to the corresponding robots, which begin to fulfil their tasks (using the A* algorithm to calculate the path and the more-task-prior mechanism to avoid collisions). Then go to Step 3.
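Steps 2 and 3 of the procedure above can be sketched in Python as follows. The sketch is simplified and rests on several assumptions: Poisson arrivals are generated through exponential inter-arrival times, order locations are drawn uniformly on a hypothetical 20 by 20 grid, and batches are drawn from a pre-generated pool without modelling the wait for the previous batch to finish. The function names and the rate value are illustrative.

```python
import random
from collections import deque


def generate_orders(rate, horizon, rng, grid=(20, 20)):
    """Orders of a Poisson process with the given rate, up to time `horizon`.

    Each order is (arrival time, pod location); locations are uniform on the grid.
    """
    t, orders = 0.0, []
    while True:
        t += rng.expovariate(rate)          # exponential gaps give Poisson arrivals
        if t > horizon:
            return orders
        orders.append((t, (rng.randrange(grid[0]), rng.randrange(grid[1]))))


def draw_batches(orders, batch_size=200):
    """Yield successive batches from the task pool, oldest orders first."""
    pool = deque(orders)
    while pool:
        yield [pool.popleft() for _ in range(min(batch_size, len(pool)))]


rng = random.Random(0)                       # fixed seed for repeatability
orders = generate_orders(rate=5.0, horizon=100.0, rng=rng)
for batch in draw_batches(orders, batch_size=200):
    # Step 4 would allocate `batch` here, e.g. with a BHM auction or BHM k-means.
    print(len(batch))
```

Each yielded batch corresponds to one round of allocation by the controller.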

Simulation Results
Analysis.
In this part, the three criteria mentioned in Section 2 are used to evaluate the performance of the proposed balanced heuristic mechanism, that is, coefficient of variation (CV), travel time (TT), and total travel cost (TTC). First of all, we compare the balance behavior of the improved methods with the typical auction method and the k-means method. The results are given in Tables 1 and 2, where the data are averaged over fifty simulation experiments. Both Tables 1 and 2 show that, with the proposed BHM, the degree of diversity in work groups is significantly reduced, which indicates a more equal workload distribution among robots.
As shown in Figures 8 and 9, the travel time (TT) of the multirobot system is provided for both the traditional method and the improved method. We can conclude that, by applying the BHM-based methods, the time spent to finish the given tasks is much less than with the traditional method, due to an efficient sharing of the workload.
Comparison results for traditional methods and BHM-based methods are shown in Figure 10 in terms of the total travel cost (TTC). We can conclude that in the balanced heuristic cases, although robots pay more attention to balancing the workload rather than merely minimizing the travel cost as traditional methods do, the TTC only increases by about 8%. Considering the obvious increase in robot utilization, this slightly increased power consumption is quite acceptable.
With the traditional auction method and k-means method, the task allocation process only concentrates on minimizing the travel cost. Once there is a robot nearest to an open task, other robots do not attempt to compete for the task even if they are idle. Thus the utilization of the system is insufficient and the time needed to fulfill the given tasks is high. With the BHM-based methods proposed in this paper, all robots can be utilized in a balanced and effective manner while the travel cost remains quite low. Then we study the influence of the variable α on the system performance. The value of α describes the relative weight of the system's energy-saving demand and time-saving demand. Figure 11 demonstrates the relationship between α and the travel time (TT). It shows that, in order to keep a good balance between minimizing the travel distance and efficient sharing of the workload, the coefficient is usually set between 0.7 and 0.9 to achieve the minimum TT.
Finally, besides comparing BHM-based approaches with the traditional auction method and k-means method, we use a more competitive algorithm (the CAB mechanism in [18]) to verify the merit of the BHM scheme (as shown in Figures 12 and 13). We can conclude that, with the BHM-based methods, the time spent to finish the given tasks is 5∼15% less than with the CAB mechanism; meanwhile, their total travel costs are almost the same. Thus, the benefit of the balanced heuristic mechanism to warehouse systems is significant.

Conclusion
In this paper, a novel balanced heuristic mechanism for the multirobot task allocation problem is presented with respect to the task cost and travel cost. To evaluate the proposed approach, we give three criteria, namely, travel time (TT), total travel cost (TTC), and coefficient of variation (CV), and compare some traditional methods with BHM-based methods under those criteria. Simulation results show that the proposed approach greatly enhances the balance behavior of the multirobot system and reduces the total travel time while achieving almost the same total travel cost as the traditional methods. In addition, the influence of the variable α on the system performance is studied in detail. In our future work, BHM will be applied to more complicated environments, which consist of heterogeneous tasks and robots. Furthermore, we will work on enhancing the robustness of the proposed approach in dynamic environments. Finally, it is worth pointing out that this new mechanism is applicable to other task allocation problems, such as the WSANs problem in [27].

Figure 1: Configuration of a warehouse system.

Figure 4: Example of the balanced heuristic mechanism.

Figure 6: Example of the Manhattan center.

Figure 7: Example of the clustering result.

Table 1: Balance behavior of single-item auction versus BHM auction.

Table 2: Balance behavior of typical k-means versus BHM k-means.