Task Allocation and Path Planning for Collaborative Autonomous Underwater Vehicles Operating through an Underwater Acoustic Network

Dynamic and unstructured multiple cooperative autonomous underwater vehicle (AUV) missions are highly complex operations, and task allocation and path planning are made significantly more challenging under realistic underwater acoustic communication constraints. This paper presents a solution for the task allocation and path planning for multiple AUVs under marginal acoustic communication conditions: a location-aided task allocation framework (LAAF) algorithm for multitarget task assignment and the grid-based multiobjective optimal programming (GMOOP) mathematical model for finding an optimal vehicle command decision given a set of objectives and constraints. Both the LAAF and GMOOP algorithms are well suited to poor acoustic network conditions and dynamic environments. Our research is based on an existing mobile ad hoc network underwater acoustic simulator and blind flooding routing protocol. Simulation results demonstrate that the location-aided auction strategy performs significantly better than the well-accepted auction algorithm developed by Bertsekas in terms of task-allocation time and network bandwidth consumption. We also demonstrate that the GMOOP path-planning technique provides an efficient method for executing multiobjective tasks by cooperative agents with limited communication capabilities. This is in contrast to existing multiobjective action selection methods that are limited to networks where constant, reliable communication is assumed to be available.


Introduction
Autonomous underwater vehicles (AUVs) represent one of the most challenging frontiers for robotics research. AUVs work in an unstructured environment and face unique perception, communication, and control difficulties. Currently, the state of the art in mission planning is dominated by single AUV operations using preplanned trajectories with offline postprocessing of the data collected during the mission. Multiple cooperative vehicle systems (MCVSs) hold great promise for use in large-scale oceanographic surveys, mine countermeasures (MCMs), and other underwater missions, due to better resource and task allocation [1][2][3].
Simultaneous use of multiple vehicles can improve performance, reduce mission time, and increase the likelihood of mission success. It is not necessary for all the vehicles in an operation to be the same. In fact, heterogeneity could become a powerful driver of multiple AUV (MAUV) operations.
Instead of using a single vehicle or a homogeneous fleet able to perform every possible mission, a fleet could comprise a variety of AUVs [4]. Different missions would be accomplished using different combinations of vehicles. The key to obtaining the greatest benefit from MAUV or combined operations is cooperation. Cooperative strategies among multiple underwater vehicles can be complex. In addition, communication is a critical aspect of vehicle cooperation and must not be trivialized, as underwater communication is notoriously difficult, slow, and limited in range [5].
The work presented here addresses the two closely coupled problems of multivehicle underwater operations, which are (1) how to assign tasks to each vehicle efficiently and robustly, and (2) how to ensure rapid and effective vehicle actions.
The main contributions of this article are the following: (1) the design and simulation of a path-planning controller that efficiently handles the cooperative operations

2.1. Problem Statement.
Our mission interest is to search and classify underwater targets. In this mission, two classes of vehicles are involved. Every search vehicle (SV) is equipped with a wide field-of-view target detection sonar. These sensors cover a large volume of water but do not have sufficient resolution to discern the specifics of the targets. In contrast, every classify vehicle (CV) is equipped with a narrow field-of-view, high-resolution sonar. These sonar systems cover a small volume of water with sufficient resolution for target classification. In this work, the main focus is the task acquisition and execution of the CVs' mission. The target distribution is assumed to be known a priori to SVs. A similar approach of using heterogeneous robots in a cooperative mission has been demonstrated [4,7,8] and is viewed as a feasible and cost-effective means to carry out search and classify missions.

Underwater Acoustic Communication.
Cooperative control and optimization for multiple collaborative autonomous underwater vehicles has become a vital area of research [1,2,4,6]. Although some of the prior work is closely related to this research, few studies address the same problems specifically related to underwater environments with realistic acoustic communication constraints [9][10][11]. Underwater acoustic networking is an enabling technology for various collaborative missions involving multiple autonomous underwater vehicles [5]. Unfortunately, underwater acoustic communication is slow and limited in range, which impacts MCVS controllers.
The proposed research problem differs significantly from those reported for unmanned aerial vehicles (UAVs), as intervehicle communication underwater can only be achieved via sound propagation [12,13]. In addition, most UAVs are equipped with high-precision onboard navigation sensors (with GPS), whereas the navigation of most underwater vehicles is still based on traditional dead-reckoning techniques or the fusion of motion sensors.

Dynamic Task Allocation.
In dynamic task allocation, every robot assignment is dynamically adjusted with changes in the environment or group performance. While coordination algorithms for task allocation use only local sensing [14], these algorithms do not take advantage of the underwater acoustic network capability and formal analysis tools are lacking.
Task allocation models can be classified as either centralized or distributed. In centralized models, a central agent exists and plays the role of arbitrator. This arbitrator aggregates information from its team members, plans optimally for the entire group, and finally propagates the task assignments to other team members. This model has the advantage of finding the optimal solution, but it generally falls short under harsh underwater communication conditions and in partially observable environments, as these controllers cannot handle tight MCVS coordination due to limited flexibility and long response time [6,15].
The distributed task allocation approach handles this shortcoming [16,17]. In a distributed multirobot system, each robot operates independently under local sensing, with high-level system coordination arising from interactions with other robots as well as the task environment [18]. In this approach, agents rely on a predefined negotiation framework that allows them to decide what activity to do next, what information to communicate, and to whom. A difficulty with this approach is that it requires agents to possess accurate knowledge about their environment and assumes that consistent communication among agents is available throughout the mission. Both conditions are difficult to maintain in heterogeneous underwater MCVS.
Task allocation among multiple robots is commonly accomplished by market-based methods, which have proven to be among the most successful solutions [12,19,20,21]. In traditional task assignment problems, market-based approaches are widely used due to their low computational complexity. The original form of the auction algorithm was first proposed by Bertsekas [19,22]. In this algorithm, unassigned agents bid simultaneously for objects, thereby raising their prices. The auction algorithm is widely used in task allocation for cooperating UAVs. In [23], Nygard et al. present a network flow optimization model for allocating UAVs to targets. This problem is formulated as a linear programming problem to obtain decisions for allocating the UAVs. The network model has been further studied by Schumacher et al. [12] and Mitchell. In [24], the authors present a team theoretic approach that allows UAVs to perform decision-making independently whenever UAVs cannot exchange information.
Typically, the task allocation problem divides a mission into individual subtasks and assigns robots to each subtask. Auctions are powerful tools for allocating resources effectively, especially in situations where optimal allocation is expensive to achieve or where the environment is not completely known.
Another well-accepted auction algorithm, effective for general multiagent task allocation, is Challenger [18]. The difficulty with this auction algorithm is that it stagnates when clustered targets exist in the search space, and it easily fails in harsh underwater environments. It is further challenged in an underwater acoustic mobile ad hoc network (MANET) due to limited network capacity and high latency [25].
As the size of the system grows or the network condition deteriorates, market-based approaches become inapplicable because of the increased demand for communication bandwidth. This leads to emergent coordination through negotiation or self-organized mechanisms. In such methods, individual robots coordinate their actions based solely on local sensing or local interactions. The advantages and disadvantages of the three task allocation methods are listed in Table 1.

Path-Planning and Multiple-Objective Optimization.
Path planning typically involves defining vehicle waypoints based on the minimization of a positive cost metric (typically a function of mission time or energy consumption), under certain constraints and objectives. In [27], a motion control algorithm is proposed, which constructs adjoint equations based on the vehicle velocity, the time to reach the target, and the optimal usage of the battery life.
Cooperative strategies among heterogeneous vehicles are difficult to devise. MCVSs with heterogeneous agents need an advanced formation structure, a dynamic task and resource allocation algorithm, and powerful techniques to construct and optimally solve a path-planning problem. Additional complexity arises from the challenges of limited navigation and communication capabilities in the underwater environment.
For a network of vehicles to communicate, additional constraints, such as transmission delay and throughput, would need to be considered. In this case, queue length and bandwidth efficiency must be accounted for. With the objective of maximizing throughput and queue length, a receding horizon control algorithm is proposed in [28], which yields unique, piecewise continuous optimal controllers that limit and route traffic.
There has been significant work reported on MCVS architectures applicable to multiple AUVs as well as on MCVS simulation platforms [13,29,30,31,32]. However, the research findings and conclusions from these works are based on the assumption of idealized underwater communication conditions. In [32], a simulation environment for the coordinated operation of multiple autonomous underwater vehicles is presented with five clearly defined control architecture layers: the physical layer, the abstraction layer, the functional layer, the coordination layer, and the organization layer. One limitation of this simulator is its model of the acoustic environment: simple time delays are used and no message routing is considered in this simulation platform.
Extensive research work has been done on cooperation and coordination of mobile agents. Traditional methods for cooperative underwater vehicles include swarmed cooperative schemes for homogeneous vehicles and low-level cooperative or merely coordinated techniques for simple task accomplishment [27,29,33]. For the coordinated maneuver of multiple vehicles, [27] presents a control maneuver-integrated acoustic navigation system for a formation of three AUVs and one surface craft performing gradient searching and following missions. In [30], the authors present a theoretical study of the coordination of the geometrical movements of a flotilla of autonomous underwater vehicles. However, these techniques are insufficient for mission planning when a fleet of vehicles operates in a MANET with a very limited communication bandwidth.
To better address the dynamic system with behavior-based strategies, an interval programming (IvP) method is described in [34,35] to solve multiobjective optimization problems: a mathematical programming model finds an optimal decision given a set of competing objective functions.

Location-Aided Task Allocation Framework Protocol
We propose a location-aided task allocation framework (LAAF) that addresses these challenges by extending the radio network auction algorithm developed by Chavez [18] and Bertsekas [19] to the AUV network. LAAF uses three types of strategies: centralized, negotiation, and self-organized [26]. This approach does not guarantee an optimal allocation, but it is especially suited to dynamic environments, where execution time might deviate significantly from estimates and where the ability to adapt to changing conditions is the key to success. In this framework, each robot considers its local plans when bidding and multiple targets can be allocated to a single robot during the negotiation process. The rest of this section provides a description, within the context of a search-classify AUV mission, of the LAAF auction algorithms.

Cost Function.
The mission of interest is to search and classify targets whose distribution is known a priori. Each target is modeled as a step utility function and considered to be classified when it is within the sensor range of a CV. Assume that $t_i$ is the time at which vehicle $i$ reaches its final waypoint. A cost function that penalizes the maximum completion time (to cover all targets) is traditionally assigned to each vehicle. Unfortunately, such a cost function may lead to exceedingly long trajectories for most of the vehicles. To avoid this problem, the authors minimize the cost-to-the-fleet (CTF) function (1) assigned to the SVs and CVs working cooperatively to identify the targets. In this function, $C_{ij}$ is the positively defined cost incurred by CV $i$ to service target $j$ and is defined as the time needed by this CV to reach the target (including foreseeable vehicle collision avoidance). $A_j$ is the positively defined cost of allocating the optimal CV to classify the target $j$. $A_j$ is a function of two variables computed in the auction algorithm (Section 3.2): the Task Allocation Round Time (TART) and the Effective Task Allocation Time (ETAT). Both TART and ETAT depend on the network conditions. TART and ETAT are computed by the simulation tool using the simulator clock. TART is calculated from the time a new classify task is announced to the time that the winner-acknowledge message reaches the auctioneer. The role of the auctioneer is assigned to SVs as they carry long-range search sensors. Compared to TART, ETAT ends when the self-determination process is complete. If the task allocation keeps failing due to exceedingly unreliable communication, the auctioneer will stop the task allocation process past a predefined timeout. If the task allocation timeout is reached, TART is set to a very large value (equivalent to infinity) [26]. The relationship between $A_j$, TART, and ETAT is

$$A_j = \begin{cases} \text{TART} & \text{if all bids are successfully received,} \\ \text{ETAT} & \text{if the auctioneer times out and at least one bid is received.} \end{cases} \tag{2}$$
The output action commands generated using the GMOOP model produce the locally optimal value of $C_{ij}$, the cost incurred by CV $i$ to service target $j$. The corresponding messages are designed to use as few data bytes as possible [36]. The message size is labeled Bytes for Task Allocation (BTA).

Generic Auction Algorithm.
The well-accepted auction algorithm developed by Bertsekas [19,22] is used as a reference in our analysis. We will refer to this algorithm as the generic auction algorithm from now on. This algorithm determines which robot can best complete a given task, based on their proposed bids. The protocol requires a single controller and an auctioneer. For each auction, the auctioneer needs prior knowledge of the number of bidders and the maximum expected round-trip time of a message in the network. This protocol is characterized by the following assumptions.
(1) A single vehicle can be an auctioneer, a bidder, or neither.
(2) Only one type of task can be auctioned among a given set of auctioneers and bidders, and all auctioneers and bidders have prior knowledge of all task types.
(3) All bidders have a cost function which accepts the auction data and returns a bid. The lowest bid wins the auction.
(4) Network broadcast is available, so that a single transmission may reach all the nodes in the network.
Each bidder replies to each auction announcement with a bid. The bid is an estimate of the time required for the vehicle to accomplish the task, which consists of the time needed to complete all equivalent or higher-priority tasks plus the transit time to the Area of Interest (AOI). Upon winning an auction, a bidder adds the AOI to its task queue. When an AOI task becomes active, the bidder proceeds directly to the AOI. Once in the AOI, if a target is successfully identified, the task is marked as completed. If a search over the entire AOI produces no positively identified targets, the task is marked as incomplete.
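The bid rule and the lowest-bid selection described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the `Bidder` class, the `select_winner` helper, and all numerical values are hypothetical, and message exchange over the acoustic network is left out.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Bidder:
    """A classify vehicle taking part in an auction (hypothetical structure)."""
    name: str
    # Estimated time (s) to finish each queued task of equal or higher priority.
    queued_task_times: List[float] = field(default_factory=list)

    def bid(self, transit_time_to_aoi: float) -> float:
        # Bid = time to clear the queued tasks + transit time to the new AOI.
        return sum(self.queued_task_times) + transit_time_to_aoi


def select_winner(bids: Dict[str, float]) -> Optional[str]:
    """Return the bidder with the lowest bid, or None if no bid arrived."""
    if not bids:
        return None
    return min(bids, key=bids.get)


# Illustrative values only.
cv1 = Bidder("CV1", queued_task_times=[120.0])
cv2 = Bidder("CV2", queued_task_times=[40.0, 60.0])
bids = {cv1.name: cv1.bid(200.0), cv2.name: cv2.bid(150.0)}
winner = select_winner(bids)
print(bids, winner)
```

With these numbers the second vehicle carries more queued work but is closer to the AOI, so its total bid is lower and it wins.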

Location-Aided Auction Framework.
The reference auction algorithm may fail occasionally due to the unreliable underwater acoustic network. In this paper, the simulated acoustic modem (FAU DPAM) performs poorly beyond 4000 m. In the simulator, the medium model will consistently return a Frame Error Rate (FER, or message error rate) close to 100% beyond 4 km [37]. As shown in Figure 1, even under best-case conditions, the maximum useful range of the modem is not expected to exceed 4096 meters. Therefore, a moving classify vehicle may not be able to return a bid during an auction due to poor communication. Note that the performance curves shown in Figure 1 have been derived by fitting a Nakagami-m model to actual field data [37]. These data were collected under environmental conditions closely resembling those used in the simulation.

Overview.
A solution to this problem is to add a self-decision capability to each classify vehicle, based on its knowledge about the overall fleet topology. Each CV accepts or denies the target assignment according to its location relative to the target, any known locations of other vehicles, and any neighbor decision received via the acoustic modem. Once decisions have been made, the same CV will disseminate the results to the entire network.
The benefits of self-determination are that a CV can (1) always attempt to classify a target if needed, (2) utilize the available topology information, periodically updated as the mission progresses, and (3) reduce the need for underwater communication and, in turn, the response time between detection and classification. None of these features is possible in a generic auction cycle. The shortcoming of this self-determination mechanism is that multiple vehicles may attempt to classify the same target, due to the limited accuracy of the topology information. This is alleviated by the fact that, as the classify vehicles approach the same target, the acoustic communication improves. The classify vehicles then have a more accurate knowledge of the topology and can decide whether to abort the target classification.
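One possible form of such a self-determination rule is sketched below. The source does not give the decision rule itself, so the criterion used here (accept only if no neighbor has already claimed the target and no known vehicle has a shorter straight-line transit time) is an assumption, as are the function names, the common speed, and the tie-breaking by vehicle name.

```python
import math
from typing import Dict, Tuple

Point = Tuple[float, float]  # (east, north) position in metres


def travel_time(position: Point, target: Point, speed: float) -> float:
    """Straight-line transit time in seconds at a constant speed (m/s)."""
    return math.dist(position, target) / speed


def self_determine(
    own_name: str,
    own_position: Point,
    target: Point,
    known_positions: Dict[str, Point],
    neighbor_claims: Dict[str, bool],
    speed: float = 1.5,
) -> bool:
    """Return True if this CV should attempt to classify the target.

    known_positions: last known positions of the other CVs (possibly stale).
    neighbor_claims: neighbours whose acceptance of the target was received.
    """
    # Defer to any neighbour that has already announced it takes the target.
    if any(neighbor_claims.values()):
        return False
    own_time = travel_time(own_position, target, speed)
    for name, position in known_positions.items():
        other_time = travel_time(position, target, speed)
        # Break exact ties by name so that two vehicles do not both accept.
        if (other_time, name) < (own_time, own_name):
            return False
    return True


# CV1 is 500 m from the target, CV2 is believed to be 600 m away.
accept = self_determine("CV1", (0.0, 0.0), (300.0, 400.0), {"CV2": (900.0, 400.0)}, {})
defer = self_determine("CV1", (0.0, 0.0), (300.0, 400.0), {"CV2": (900.0, 400.0)}, {"CV2": True})
print(accept, defer)
```

Because `known_positions` may be stale, two vehicles can still reach different conclusions, which is the duplicate-classification risk noted in the text.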
Based on these observations, a location-aided auction framework (LAAF) has been developed. LAAF is an extension of the generic auction algorithm that incorporates a negotiation among all bidders and utilizes the available topology information of the local vehicle. It is designed to meet the challenges of high latency and limited bandwidth of the acoustic network linking the underwater agents. LAAF uses a master-slave architecture which handles most of the allocation work through the acoustic network. If the agents are too far apart to communicate, they identify their individual tasks by reasoning on their available world information. Figure 2 gives an overview of LAAF: when a search vehicle discovers any new target(s), the search vehicle starts an auction for the task(s). It broadcasts an auction announcement message that has a header field to distinguish between single and multiple items.
In a single-item auction, once the auctioneer has received bids from all bidders, it chooses the best bid and sends out a winner notification message to all the bidders. The auctioneer then waits for the winner-acknowledge message from the winner and closes the current auction. If the auction times out before the auctioneer receives the winner-acknowledge message, the auction will be reinitiated.
In the multi-item scenario, a bid message received by the auctioneer contains the optimal configuration for classifying the targets by a set of vehicles, and the auctioneer simply broadcasts the winners to the whole network. In this case, the winner notification is not necessary in LAAF as ETAT is used to evaluate the task assignment performance. There are two ways in which an auctioneer can transition to the auction-close state: (1) the auctioneer successfully receives a winner-acknowledge message, so the target(s) are successfully assigned through the auction process; (2) due to poor network conditions, the auction times out and the classify vehicles are left to determine their own tasks based on their knowledge of the fleet topology.
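The way the two close states map onto the allocation cost can be sketched as below, following the TART/ETAT relationship stated in the Cost Function section. The function name and arguments are hypothetical, and returning infinity when no usable allocation exists is an assumption that extends the remark that TART is set to a very large value on timeout.

```python
import math


def allocation_cost(
    tart: float,
    etat: float,
    bids_received: int,
    bidders_expected: int,
    timed_out: bool,
) -> float:
    """Cost (s) of allocating a CV to a target, given the auction outcome."""
    if not timed_out and bids_received == bidders_expected:
        # Auction closed normally: every bid arrived before the timeout.
        return tart
    if timed_out and bids_received >= 1:
        # Auction timed out: the CVs fall back on self-determination.
        return etat
    # Assumption: no bid at all, or the auction is still open.
    return math.inf


normal_close = allocation_cost(30.0, 45.0, 2, 2, timed_out=False)
timeout_close = allocation_cost(math.inf, 45.0, 1, 2, timed_out=True)
print(normal_close, timeout_close)
```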

LAAF Bidder Agent's Policy.
Algorithm 1 shows the LAAF algorithm used by a bidder to respond to bid requests from an auctioneer. When a bidder is in the bidding state, it keeps track of the current cost of classifying all the known unclassified targets (base cost), which is added to any further bid calculations. The policy of the bidder in the case of a single-item auction is intuitive; it adds the cost to classify the new task to its base cost and sends it back to the auctioneer.
Since the time slot allocated for each node by the Medium Access Control (MAC) is fairly long (typically 5 seconds), LAAF processes multiple items within an auction cycle. In comparison, single-item task allocation generally requires more auction cycles to complete the task allocation. When the bidder agent receives a multiple-item auction request, it selects the target(s) to bid for. An intuitive solution is to compute all the possible combinational bids and send the bid information back to the auctioneer to select the winner(s). The problem with this method is that for $n$ concurrent targets, each bid message contains $P(n, 1) + P(n, 2) + \cdots + P(n, n)$ combinational bids, where $P(n, k)$ is the permutation function [36]. This approach requires a large amount of data to be passed between the bidder and the auctioneer in a slow underwater acoustic network.
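To give a sense of how quickly this sum grows, the short sketch below evaluates it for a few target counts using Python's standard `math.perm`. The symbol n for the number of concurrent targets and the helper name are illustrative choices, not taken from the source.

```python
from math import perm


def combinational_bid_count(n_targets: int) -> int:
    """Number of ordered target subsets: P(n,1) + P(n,2) + ... + P(n,n)."""
    return sum(perm(n_targets, k) for k in range(1, n_targets + 1))


counts = {n: combinational_bid_count(n) for n in range(1, 6)}
print(counts)  # {1: 1, 2: 4, 3: 15, 4: 64, 5: 325}
```

Even five concurrent targets would require 325 combinational bids per message, which is hard to fit into a slow acoustic link.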
To overcome this problem, each bidder initiates a negotiation process with other bidders. When a bidder receives a multiple-item auction, it calculates a bid for each target in the task announcement and disseminates the bid information through the network. In this case, only $n$ bids (one per announced target) are disseminated by each classify vehicle. Once a bidder has collected enough negotiation messages from other bidders, it determines the optimal combination of target assignments. By finding the minimum value of all the combinational bids, this bidder finds the optimal list of vehicles to accomplish the multiple-target auction.
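The local search for the best combination can be sketched as an exhaustive enumeration over target-to-vehicle assignments. The source does not specify how a combinational bid is built from the per-target bids; the sketch assumes it is simply the sum of the selected bids, and the function name and bid values are hypothetical.

```python
from itertools import product
from typing import Dict, List, Tuple


def best_assignment(
    bids: Dict[str, Dict[str, float]],
) -> Tuple[Dict[str, str], float]:
    """Exhaustively search target-to-vehicle assignments for the lowest cost.

    bids[vehicle][target] is that vehicle's bid (estimated time, s) for that
    target. Assumption: the cost of an assignment is the sum of the selected
    bids, which ignores any coupling between tasks given to the same vehicle.
    """
    vehicles: List[str] = sorted(bids)
    targets: List[str] = sorted({t for per_vehicle in bids.values() for t in per_vehicle})
    best: Dict[str, str] = {}
    best_cost = float("inf")
    # Each target goes to exactly one vehicle: len(vehicles) ** len(targets) cases.
    for choice in product(vehicles, repeat=len(targets)):
        cost = sum(bids[v].get(t, float("inf")) for v, t in zip(choice, targets))
        if cost < best_cost:
            best, best_cost = dict(zip(targets, choice)), cost
    return best, best_cost


# Bids collected during negotiation (illustrative values).
bids = {
    "CV1": {"T3": 310.0, "T4": 180.0},
    "CV2": {"T3": 150.0, "T4": 260.0},
}
assignment, total = best_assignment(bids)
print(assignment, total)
```

Under the additive assumption, picking the lowest bidder for each target would give the same result; the enumeration is kept so that a non-additive cost model could be substituted in the inner sum.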
At the end of this auction, the auctioneer acknowledges the winner. In case of an acknowledgment timeout, the auctioneer reannounces the auction (targets) and starts a new auction process with the same target(s).
Simulation results show that when multiple classify tasks are acquired, a CV should reschedule the task list to achieve optimality [36]. For example, if a CV is on its way to classify a remote target and receives a new task near the current route, this vehicle should adjust the priorities in its task list.

Path-Planning and Multiple-Objective Optimization
The GMOOP model is designed to find the optimal solutions for a search-classify mission using an action determination map subject to static and dynamic constraints and objectives [36]. Derived from the IvP algorithm [35], the GMOOP model has a similar solution strategy of representing and optimizing over multiple competing objective functions. The main differences are that the GMOOP model uses grid maps and addresses the impact of unreliable underwater acoustic communications. In particular, the output of a GMOOP model is the optimized values for command variables.
From (1), we know that the general expression of the search-classify mission overall cost to the fleet is CTF, where $A_j$ is the price paid to find the optimal classify vehicle for target $j$. Thus, in the path-planning problem only, the objective is to minimize the cost metric $C_{ij}$, which is the cost incurred by classify vehicle $i$ to service target $j$.
The problem space can be formulated as a set of states denoting vehicle locations connected by directional arcs, each of which has an associated cost. The vehicle starts from an initial state (at the start location) and moves across arcs to other states until it reaches the goal state $G$. A potential field is defined between every combination of neighboring states $X$ and $Y$ to calculate the cost $c(X, Y)$ of traversing an arc from state $X$ to state $Y$. The optimal neighbor state $Y^*$ is found by minimizing this cost function. The minimum cost function is the variable $C_{ij}$ used in (1).
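A minimal sketch of this neighbor-selection step on a small grid follows. The arc cost used here (step length, plus distance to the goal, plus a repulsive term around one obstacle) is a hypothetical potential field chosen for illustration; the source does not give its exact form, and the grid size, obstacle, and function names are assumptions.

```python
import math
from typing import List, Tuple

State = Tuple[int, int]  # grid cell (row, col)


def neighbors(state: State, rows: int, cols: int) -> List[State]:
    """Return the 8-connected neighbours of a cell that lie inside the grid."""
    r, c = state
    cells = []
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            nr, nc = r + dr, c + dc
            if (dr, dc) != (0, 0) and 0 <= nr < rows and 0 <= nc < cols:
                cells.append((nr, nc))
    return cells


def arc_cost(src: State, dst: State, goal: State, obstacle: State) -> float:
    """Hypothetical arc cost: step length + attraction to goal + repulsion."""
    step = math.dist(src, dst)
    attraction = math.dist(dst, goal)
    repulsion = 5.0 / (math.dist(dst, obstacle) + 0.1)
    return step + attraction + repulsion


def follow_field(start: State, goal: State, obstacle: State,
                 rows: int = 10, cols: int = 10, max_steps: int = 100) -> List[State]:
    """Repeatedly move to the neighbour with the lowest arc cost."""
    path = [start]
    state = start
    while state != goal and len(path) <= max_steps:
        state = min(neighbors(state, rows, cols),
                    key=lambda nxt: arc_cost(state, nxt, goal, obstacle))
        path.append(state)
    return path


path = follow_field(start=(0, 0), goal=(9, 9), obstacle=(2, 7))
print(path)
```

A purely greedy descent of this kind can stall in a local minimum of the field, which is why the step count is capped by `max_steps`.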

Model Elements.
The GMOOP algorithm (Figure 3) uses a set of functions that address specific behavior patterns. These functions are defined over an action variable space or action determination grid map. Grid maps are widely used in vehicle localization and environment estimation [38][39][40][41]. The GMOOP algorithm produces an action determination grid map which results from the weighted combination of a series of objective functions. These behavior functions are constrained by the objectives of every vehicle. Collision avoidance from nearby vehicles and fast target engagement are two typical behavior patterns.
Each behavior pattern uses a mathematical function of the relative distance between two vehicles or between a vehicle and a target [36]. The action determination grid map contains the value of a specific objective function given a vehicle velocity vector. The variation in time of such a map can be used to predict the trajectory of the vehicle. Each vehicle also carries a vehicle distribution grid map, which contains the location of other vehicles in the search space and areas already covered by a vehicle. Due to uncertainty in the navigation information and acoustic communication, the position information for each vehicle is the combination of a mean value and a probabilistic Rayleigh model.
The objective function, defined by the command variables, is applied to the action determination grid map and produces the weighted combination $\sum_k w_k f_k(v_N, v_E)$, where $f_k$ is a behavior function, designed following the potential field approach [42]: neighbor vehicles exert repulsive forces, while the target applies an attractive force to the robot. $w_k$ is the weight for this behavior. $(v_N, v_E)$ represents the north and east components of the vehicle velocity in geographical coordinates.
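The weighted combination over a grid of candidate velocities can be sketched as follows. Everything specific here is an assumption made for illustration: the two behavior functions, their weights, the one-second prediction step, the grid resolution, and the choice to maximize the combined value (the source does not state whether the combined function is maximized or minimized).

```python
import math
from typing import Callable, List, Tuple

Velocity = Tuple[float, float]  # (north, east) components in m/s
Point = Tuple[float, float]     # (north, east) position in metres
Behavior = Callable[[Velocity], float]


def velocity_grid(max_speed: float, step: float) -> List[Velocity]:
    """Candidate velocities on a square grid, limited to max_speed."""
    n = int(round(max_speed / step))
    values = [i * step for i in range(-n, n + 1)]
    return [(vn, ve) for vn in values for ve in values
            if math.hypot(vn, ve) <= max_speed + 1e-9]


def make_attraction(position: Point, target: Point, dt: float = 1.0) -> Behavior:
    """Higher utility for velocities that end the step closer to the target."""
    def utility(v: Velocity) -> float:
        nxt = (position[0] + v[0] * dt, position[1] + v[1] * dt)
        return -math.dist(nxt, target)
    return utility


def make_repulsion(position: Point, other: Point, standoff: float,
                   dt: float = 1.0) -> Behavior:
    """Penalty that grows as the predicted position enters the standoff radius."""
    def utility(v: Velocity) -> float:
        nxt = (position[0] + v[0] * dt, position[1] + v[1] * dt)
        return -max(0.0, standoff - math.dist(nxt, other))
    return utility


def best_velocity(grid: List[Velocity],
                  weighted: List[Tuple[float, Behavior]]) -> Velocity:
    """Pick the grid cell that maximizes the weighted sum of behaviors."""
    return max(grid, key=lambda v: sum(w * f(v) for w, f in weighted))


grid = velocity_grid(max_speed=2.0, step=0.5)
here, target = (0.0, 0.0), (100.0, 0.0)

# No vehicle nearby: head straight for the target at full speed.
clear = best_velocity(grid, [(1.0, make_attraction(here, target)),
                             (10.0, make_repulsion(here, (0.0, 500.0), 5.0))])
# Another vehicle 3 m ahead, inside the 5 m standoff: back away first.
blocked = best_velocity(grid, [(1.0, make_attraction(here, target)),
                               (10.0, make_repulsion(here, (3.0, 0.0), 5.0))])
print(clear, blocked)
```

Raising the repulsion weight relative to the attraction weight makes the vehicle more conservative near its neighbors, which is the trade-off the behavior weights are meant to control.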

Multiple Autonomous Underwater Vehicle Control Model Used by GMOOP
The mission assigned to an underwater vehicle impacts the underwater navigation and underwater communication performance. The simulated missions emphasize rapid target classification using cooperating AUVs which are operating in a congested coastal area. Therefore, collision between vehicles is a real concern. The scenarios considered in this section emphasize the continuous trade-off between collision avoidance, remaining within communication range, and target engagement.

GMOOP Control Loop.
Here we consider the case of a single vehicle controlled by the GMOOP controller. The required operation involves a vehicle moving through time and space, where periodically, at fixed time intervals, a decision is made as to how to control the next move of the vehicle. As depicted in Figure 4, the next decision occurs at time $T_m$, while the output decision $D_m$ is computed by building and solving the GMOOP problem in the time interval $[T_{m-1}, T_m]$. The control loop builds and solves the GMOOP model iteratively (Figure 5). Each objective function is defined over a common decision space where each decision precisely spells out the next action for the vehicle to implement starting at time $T_m$.

Vehicles and Environment.
The mission of interest is to search and classify underwater targets. Search vehicles act as auctioneers while classify vehicles act as bidders in the task planning process. Both the search and classify vehicles use the GMOOP controller with a group of different predefined behaviors $f_k(\cdot)$ (Figure 3). The vehicle task information includes (a) search or classification information from onboard sensors, (b) destination information assigned by the task auctioneer, (c) other vehicles' actions and position information obtained through the acoustic network, and (d) the vehicle position.

Simulation Results
6.1. Simulator.
The simulation tool is the MANET simulator presented in [25,26,37], upgraded to support MCVS operations. Several routing protocols can be used in the MANET [25], if necessary. The simulator requires a breadth of inputs, including the number of vehicles, vehicle types, vehicle identification and initial position, target distribution, communication performance, environmental conditions, and priority of various behavior patterns.
The event-driven simulator uses a dedicated process, labeled world, to maintain the clock and environment objects, the true vehicle models, target models, and the sensor (perception) models. Each vehicle object represents an autonomous entity in the environment. The vehicles are created by the user, separate from the simulated environment, and are added to the world before the simulation begins. The world makes the necessary connections between the vehicle systems and the environment. A vehicle requires a helm to maneuver, a protocol stack to communicate, and a sensor to detect objects in the environment.
Figure 6 shows a sequence diagram between the mission- and path-planning controllers and other vehicle modules. It depicts a sequence of messages between multiple modules. The new task flows forward from the vehicle sensor model to the path-planning controller via the mission-planning controller. This new task decision is made by the mission-planning controller based on local vehicle sensor detections as well as on the group state received from network channels.

Simulation Using Reliable Acoustic Communications.
The target search space is a 1000-by-1000-meter box. The simulation results presented in this paper use one SV and two CVs with a predefined number of targets. The acoustic modem MAC uses time division multiple access (TDMA): every vehicle has a preassigned 5-second transmission slot. Since three vehicles are present in this network, each vehicle transmits information every 15 seconds. We assume that the acoustic modems are operated under mild conditions, that is, at full power with a background noise PSD of 55 dB re 1 μPa/√Hz and some fading (represented by the Nakagami coefficient $m = 2$) [37]. This would correspond to operating the vehicles in shallow waters over a sandy bottom in calm seas. We also keep the relative distance between any two vehicles at any given time below 300 m. As a result, messages are almost guaranteed to reach their destination (the simulated FER versus range is shown in Figure 1(a)). The results presented here do not require any data routing, as the vehicles always remain within reliable communication range of one another.
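The TDMA timing quoted above can be reproduced with a short calculation. The sketch assumes that slots are assigned in node-index order starting at time zero, which the source does not state; the function names are hypothetical.

```python
import math


def tdma_cycle_time(n_nodes: int, slot_s: float) -> float:
    """Duration of one full TDMA cycle in seconds."""
    return n_nodes * slot_s


def next_slot_start(node_index: int, n_nodes: int, slot_s: float, now_s: float) -> float:
    """Start time of the node's next transmission slot at or after now_s.

    Assumption: node k owns the k-th slot of every cycle, cycles start at t = 0.
    """
    cycle = tdma_cycle_time(n_nodes, slot_s)
    offset = node_index * slot_s
    if now_s <= offset:
        return offset
    cycles_elapsed = math.ceil((now_s - offset) / cycle)
    return offset + cycles_elapsed * cycle


cycle = tdma_cycle_time(n_nodes=3, slot_s=5.0)
# A node owning the second slot that has something to send at t = 8 s.
wait_until = next_slot_start(node_index=1, n_nodes=3, slot_s=5.0, now_s=8.0)
print(cycle, wait_until)
```

Under this slot ordering, a message generated just after a node's slot has started must wait for the next cycle, so a single announcement can be delayed by up to one full cycle before it is even transmitted.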
The SV is configured to broadcast target announcements to the CVs at specific times, simulating the detection of new targets. Once a CV identifies an unaccomplished target mission in its list, the GMOOP path-planning controller generates an optimal control action every second, with the objective of approaching the target as quickly as possible under the constraint of collision avoidance.

Performance Comparison between Auction Algorithms.
Figure 7 shows the specific trajectories of a search-classify mission when the two different auction algorithms are used. The CVs have preassigned classification tasks (T1 and T2 for CV no. 1 and CV no. 2, respectively). At the mission time of 8 seconds, the search vehicle (SV) identifies two new targets, T3 and T4, simultaneously. The two classify tasks are announced sequentially in the case of a generic auction algorithm and simultaneously in the case of LAAF. In the case of a generic auction algorithm, CV no. 1 wins both classify task bids. Both new tasks are added to CV no. 1's task list, and CV no. 2 stops once it accomplishes the classify task of T2. In the case of LAAF, the negotiation assigns the concurrent new tasks T3 and T4 to different classify vehicles. Thus, CV no. 1 continues on to T4 after servicing T1, and CV no. 2 heads for T3 after servicing T2.
The differences between the two auction algorithms are further illustrated in Figure 8. If the generic auction algorithm is used, the two successive auction processes are shown using black and red arrows, respectively. If LAAF is used, only one cycle of task allocation is needed, and the bidders negotiate to select the targets to service (the negotiation processes are highlighted in red in this case). The simulation results are given in Table 2. All three metrics show that LAAF outperforms the generic auction algorithm and leads to a quicker completion of the mission.
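The contrast between announcing targets one at a time and announcing them together can be illustrated with a toy example. The sketch below is neither the LAAF negotiation protocol nor the generic auction algorithm; it assumes, purely for illustration, that a bid is the straight-line distance from a vehicle's current position to the target, and all coordinates are invented. It only shows why handling concurrent targets jointly can spread tasks across vehicles where independent auctions would not.

```python
import math
from itertools import permutations


def dist(a, b):
    """Straight-line distance between two (x, y) points in meters."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def sequential_auction(vehicles, targets):
    """One independent auction per target; the closest vehicle wins each one."""
    return {t: min(vehicles, key=lambda v: dist(vehicles[v], pos))
            for t, pos in targets.items()}


def joint_assignment(vehicles, targets):
    """Targets handled together: each vehicle takes at most one new target,
    and the pairing with the smallest total distance is chosen."""
    names = list(targets)
    best = min(
        permutations(vehicles, len(names)),
        key=lambda combo: sum(dist(vehicles[v], targets[t])
                              for v, t in zip(combo, names)),
    )
    return dict(zip(names, best))


# Hypothetical positions, chosen only to reproduce the qualitative outcome.
vehicles = {"CV1": (0.0, 0.0), "CV2": (100.0, 0.0)}
targets = {"T3": (40.0, -10.0), "T4": (30.0, 10.0)}

print(sequential_auction(vehicles, targets))
print(joint_assignment(vehicles, targets))
```

With these invented coordinates, the independent auctions give both targets to CV1, whereas the joint pairing sends CV2 to T3 and CV1 to T4. The brute-force search over permutations is only practical for a handful of vehicles and targets.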

Path-Planning Results.
In search and classify missions, there are occasions where the AUVs are so close to one another that collision becomes likely. The scenario considered in this section centers around the need to keep a minimum standoff distance while simultaneously transiting to a destination as quickly and directly as possible. A GMOOP problem is created and solved every second through the control loop. The output of each action determination is a commanded vehicle speed vector, with components in the north and east directions.
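One control-loop iteration of this kind can be sketched as a search over a grid of candidate velocity commands. The code below is a simplified stand-in, not the GMOOP formulation: it assumes a single objective (distance to the target after one time step), a single hard constraint (minimum standoff from one other vehicle, treated as stationary over the step), and hypothetical parameter values for maximum speed, standoff distance, and grid resolution.

```python
import math


def select_velocity(own, other, target, max_speed=1.5,
                    standoff=20.0, horizon=1.0, steps=7):
    """Pick a (north, east) velocity command from a square grid of candidates.

    Positions are (north, east) in meters. Candidates that exceed max_speed
    or that would bring the vehicle closer than `standoff` to `other` after
    `horizon` seconds are discarded. If no candidate is feasible, the sketch
    falls back to (0.0, 0.0), that is, it holds position.
    """
    best, best_cost = (0.0, 0.0), math.inf
    for i in range(steps):
        for j in range(steps):
            vn = -max_speed + 2 * max_speed * i / (steps - 1)
            ve = -max_speed + 2 * max_speed * j / (steps - 1)
            if math.hypot(vn, ve) > max_speed + 1e-9:
                continue  # outside the speed envelope
            nxt = (own[0] + vn * horizon, own[1] + ve * horizon)
            if math.hypot(nxt[0] - other[0], nxt[1] - other[1]) < standoff:
                continue  # would violate the standoff distance
            cost = math.hypot(nxt[0] - target[0], nxt[1] - target[1])
            if cost < best_cost:
                best, best_cost = (vn, ve), cost
    return best


# With the other vehicle far away, the command points straight at the target.
print(select_velocity(own=(0.0, 0.0), other=(500.0, 500.0), target=(100.0, 0.0)))
```

Calling such a function once per second with updated positions would yield a new speed vector at each step. A weighted combination of several objectives could replace the single cost term, but the weights would be further assumptions.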
Figure 9 shows a best-case scenario in which a four-target mission is to be completed by two classify vehicles and one search vehicle. The search vehicle (SV) is assumed to remain at the same location throughout the mission and broadcasts the four targets' location information at certain predefined times. The first and second targets (labeled Tar no. 1 and Tar no. 2) are announced simultaneously by the search vehicle at t = 8 seconds. Subsequently, the third [...] Figure 9, all four targets are allocated to the "best" candidates and are classified under the GMOOP controller. The metrics for each classified target are listed in Table 3. Although two targets are scheduled by the search vehicle at the same time in our scenario, they are announced as independent tasks and broadcast separately to the bidders. In this scenario, there is a time lag of 30 seconds (due to the acoustic modem MAC) between two successive target announcements. The winning bidder of Tar no. 2 is CV no. 2, indicating that CV no. 2 is closer to the target when the two classify vehicles send their bids. The total mission time is 214 seconds.
The GMOOP-controlled trajectories turn out to be simple in this scenario, because the two classify vehicles keep a relatively large distance from each other. Thus, the only objective considered in the GMOOP model is to transit to the target as quickly as possible.

Simulated Impact of Using Unreliable Acoustic Communications on GMOOP and LAAF.
We now assume that the acoustic modems are operated under more severe conditions, that is, at full power with a background noise PSD of 85 dB re 1 μPa/√Hz and severe fading (represented by the Nakagami coefficient m = 1.5) [37]. Operating the vehicles over a shallow water reef in rough seas is a good example. We still keep the relative distance between any two vehicles at any given time below 300 m. As a result, messages are much less likely to reach their destination (the simulated FER versus range is shown in Figure 1(b)).
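To get a feel for how frame errors compound when announcements are broadcast only once, the following sketch computes delivery probabilities under strong simplifying assumptions: frame losses are independent across links and across messages, every link has the same frame error rate, and an announcement counts as delivered if at least one receiver gets it. The FER value of 0.5 is a hypothetical number chosen for illustration, not a value read from Figure 1(b).

```python
def p_reach_at_least_one(fer: float, receivers: int) -> float:
    """Probability that a single broadcast reaches at least one receiver,
    assuming independent losses with the same frame error rate per link."""
    return 1.0 - fer ** receivers


def p_all_announcements_delivered(fer: float, receivers: int,
                                  announcements: int) -> float:
    """Probability that every one of several single-shot announcements
    reaches at least one receiver (independence assumed throughout)."""
    return p_reach_at_least_one(fer, receivers) ** announcements


# Hypothetical example: FER = 0.5, two classify vehicles, four announcements.
print(p_reach_at_least_one(0.5, 2))
print(p_all_announcements_delivered(0.5, 2, 4))
```

Even when a single announcement is likely to reach at least one vehicle, requiring every announcement in a mission to get through lowers the overall success probability quickly, which is consistent with some runs failing when no retries are used.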
Here, we only use LAAF and GMOOP in the simulations. Figure 10 shows scenarios where two classify vehicles are tasked to classify four targets. Eight simulation runs were carried out using the same initial scenario state; four runs are shown. Once again, each bid is sent once (no retry). The initial task allocation messages between the SV and the CVs are sketched using black lines. We find that the entire mission is completed so long as the four classify tasks reach the CVs. From Figure 10(b), we note that the trajectories of both classify vehicles are the same as those shown in Figure 9 (with the best environmental conditions). Figures 10(b) and 10(d) show two successful missions, while Figures 10(a) and 10(c) show two failed missions due to timeout. In Figure 10(a), the search vehicle is able to transmit the Tar no. 1 location information to CV no. 1 at t = 12.7 seconds, but this bidding announcement does not reach CV no. 2. Thus, the winning bidder for Tar no. 1 is CV no. 1. At t = 42.7 seconds, the bidding announcement successfully reaches both bidders, and CV no. 1 wins again because it is closer to Tar no. 2. The bidding announcements for Tar no. 3 and Tar no. 4 do not reach any of the CVs, causing the mission to fail.
The performance metrics for each target are listed in Table 4. Compared to Table 2, we note that, for Tar no. 2 and Tar no. 4, there is a large increase in TART and ETART but a decrease in BTA. This is because the winning bidder cannot receive the winner notification message due to the poor network condition. After the predefined timer expires, the classify vehicles switch to the self-determination mode defined in the LAAF auction architecture and pick targets on their own.

Conclusion
In this paper, the authors introduced GMOOP and LAAF. The GMOOP model is intended to balance the competing objectives and constraints for a vehicle in the mission. The LAAF task allocation algorithm is specially designed for the harsh underwater network. The GMOOP and LAAF algorithms are implemented in the mission- and path-planning module of a simulator. The simulation results show that the location-aided auction strategies performed significantly better than the generic auction algorithm in terms of effective task allocation time and byte usage. In a simplified task allocation case, the EFAT was reduced by 76.1% and the bytes used for task allocation were reduced by 16.7% due to the self-determination mechanism and multiple-target handling technique. These improvements are essential for communication-based underwater operations in which mission time and bandwidth are critical criteria. In addition, LAAF could still be used with some level of success if underwater acoustic communications become unreliable.

Figure 4: The time interval of a decision-making cycle.

Figure 6: A sequence diagram of mission and path-planning controllers.

Figure 7: Trajectories for search-classify mission using a generic auction algorithm (a) and LAAF (b). Axes: east direction (m) and north direction (m). Legend: trajectories of CV no. 1, CV no. 2, and SV no. 1; auction-related messages for T3 and T4.

Figure 8: Communication messages for task allocation.

Figure 9: Four targets mission with 2 classify vehicles and 1 search vehicle.

Figure 10: Four executions of the same mission using 4 targets, 2 classify vehicles, and LAAF when acoustic communication is unreliable. Panels (a) to (d) each show trajectories for the search-classify mission. Axes: east direction (m) and north direction (m).

Table 2: Comparison between generic auction and LAAF, best-case scenario.

Table 3: Task allocation metrics evaluated for each target, best-case scenario.

Table 4: Allocation metrics evaluated for each target, worst-case scenario amongst successful missions.