Cloudroid Swarm: A QoS-Aware Framework for Multirobot Cooperation Offloading

Computation offloading has been widely recognized as an effective way to promote the capabilities of resource-constrained mobile devices. Recent years have seen a renewal of the importance of this technology in the emerging field of mobile robots, where it supports resource-intensive robot applications. However, cooperating to solve complex tasks in the physical world, which is a significant feature of a robot swarm compared to traditional mobile computing devices, has not received in-depth attention in research on traditional computation offloading. In this study, we propose an approach named cooperation offloading, which offloads the intensive communication among robots as well as the computation for compute-intensive and data-intensive tasks. We analyze the performance gain of cooperation offloading by formalizing multirobot cooperative models; in addition, we study offloading decisions. Based on this approach, we design a cloud robotic framework named Cloudroid Swarm and develop several QoS-aware mechanisms to provide a general solution to cooperation offloading with QoS assurance in multirobot cooperative scenes. We implement Cloudroid Swarm to transparently migrate multirobot applications to cloud servers without any code modification. We evaluate our framework using three different multirobot cooperative applications. Our results show that Cloudroid Swarm can be applied to various robotic applications and real-world environments and brings significant benefits in terms of both network optimization and task performance. In addition, our framework scales well and can support as many as 256 robot entities simultaneously.


Introduction
The idea of offloading computation from resource-constrained devices to external platforms (e.g., the cloud) emerged in the field of mobile computing due to the limited computational power, storage, and energy of the mobile device [1]. More recently, the idea has also grown in popularity in the mobile robot community because achieving "autonomy" on a mobile robot usually involves intensive or even highly parallel computing, which can easily exceed the resources available to robots in their onboard computers [2]. The benefits of computation offloading, such as enhanced processing capabilities and faster application execution, have led to great success in various robotic tasks such as simultaneous localization and mapping (SLAM) [3], object recognition, and grasp planning [4]. However, in multirobot cooperative tasks, computation offloading introduces additional communication between robots and the cloud. In this situation, computation offloading can become counterproductive.
For example, multirobot SLAM, one of the representative robotic applications, is a compute-intensive and data-intensive task. SLAM aims to perform real-time localization and mapping "simultaneously" with a sensor (e.g., LiDAR or depth camera) moving through an unknown environment without any external means of localization. Multirobot SLAM is an extension of single-robot SLAM in terms of parallel and distributed processing. As shown in Figure 1, each robot in the robot swarm performs its own localization and local map construction in an unknown environment. Then, all local maps are merged cooperatively into the global map. In the traditional setup in Figure 1(a), each robot executes the computationally intensive SLAM algorithm locally and independently, and the local maps from each robot are exchanged periodically to merge with the ever-increasing global map. The communication cost of this exchange increases either when the size of the robot team increases or when the global map must be updated more frequently; this causes poor quality of service (QoS) such as high latency or even no response. In our experiments, to build a midsize indoor map with four robots, the communication among robots occupies a bandwidth of 40-80 Mb/s, which is unacceptable for most common robot swarms. With the traditional computation offloading method in Figure 1(b), we independently migrate the local map-building process on each individual robot to cloud servers. This means that the local sensor data must first be sent to the cloud servers, then the map data must be transferred back from the server to the robot, and finally the maps are exchanged among robots as before. In this situation, computation offloading doubles the bandwidth consumption and cancels out its own benefit.
Therefore, we cannot simply offload the computation of each individual in a multirobot cooperative application. We argue that offloading the communication inside the robot swarm is equally vital to improving the efficiency of cooperative tasks in a "cloud + robot" architecture. This reasoning is inspired by the simple idea that, because computationally intensive modules can be migrated to the cloud, the considerable quantity of data transmission generated by robot cooperation can be offloaded to the cloud as well. This solution would promote cooperation efficiency by utilizing the high-bandwidth network inside the cloud platform instead of the local low-bandwidth wireless network. As shown in Figure 1, by offloading the cooperation among robots to the cloud servers, not only can the localization and mapping processes be performed on the servers, but the output maps can also be transferred within the cloud to any other computation module that needs them. The bandwidth is less likely to be overoccupied in this situation, thereby improving the cooperation efficiency in generating global maps. Another problem in computation offloading is that the offloading performance deteriorates due to low data rates if too many mobile users choose to offload their tasks via the same wireless access channel simultaneously. QoS is one of the most important factors to consider for robotic applications because these applications interact directly with the physical world [5]. Offloading decisions and additional mechanisms should therefore be studied so that this new approach can boost cooperation with QoS assurance in robot swarms.
In this study, to address the challenge mentioned above, we introduce the concept of cooperation offloading. Cooperation offloading is a new offloading approach for multirobot cooperative tasks, which treats the cooperative robot swarm as an entirety when offloading by taking the factor of cooperation among robots as well as computation into account. It can offload the original communication and cooperation in robot swarms instead of introducing additional cooperation for computation offloading [6], making the cooperative system more efficient. The overall goal of our work is to propose a general solution that would enable existing multirobot cooperative tasks to be executed more efficiently by using the cooperation offloading method and that would satisfy the QoS requirement of the robots at the same time.
In summary, the key contributions of this paper are as follows.
(i) We propose cooperation offloading, which offloads cooperation in robot swarms to the cloud servers. We calculate the time cost of local computing, computation offloading, and cooperation offloading by formalizing the multirobot cooperative architecture model. Then, we study the performance gain contributed by cooperation offloading and provide offloading decisions
(ii) We design a cloud-based robotic framework, named Cloudroid Swarm. Cloudroid Swarm performs cooperation offloading in addition to computation offloading, utilizing the high-speed network infrastructure in the cloud. To assure QoS in Cloudroid Swarm, we propose a distributed link detection algorithm at the local level and link capacity adjustment at the global level to adapt to the poor and dynamic network environment
(iii) We implement Cloudroid Swarm to support cooperation offloading for multirobot applications and transparently migrate multirobot tasks to the cloud servers without any code modification. We also propose several effective mechanisms to improve the QoS and scalability of Cloudroid Swarm
(iv) We investigate the performance of the proposed QoS-aware cooperation offloading framework by evaluating three different multirobot applications in both simulation and real-world environments. The applications include cooperative SLAM, multirobot exploration, and multirobot collision avoidance. The results demonstrate the efficiency of Cloudroid Swarm in terms of network optimization, task performance, and scalability
This article is an extension of our previous conference version [7], which presents a cloud robotic framework that boosts the efficiency of cooperation for multirobot applications in robot swarms and evaluates the framework using both public data sets and a simulator. However, we observed that the framework cannot deliver satisfactory behavior on physical robots because of the many interference factors in the real world, where both the input and the network environment are unstable. Thus, QoS assurance is required, especially in a highly dynamic and resource-competitive environment. To overcome this difficulty left by the previous work, we propose several QoS-aware algorithms and mechanisms to make our framework more robust, so that the performance gain can be obtained even in a real-world mobile robot system. In addition, to demonstrate the performance of the proposed QoS-aware framework, we evaluate it in extensive experiments, including a quantitative analysis of the benefits of QoS in the cooperative SLAM application and a collision avoidance test conducted on real-world robots.

Cloud Computing in Robotics.
In the domain of cloud computing, computation offloading is regarded as an effective way to alleviate the constrained resources on mobile phones and Internet of Things (IoT) devices as well as to reduce the running cost in mobile cloud computing (MCC) [8] and mobile edge computing (MEC) [9]. In the robotic field, because the computational patterns are similar to those in mobile computing, robot tasks also call for computation offloading. A series of recent research studies have been dedicated to boosting robotic applications. Chen et al. propose the term "Robot as a Service" (RaaS) and present a self-contained unit in the cloud computing environment [10]. However, the development and deployment of robot applications are limited to a certain programming language and architecture (Intel), without the ability to migrate existing robot software to the cloud. Seminal work in this field is DAvinCi [11], a particle-based SLAM framework for service robots in a large-scale Internet environment. However, it requires the entire process running in the cloud to be deployed and configured manually. Another closely related study is Rapyuta [12], a framework that enables robots to offload their complex computation to the cloud. Rapyuta is a typical clone-based PaaS architecture based on the Linux Container (LXC).
To address Rapyuta's limitations in dynamic deployment for complex tasks and its lack of cloud management tools, our previous work, Cloudroid [13], supports the automatic deployment of existing robotic software packages to the cloud, thus transparently transforming them into Internet-accessible cloud services with QoS assurance. However, both Rapyuta and Cloudroid focus only on computation offloading for an individual robot, and our evaluation in Section 6 shows that they are not suitable for cooperative multirobot applications.

Multi-user Computation Offloading.
Concerning multiuser computation offloading, one of the major topics to be investigated is the decision of whether mobile users offload their computation task to the cloud or not. There is offloading transmission competition among users because several users may choose the same wireless access channel and offload tasks to the cloud simultaneously.
Some studies adopt centralized approaches [14,15], which update the offloading decision iteratively to solve joint task offloading and resource allocation in MEC networks. Chen [16] demonstrates that it is NP-hard to compute optimal multiuser computation offloading solutions in a multichannel wireless interference environment, and hence proposes a game-theoretic approach for achieving efficient computation offloading in a distributed manner. Considering the social and behavioral characteristics of users in the overall computation offloading process, Apostolopoulos et al. [17] exploit prospect theory instead to account for users' risk-seeking and loss-aversion behavior in offloading decisions. All the aforementioned methods are limited by the trade-off between optimality and computational complexity.
Deep learning shows excellent potential in the field of wireless communications to deal with multiuser task offloading decisions [18]. Wu et al. [19] propose a distributed deep learning-driven task offloading scheme for collaborative edge and cloud computing, where multiple parallel DNNs are used to generate offloading decisions. Then the offloading decision with the lowest system utility is chosen as the output and as the label to train the deep neural networks. To characterize long-term computation offloading performance, Dinh et al. [20] propose a distributed model-free reinforcement learning offloading mechanism, which reaches 87.87% of the payoff obtained under the optimal condition. Since security is one of the critical issues in mobile edge computing, Huang et al. [21] propose a security and cost-aware computation offloading strategy based on the popular deep reinforcement learning approach, deep Q-network. These distributed deep learning methods assume that the mobile device has sufficient computing capability to compute and obtain the offloading decision in real time. However, mobile robots usually have computing and communication capabilities that are too limited to satisfy this assumption.
Moreover, none of these studies consider cooperative tasks that need data-intensive communication among users, which are common in robot swarms. In addition, all of their evaluations are carried out in simulation environments. Real-world experiments are affected by more interference factors and omnipresent uncertainty, and thus impose stricter QoS requirements. In this paper, we propose a link capacity adjustment algorithm to ensure the QoS of multiuser computation offloading, which proves effective in a real-world multirobot resource competition environment.

2.3. Multi-robot Cooperation.
The architecture of multirobot cooperative applications has been studied for years. One of the crucial research topics is the communication model, which indicates the data exchange pattern used in multirobot systems [22]. It is demonstrated that the minimum amount of network consumption in a swarm of $n$ robots where there is communication between any two robots could be $O(n^{1.5})$, which is not linearly scalable when the number of entities increases [23]. Cloud-based studies are carried out to avoid this limitation. In [24], robots are grouped into different clusters. Communication between different clusters is promoted via the cloud, whereas the direct local transmission method is used for internal communication within each cluster. However, because of the mobility of robots, maintaining the cluster division is complicated, and it is also difficult to determine the boundary between local and cloud methods. A cloud-based approach using multiple low-cost robots is proposed in [3]. This approach leverages the Rapyuta [12] robot framework, and all the data exchange occurs in the cloud. However, it is a task-specific solution that can only be applied to a specific 3D mapping task. Chen et al. propose a framework of robotic cooperation on computation offloading to enable both robot-robot and robot-cloud cooperation [25]. Unlike all the abovementioned solutions, our method utilizes advances in both the local and the cloud computing environments to complete different kinds of cooperative robot tasks. By identifying and optimizing communication bottlenecks, it exploits the large bandwidth inside the cloud while maintaining the flexibility of the original local network environment.

Cooperation Offloading Decision
3.1. Multi-robot Cooperative Models. Because our method targets decentralized multirobot applications, we utilize the widely used Publish-Subscribe model, which is also the main messaging pattern in the Robot Operating System (ROS) (https://www.ros.org/), as the foundation of our formalization. A task is assigned to different processing units inside robots, in the form of processes running on the onboard computer system. We denote each process as an operator. As shown in Figure 2(a), we classify the operators into two categories. The nonmigratable operators directly interact with the physical world, e.g., the laser scanner, the camera reader, and the velocity controller of the wheels. The migratable operators do not directly interact with peripheral devices and typically perform intensive computing. The communication pattern between operators is topic-based: one operator publishes messages on a specific topic, and all other operators that subscribe to this topic receive the message. In computation offloading, the migratable operators are safely transferred to the cloud for computing acceleration and are wrapped into a computation module. The local robot node and the computation module are connected by a dedicated channel for data exchange. Although publishers and subscribers sit at the two ends, the channel forwards messages across the network between the two sides transparently. As illustrated in Figure 2(b), offloading is achieved through deployment configuration, keeping the source code unmodified.
As shown in Figure 3, we model decentralized multirobot architectures in which $n$ robots perform the cooperative task. Figure 3(b) depicts a multirobot computation offloading scenario. The migratable operators are safely transferred to the cloud (we do not explicitly differentiate between edge servers and remote cloud servers because edge servers can be modeled as cloud servers with lower latency but fewer computing resources) for enhancement and are wrapped into a single computation module. The local robot node and the computation module are connected by a dedicated channel for data exchange. Note that data for cooperation in the robot swarm still need to be exchanged over local data links. Figure 3(c) illustrates our proposed cooperation offloading model, which adds shortcut links between computation modules. The robots form a graph with $n$ nodes, where the node for the $i$th robot is $N_i$. From every node $N_i$ to another node $N_j$, there exists a local data link, which is denoted as $NL_{ij}$. In the case of computation offloading for a multirobot arrangement, the cloud-side computation module of $N_i$ is denoted as $CM_i$, and the channel connecting $N_i$ and $CM_i$ is $CH_i$. In the case of cooperation offloading, the additional cloud link between $CM_i$ and $CM_j$ is $CL_{ij}$.
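The topic-based messaging pattern described above can be sketched in a few lines of Python. This is a toy in-process model and not ROS code: the `Bus` class, the operator labels, and the `/scan` topic name are hypothetical names chosen only for illustration.

```python
from collections import defaultdict
from typing import Callable, Dict, List


class Bus:
    """Toy topic-based publish-subscribe bus (illustrative, not the ROS API)."""

    def __init__(self) -> None:
        # topic name -> callbacks of the operators subscribed to it
        self._subscribers: Dict[str, List[Callable[[object], None]]] = defaultdict(list)

    def subscribe(self, topic: str, callback: Callable[[object], None]) -> None:
        self._subscribers[topic].append(callback)

    def publish(self, topic: str, message: object) -> int:
        """Deliver the message to every subscriber; return the delivery count."""
        callbacks = self._subscribers[topic]
        for callback in callbacks:
            callback(message)
        return len(callbacks)


# A nonmigratable operator (a sensor reader) publishes on a topic,
# and two migratable operators subscribed to that topic receive the message.
bus = Bus()
received = []
bus.subscribe("/scan", lambda msg: received.append(("slam", msg)))
bus.subscribe("/scan", lambda msg: received.append(("planner", msg)))
delivered = bus.publish("/scan", {"ranges": [1.0, 2.5]})
```

Because publishers only name a topic and never a receiver, the subscribers in this sketch could be moved behind a forwarding channel without changing the publisher, which is the property that makes transparent offloading possible.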

Time Cost for Offloading Decisions.
Besides the characteristics of computation offloading, cooperation offloading has its unique features, such as high communication costs among robots as well as between robots and servers, which bring new challenges to offloading decisions. To simplify the estimation without loss of generality, we consider the following particular situation:
(i) There exists only one topic in the robot swarm
(iv) For a robot swarm, either all tasks are performed locally or all of them are offloaded to servers. We propose this assumption because cooperation offloading considers cooperative tasks as an entirety
First, we consider the local situation in Figure 3(a). Let $s_m$ be the computing speed of mobile robots and $s_s$ be the computing speed of servers. Each robot $i$ needs to perform part of a cooperative computationally intensive task denoted by $T_i = (d_i, h_i, x_i)$, where $d_i$ is the data size of the migratable part of $T_i$, $h_i$ is the number of CPU cycles, and $x_i$ is the amount of data exchange with other robots for the robot $i$ to complete the entire task. Suppose all robots have the same communication ability and $R_{NL}$ is the communication rate between local robots. Thus, the computing time for $T_i$ in local computing is $h_i / s_m$ and the cooperative time is $(n \times x_i) / R_{NL}$. The total time cost of local computing for robot $i$ can be expressed as
$$t_i^{\text{local computing}} = \frac{h_i}{s_m} + \frac{n \times x_i}{R_{NL}} \tag{1}$$
In Figure 3(b), let $R_{CH_i}$ be the communication rate of the channel between robot $i$ and its computation module on the cloud side. If we simply apply computation offloading to each robot independently, although the computing time is reduced to $h_i / s_s$, the cooperative time is the same as before, and the additional transmission delay between robot $i$ and the cloud is $(d_i + x_i) / R_{CH_i}$. Note that we consider both the upload and download delays here. Then, the total time cost of multirobot computation offloading is
$$t_i^{\text{computation offloading}} = \frac{h_i}{s_s} + \frac{n \times x_i}{R_{NL}} + \frac{d_i + x_i}{R_{CH_i}} \tag{2}$$
Let $d_{\max}$ be the maximum value of $\{d_1, d_2, \cdots, d_n\}$, and let the other variables have similar maximum definitions. Taking a global view of the entire task, the time cost depends on the last robot to complete the execution. Without loss of generality, we assume that robot $i$ processes the largest amount of data and requires the most data exchange and is thus the most time-consuming. Thus, traditional computation offloading for a robot swarm improves performance when
$$t_{\max}^{\text{local computing}} > t_{\max}^{\text{computation offloading}} \tag{3}$$
We argue that (3) is difficult to satisfy in most situations, especially when communication between robots is intensive (large $x_i$) and many robots compete for the offloading resource in a poor network environment (small or unstable $B_{CH_i}$). Whenever $h_i / s_m < (d_i + x_i) / R_{CH_i}$, computation offloading is counterproductive even if the processing speed of the server is infinitely large (i.e., $s_s \to +\infty$).
With cooperation offloading, the communication shortcuts on the cloud side in Figure 3(c) enable messages to be exchanged directly via the cloud links. The communication speed inside the cloud is extremely high, such that the cooperative time inside the cloud can be ignored. Furthermore, it is unnecessary to send the communication data $x_i$ back to the local side for cooperation, so we can ignore the download delay here. Thus, the total time cost of multirobot cooperation offloading is
$$t_i^{\text{cooperation offloading}} = \frac{h_i}{s_s} + \frac{d_i}{R_{CH_i}} \tag{4}$$
Compared to (2), we find that cooperation offloading can reduce the time cost by reducing the communication data among robots in our scenario. The cooperation offloading decision can be made by
$$t_{\max}^{\text{local computing}} > t_{\max}^{\text{cooperation offloading}} \tag{5}$$
We define the difference between the two sides of the inequality as the performance gain. From (3) and (5), cooperation offloading achieves more performance gain than computation offloading under the same conditions in multiuser offloading scenes. In addition, we can monitor $B_{CH_i}$ in real time using the method proposed in Section 4.1. When the condition of (5) holds, which is more easily satisfied than (3), it is a sensible option to use cooperation offloading to improve the system performance (line 2 in Algorithm 2).
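The three time costs and the resulting decision can be checked numerically with a short Python sketch. It reads the definitions above as: local cost $h_i/s_m + n x_i/R_{NL}$, computation offloading cost $h_i/s_s + n x_i/R_{NL} + (d_i + x_i)/R_{CH_i}$, and cooperation offloading cost $h_i/s_s + d_i/R_{CH_i}$. The `Task` class, the function names, the single shared channel rate `r_ch`, and all numeric values are assumptions made for illustration, not measurements from the paper.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Task:
    d: float  # data size of the migratable part (bits)
    h: float  # CPU cycles required
    x: float  # data exchanged with other robots (bits)


def worst_case(tasks: Sequence[Task]) -> Task:
    """Combine d_max, h_max, and x_max into one worst-case task."""
    return Task(max(t.d for t in tasks), max(t.h for t in tasks), max(t.x for t in tasks))


def t_local(task: Task, n: int, s_m: float, r_nl: float) -> float:
    return task.h / s_m + n * task.x / r_nl


def t_computation(task: Task, n: int, s_s: float, r_nl: float, r_ch: float) -> float:
    # Upload d and download x over the robot-cloud channel, then cooperate locally.
    return task.h / s_s + n * task.x / r_nl + (task.d + task.x) / r_ch


def t_cooperation(task: Task, s_s: float, r_ch: float) -> float:
    # Cooperation happens inside the cloud, so only the upload of d remains.
    return task.h / s_s + task.d / r_ch


def choose_cooperation_offloading(tasks, s_m, s_s, r_nl, r_ch) -> bool:
    """Offload when the worst-case local cost exceeds the worst-case offloaded cost."""
    worst = worst_case(tasks)
    return t_local(worst, len(tasks), s_m, r_nl) > t_cooperation(worst, s_s, r_ch)


# Hypothetical swarm of four identical robots (all values are illustrative).
tasks = [Task(d=4e6, h=8e9, x=2e7)] * 4
local = t_local(tasks[0], 4, s_m=1e9, r_nl=2e7)
computation = t_computation(tasks[0], 4, s_s=8e9, r_nl=2e7, r_ch=2e7)
cooperation = t_cooperation(tasks[0], s_s=8e9, r_ch=2e7)
offload = choose_cooperation_offloading(tasks, s_m=1e9, s_s=8e9, r_nl=2e7, r_ch=2e7)
```

With these illustrative numbers the cooperative term dominates the local cost, so removing it from the wireless links is where most of the gain comes from; a slower channel rate `r_ch` shrinks that gain and can flip the decision.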

QoS-Aware Framework
To enable the cooperation offloading, we propose a framework named Cloudroid Swarm, which leverages the network environment inside the cloud to support communication between computation modules. Built on the foundation of Cloudroid [13], Cloudroid Swarm still exploits the splitting computation module for each robot. The new components designed for each robot are the network module in the cloud side and the network operator inside local robots. For the entire application, the main improvement is the establishment of the topology engine, the control plane for task-wide cooperation, as shown in Figure 4.
Network module: for each robot participating in the application, a network module termed $NM_i$ runs on the cloud side and is launched along with the corresponding computation module $CM_i$. Its primary responsibility is to handle communication resulting from cooperative multirobot computation offloading. $NM_i$, in place of $CM_i$, is directly connected with channel $CH_i$ to intercept all the network messages from/to it. In addition, because of its network awareness, $NM_i$ can sense other network modules and send messages directly to them via cloud links inside the cloud.
Network operator: in Cloudroid Swarm, communication on the robot side is handled by the network operator, with a similar function to the network module on the cloud side. This kind of operator, which is directly connected with the channel and node links, acts as the bridge between the other operators inside the robot and the outside world. Similar to nonmigratable operators, network operators are also processes in robot nodes. However, network operators have the special characteristic that they communicate with other cloud components, including the topology engine and other modules in the cloud.
Algorithm 1: Sliding-window algorithm.
Emit a new message from the source end to the destination end
3: for each msg returned at [t − T, t) do
4: flying_time(t) ⇐ flying_time(t) + RRT(msg)
6: end for
7: for each msg returned at [t − W − T, t − W) do
8: flying_time(t) ⇐ flying_time(t) − RRT(msg)
10: end for
11: lat(t) ⇐ flying_time(t) / sizeof(echo_i)
13: t ⇐ t + T
14: end while
Input: Collection of robots using the same wireless access to offload:

Wireless Communications and Mobile Computing
Topology engine: the topology engine acts as the coordinator on the cloud side, processing the global topological information of the entire task. It is started when the application routine is launched and maintains interaction between the network modules and network operators. Using the real-time metrics of links, the topology engine performs global planning to control link capacity, which keeps the message flow in dynamic equilibrium.
A notable challenge in cooperation offloading is the uncertainties it introduces, which influence the specific QoS properties of the multirobot applications. Therefore, the architecture must be designed to include real-time monitoring of the capacity of links and global network planning to manage the behavior of the computation modules for improved performance. Based on the components mentioned above, we design a set of QoS awareness mechanisms on the robot, cloud, and network topology to minimize the impact of uncertainties caused by poor network conditions or resource competition. Note that the client-side and cloud-side QoS mechanisms are described in detail in our previous paper [13], and in the current paper, we mainly introduce the key mechanisms of the network side. Scheduling and optimizing communication at the scale of the application can hardly reach the optimal solution while maintaining real-time performance because these functions interact with dynamic workload and network conditions. However, with deep insight into this distributed optimization problem, we can split the entire optimization approach into two different levels: link detection at the local level and link capacity adjustment at the global level. For a link $L$ at time $t$, the variables of interest include the latency $lat^{(t)}_L$ and $cap^{(t)}_L$, which represents the capacity of the link. These variables are measured continuously during the entire task. It should be pointed out that it is difficult to obtain the exact value at time $t$, and the average from $t − T$ to $t$ is used instead in most cases.
Because the wireless network conditions can change dramatically with the movement of the robot, the bandwidth and $lat^{(t)}_L$ of channels vary from time to time. Estimating these time-related variables is essential for planning the route of messages to improve the performance. Though it is impossible for a central coordinator to sense the message flow in each data link in exact real time, it is still feasible to periodically acquire the communication metric at a suitable time interval with the collaboration of network operators on node $N_i$ and network module $NM_i$. In our scenario, we are only concerned with the end-to-end parameters of the data links. Therefore, we apply a sliding-window method for detection, which has a negligible impact on the network.
As shown in Figure 5, at the beginning of each time interval $T$, the source end emits a message with a fixed size to the other end of the link. Upon receiving this message, the destination end echoes the same message back to the source. The round trip time is recorded by the source module as RRT(msg). Simultaneously, a sliding window with a fixed length of $W$ is maintained, which spans from $t − W$ to $t$. We can count the
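A minimal Python sketch of the sliding-window idea is given here under stated assumptions. It keeps a running sum of round trip times for echoes returned inside the window and subtracts those that slide out, as in the surviving steps of Algorithm 1. The class name `SlidingWindowProbe`, the averaging over the number of echoes in the window, and the per-byte normalization are assumptions made for illustration, since the full listing is not reproduced in the text.

```python
from collections import deque
from typing import Deque, Optional, Tuple


class SlidingWindowProbe:
    """Estimate link latency from echo round trip times inside a window of length W."""

    def __init__(self, window: float, echo_size: int) -> None:
        self.window = window          # W, in seconds
        self.echo_size = echo_size    # fixed size of the probe message, in bytes
        self._samples: Deque[Tuple[float, float]] = deque()  # (return_time, rtt)
        self._flying_time = 0.0       # running sum of RTTs inside the window

    def record(self, return_time: float, rtt: float) -> None:
        """Add the round trip time of an echo that has just returned."""
        self._samples.append((return_time, rtt))
        self._flying_time += rtt

    def estimate(self, now: float) -> Optional[float]:
        """Return the mean RTT per byte over [now - W, now], or None if no echoes."""
        # Subtract echoes that have slid out of the window.
        while self._samples and self._samples[0][0] < now - self.window:
            _, old_rtt = self._samples.popleft()
            self._flying_time -= old_rtt
        if not self._samples:
            return None
        mean_rtt = self._flying_time / len(self._samples)
        return mean_rtt / self.echo_size


# One probe per second on a hypothetical link, with a 2 s window and 64-byte echoes.
probe = SlidingWindowProbe(window=2.0, echo_size=64)
probe.record(return_time=0.0, rtt=0.5)
probe.record(return_time=1.0, rtt=0.25)
probe.record(return_time=2.0, rtt=0.75)
latency = probe.estimate(now=2.5)  # the echo returned at t = 0.0 has left the window
```

Maintaining the running sum means each update touches only the echoes entering or leaving the window, so the per-link cost stays constant regardless of how long the task runs.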

4.2. Global Link Capacity Adjustment. The added network components enable direct data exchange inside the cloud at the cost of increased traffic in the channels between robots and servers. These channels are more likely to become a bottleneck because of constrained bandwidth and competition among robots that choose the same wireless access route to offload. We therefore need to carefully adjust their capacity to guide the flow and prevent the bottleneck from occurring. We design a global capacity assignment policy that is deployed at the topology engine and dynamically adjusts the $cap^{(t)}_L$ of channels at each timestamp, thereby scheduling message flows. The policy is based on the following principles:
(i) When a new operator is launched at one end of a link, a larger number of messages are transferred via this link. In this case, the capacity of this link tends to increase
(ii) The source module sends a large number of messages in a short period, and then, it recovers to the previous state. We argue that these conditions indicate an emergency signal arising from the source module, for example, the obstacle encountered by a wheeled robot or passages in front of an autonomous vehicle. This situation should be quickly reflected in the capacity
(iii) The number of messages on one or more existing topics is increased. In this situation, the capacity of the related data links should be increased for long-term governance
Based on the above analysis, we design a QoS-aware distributed link capacity adjustment algorithm, which is presented in Algorithm 2. For each task, we choose the cooperation offloading strategy by default. Then, we detect link status using the sliding-window method described in Section 4.1. If the bandwidth is too low to satisfy the cooperation offloading decision, the system chooses the local computing strategy to guarantee the basic QoS. Otherwise, we choose cooperation offloading to boost cooperation and adjust the capacity of the channels between robots and the cloud. The topology engine executes the policy routine proposed above and sends the results back to the local component for assigning capacity in the next timestamp (from line 11 to line 15 in Algorithm 2).
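The two steps above (strategy selection, then capacity assignment) can be sketched in Python as follows. This is not Algorithm 2, whose listing is not reproduced in the text. It is a simplified stand-in that assumes a fixed total wireless capacity shared by all robots and a proportional-share rule with a per-channel floor; the function names, the threshold `min_bandwidth`, and the proportional rule itself are hypothetical choices that only illustrate the principle that capacity should follow message volume.

```python
from typing import Dict


def choose_strategy(channel_bandwidth: Dict[str, float], min_bandwidth: float) -> str:
    """Fall back to local computing when any robot-cloud channel is too slow."""
    if any(bw < min_bandwidth for bw in channel_bandwidth.values()):
        return "local"
    return "cooperation_offloading"


def adjust_capacities(demand: Dict[str, float], total_capacity: float,
                      floor: float) -> Dict[str, float]:
    """Give every channel a floor, then share the rest in proportion to demand.

    demand maps each robot to the message volume observed on its channel
    during the last interval (an assumed input of this sketch).
    """
    n = len(demand)
    spare = total_capacity - floor * n
    if spare < 0:
        raise ValueError("total_capacity cannot cover the per-channel floor")
    total_demand = sum(demand.values())
    if total_demand == 0:
        # No traffic observed: split the spare capacity evenly.
        return {robot: floor + spare / n for robot in demand}
    return {robot: floor + spare * d / total_demand for robot, d in demand.items()}


# Hypothetical example: robot r1 has just started publishing three times more data.
strategy = choose_strategy({"r1": 12.0, "r2": 9.0}, min_bandwidth=2.0)
capacities = adjust_capacities({"r1": 30.0, "r2": 10.0}, total_capacity=100.0, floor=10.0)
```

The floor keeps a quiet robot from being starved, so that a sudden burst of messages from it (principle ii) still has some capacity available before the next adjustment.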

Computational Complexity.
We provide a theoretical analysis of the computational complexity of the proposed algorithms. Algorithm 1, the sliding-window algorithm, is distributed on each mobile robot, so its computational complexity does not scale with the group size $n$ (i.e., $O(1)$ complexity).
Algorithm 2 has two main steps: cooperation decision making and link capacity adjustment. Since we assume that, for a robot swarm, all tasks are either performed locally or offloaded to servers, we can obtain the optimal cooperation offloading decision by comparing the maximum total time cost of local computing with that of cooperation offloading. Then, if cooperation offloading is used, we adjust the capacity of the channels between robots and the cloud. The computational complexity of both steps grows linearly, $O(n)$, with the group size. So the computational complexity of the QoS-aware link capacity adjustment algorithm is $O(n)$.
Overall, the computational complexity increases linearly with the number of robots, which gives our platform good scalability.
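The decision step described above can be sketched in a few lines. This is a minimal illustration under the all-or-nothing assumption; the function name, the per-robot cost lists, and the tie-breaking rule (ties go to local computing) are assumptions made for the example, and the cost estimates themselves are taken as given inputs.

```python
def choose_strategy(local_costs, offload_costs):
    """Pick one strategy for the whole swarm.

    local_costs[i] and offload_costs[i] are the estimated total time costs of
    robot i under local computing and cooperation offloading, respectively.
    Each max() is a single pass over the n robots, so the decision is O(n).
    """
    if not local_costs or len(local_costs) != len(offload_costs):
        raise ValueError("need one cost estimate per robot for both strategies")
    worst_local = max(local_costs)
    worst_offload = max(offload_costs)
    return "offload" if worst_offload < worst_local else "local"


decision_a = choose_strategy([2.0, 3.0], [1.0, 2.5])   # slowest robot is faster when offloading
decision_b = choose_strategy([1.0, 1.0], [0.5, 4.0])   # one slow link dominates offloading
```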

Implementation
We implement the prototype of Cloudroid Swarm based on the ROS programming model to adapt to existing multirobot applications. Cloudroid Swarm is an extension of the single-robot-oriented framework Cloudroid, which can transparently migrate computation-intensive modules to cloud servers and wrap them as computation modules for enhancement. While building on these existing systems, we devise several additional mechanisms to improve the QoS and scalability of Cloudroid Swarm. These improvements include the following:
Topic remapping: to intercept messages from/to computation modules, we use the built-in remapping function of the ROS launch mechanism to remap all the originally subscribed/published topics to new ones. The network modules, which are attached to both the original and the remapped topics, can then manipulate and forward the messages transferred between the computation module and other components.
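To make the remapping idea concrete, the following sketch builds remapping arguments in the standard ROS `original:=new` form for a list of topics. The helper name and the `/cloudroid` prefix are hypothetical and are not taken from the Cloudroid Swarm code base.

```python
def build_remap_args(topics, prefix="/cloudroid"):
    """Map each original topic to a prefixed name and build ROS-style remap args."""
    mapping = {}
    for topic in topics:
        name = topic if topic.startswith("/") else "/" + topic  # normalize to absolute names
        mapping[name] = prefix + name
    args = [f"{old}:={new}" for old, new in mapping.items()]
    return mapping, args


mapping, args = build_remap_args(["/scan", "odom"])
```

A network module in such a design would subscribe to the remapped names and republish on the original ones (or the reverse), so that the computation module never notices the interception.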
Container cluster orchestration: because all modules, including the computation modules, network modules, network operators, and topology engine, are self-contained Docker instances, coordinating and managing them is essential for Cloudroid Swarm. For this purpose, we adopt the popular open-source container cluster orchestration tool Kubernetes (Kubernetes: https://kubernetes.io/), which provides load balancing, deployment replication, and elastic consolidation. In addition, we use KubeEdge (KubeEdge: https://kubeedge.io/) for edge management if the system contains edge servers.
Figure 5: The sliding-window method for link detection.

Wireless Communications and Mobile Computing
Message deduplication: because the ROS programming model commonly uses a topic-based communication pattern and the ROS network stack has no knowledge of the underlying topology, a publisher in a local native setup must send identical message data to each subscriber. Cloudroid Swarm applies message deduplication to remove this inefficiency. When node $N_i$ sends messages on a topic to which other computation modules subscribe, only one copy of the data needs to be transferred via channel $CH_i$. All the topics published by $N_i$ are delegated to $NM_i$ on the cloud side; when $NM_i$ receives a message, it uses the stored subscriber list shared by $N_i$ to forward the message to the other computation modules. Messages sent from the network module to other local robots are processed similarly.
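The deduplication described above amounts to sending one copy over the robot-to-cloud channel and fanning out inside the cloud. A minimal sketch, with a hypothetical `NetworkModule` class standing in for $NM_i$ and plain callbacks standing in for subscriber connections:

```python
class NetworkModule:
    """Cloud-side delegate (NM_i) for the topics published by robot node N_i."""

    def __init__(self):
        self.subscribers = {}    # topic -> list of delivery callbacks
        self.uplink_copies = 0   # messages that actually crossed CH_i

    def subscribe(self, topic, callback):
        self.subscribers.setdefault(topic, []).append(callback)

    def on_uplink(self, topic, payload):
        """Called once per message arriving over CH_i; fan-out happens in the cloud."""
        self.uplink_copies += 1
        for deliver in self.subscribers.get(topic, []):
            deliver(payload)


nm = NetworkModule()
received = []
nm.subscribe("/map", received.append)   # first cloud-side subscriber
nm.subscribe("/map", received.append)   # second cloud-side subscriber
nm.on_uplink("/map", b"map-update-1")
```

Two subscribers receive the message, but only one copy has crossed the wireless channel.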
Optional message pull: in the ROS message model, as long as a topic has a subscriber, the publisher must send every message whether the subscriber uses it or not. Under the split model of Cloudroid, this behavior places a heavy load on the network bandwidth. To solve this problem, we let the user set an optional argument for each message, when the application is uploaded to Cloudroid Swarm, that specifies whether publishers send it in "push" or "pull" mode. The push mode is the default behavior of the ROS model, whereas the pull mode sends a message on demand, only when the subscriber requests it. We make the pull mode optional because it may complicate the real-time behavior of some messages. Nevertheless, in the performance evaluation of our system, we observe a significant improvement in network bandwidth and QoS under the on-demand pull mode.
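The difference between the two modes can be illustrated with a small relay that counts what is actually put on the network. The `TopicRelay` class is hypothetical, and serving the most recent message on a pull request is an assumption of this sketch rather than a documented behavior of Cloudroid Swarm.

```python
class TopicRelay:
    """Relay for one topic that sends either eagerly (push) or on request (pull)."""

    def __init__(self, mode="push"):
        if mode not in ("push", "pull"):
            raise ValueError("mode must be 'push' or 'pull'")
        self.mode = mode
        self.latest = None
        self.sent = []   # messages actually put on the network

    def publish(self, msg):
        self.latest = msg
        if self.mode == "push":
            self.sent.append(msg)   # ROS default: every message is transmitted

    def request(self):
        """Subscriber asks for data; in pull mode this triggers the only transfer."""
        if self.mode == "pull" and self.latest is not None:
            self.sent.append(self.latest)
        return self.latest


push, pull = TopicRelay("push"), TopicRelay("pull")
for msg in ("a", "b", "c"):
    push.publish(msg)
    pull.publish(msg)
answer = pull.request()
```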
Time sequence message elimination: another essential optimization targets time-sensitive topics. Receiving the latest message is vital for a subscriber to maintain real-time performance. For this kind of topic, instead of sending every message using the default FIFO (first in, first out) behavior of ROS, we always publish only the latest message over the network. Earlier messages are of little use and can be safely eliminated to increase network efficiency.
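Keeping only the latest message is equivalent to an outgoing buffer of length one. The sketch below uses `collections.deque` with `maxlen=1`; the class name and the `dropped` counter are illustrative additions.

```python
from collections import deque


class LatestOnlyQueue:
    """Outgoing buffer for a time-sensitive topic: newer messages replace older ones."""

    def __init__(self):
        self._buf = deque(maxlen=1)   # appending to a full deque discards the old item
        self.dropped = 0

    def put(self, msg):
        if self._buf:
            self.dropped += 1         # the stale message is about to be overwritten
        self._buf.append(msg)

    def get(self):
        """Return the newest pending message, or None if nothing is pending."""
        return self._buf.popleft() if self._buf else None


q = LatestOnlyQueue()
for seq in (1, 2, 3):
    q.put(seq)
first = q.get()
second = q.get()
```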
Custom compress transport: message compression is enabled to reduce the network footprint, and we choose Google Protobuf (Google Protobuf serialization: https://developers.google.com/protocol-buffers/) to (de)serialize messages. For communication via cloud links (if more than one server is configured), we use the ZeroMQ (ZeroMQ: http://zeromq.org/) distributed messaging system, which is better suited to the cloud environment and provides significant efficiency.
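The serialize-then-compress pipeline can be illustrated with the Python standard library alone. Note the substitutions: `json` stands in for Protobuf serialization and `zlib` for whatever compression the framework applies, and the transport itself (ZeroMQ in the paper) is omitted. The message layout is hypothetical.

```python
import json
import zlib


def pack(message: dict) -> bytes:
    """Serialize, then compress. json is a stand-in for Protobuf in this sketch."""
    raw = json.dumps(message, separators=(",", ":")).encode("utf-8")
    return zlib.compress(raw)


def unpack(blob: bytes) -> dict:
    """Inverse of pack(): decompress, then deserialize."""
    return json.loads(zlib.decompress(blob).decode("utf-8"))


scan = {"topic": "/scan", "ranges": [1.0] * 360}   # a highly repetitive laser scan
raw_size = len(json.dumps(scan, separators=(",", ":")).encode("utf-8"))
blob = pack(scan)
```

Repetitive sensor payloads such as this one compress well; messages with high-entropy content benefit much less.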
We reuse the infrastructure of ROS and Cloudroid to implement certain components of Cloudroid Swarm. Our framework is designed at the platform level and is transparent to the overlying robot application, which allows the original application to be safely migrated to the cloud without the need for any code modification. In addition, the code for the topology engine is approximately 1,100 lines of code, whereas another 1,400 lines of code are devoted to the network module and network operator.

Evaluation
This section presents our evaluation of the performance of Cloudroid Swarm with three different types of multirobot applications, whose inputs come from a public data set, a simulation environment, and a real-world TurtleBot system, respectively. These representative tasks are multirobot SLAM, collision avoidance, and exploration, all of which call for cooperation between robots and involve a large amount of data exchange among robots. For comparison, the evaluation of each application also includes experiments with other offloading configurations and the local native configuration. To the best of our knowledge, work that focuses on cooperation offloading has not yet been reported. Thus, during the evaluation, we compare our work with the following baselines.
(i) Local native: without any assistance from the cloud, all the computation and cooperation occur locally on the robots. This is also the target environment for which the three applications were designed
(ii) Cloudroid: Cloudroid is a general framework for computation offloading. Certain computationally intensive tasks are configured to be migrated to the cloud side for enhancement
(iii) Rapyuta: the architecture of Rapyuta supports computation offloading, similar to Cloudroid, but Rapyuta is consolidated with more cloud-based techniques, such as a load balancer, to provide more flexible control for developers
(iv) Cloudroid Swarm without QoS: Cloudroid Swarm is our framework designed for multirobot cooperation offloading with network optimization for QoS. For ablation studies, we set up this baseline without the QoS mechanisms, namely, link detection and global link capacity adjustment
We use a commercial public cloud with four computation hosts as the testbed for all cloud-based setups. Each host is configured with four Intel Xeon E5-2682 CPUs and 16 GB RAM, and the hosts are interconnected with 1 Gbps Ethernet. On the robot side, the physical platform for the four robots is the wheel-driven robot TurtleBot3 (Turtlebot: http://www.turtlebot.com/), which is equipped with LiDAR for laser scanning. The onboard computer is a Raspberry Pi 3 Model B (Raspberry Pi 3 Model B: https://www.raspberrypi.org/products/raspberry-pi-3-model-b/), with a four-core CPU, 1 GB RAM, and BCM43438 wireless LAN (802.11b/g/n standard with up to 72.2 Mbps net throughput).
6.1. Evaluation Case 1: Cooperative SLAM. This section describes our evaluation of the efficiency of Cloudroid Swarm on CG_MRSLAM [26], a ROS-based framework designed to enable multiple robots to participate in a cooperative SLAM process. During the task, each robot continuously and incrementally sends the map it has built to another nearby robot using peer-to-peer communication. When other robots receive these local maps, they integrate them into their own maps to build the global one. The architecture of CG_MRSLAM, from the perspective of message flow, is depicted in Figure 6, where the area enclosed within the blue line can be migrated to the cloud platform in our cloud offloading setup. The migrated operators are those that are intensive in computation and communication, such as localization and mapping, which execute more efficiently in the cloud. The robots only process the hardware-related laser scan and odometry inputs. Unlike the original local setup, where a message can only be sent when two robots are sufficiently close, our cooperation offloading method eliminates the distance limitation.
To compare the influence of communication on the final accuracy of the task, we conduct the experiments using public sensor and actuator data [27], which were captured by the Technical University of Munich using the Pioneer robot (Pioneer P3-DX: https://www.generationrobots.com/en/402395-robot-mobile-pioneer-3-dx.html). The four data sequences in Table 1 were collected in the same indoor scene, and we apply each one separately to our robots as the simulation input data.
(1) Communication performance. To evaluate the network optimization efficiency of the QoS-aware link capacity adjustment algorithm proposed in Section 4.2, we introduce in this section an additional cooperation offloading setup without QoS algorithms and mechanisms as a baseline, in which the link capacity of the channel is shared equally among users.
To demonstrate our framework's ability to relieve communication pressure, we also investigate the bandwidth usage during the task in Figure 7(a). The results show that for three representative links ($CH_1$, $CH_2$, and $NL_{12}$), link usage is lowest in Cloudroid Swarm, especially for the channels between robots and the cloud, which are more likely to become bottlenecks. In particular, $CH_2$ exhibits a 57% decrease in bandwidth usage compared with Cloudroid, whereas for $CH_1$ the decrease exceeds 80%. We also observe that, for all links, adding the QoS mechanisms increases the bandwidth occupancy slightly. This is because, in addition to data exchange, our sliding-window algorithm for link detection also consumes bandwidth.
During the evaluation, we record the latency of each map message and the total number of messages transferred, as shown in Figure 7(b). From the number of messages, depicted by the black line, we learn that the number of messages increases significantly with Cloudroid Swarm compared with the other three setups. More messages exchanged indicates more cooperation among robots. With direct message transmission between network modules, the latency with our framework is also largely reduced and becomes more stable; the variance is the smallest of all setups. When we introduce the QoS-aware link capacity adjustment algorithm, Cloudroid Swarm obtains the lowest average and maximum message traveling time and the highest number of messages, indicating the best network optimization performance. This is because our framework chooses the local computing strategy whenever the message traveling time of cooperation offloading is longer than that of local computing. The QoS mechanisms thus guarantee the task performance under poor or dynamic network environments. Note that without enhancement using cooperation offloading, our original framework, Cloudroid, has the most unstable network performance, with a large variance in message latency.
(2) Task accuracy performance. We also evaluate task accuracy using the provided ground-truth trajectory data, inspecting and discussing the trajectory generated by CG_MRSLAM. Our results with the data set named freiburg2_pioneer_slam3 for all four setups are depicted in Figure 8. The ground-truth trajectory is also shown for comparison, and the red line represents the translational error. For tracking precision, we use the absolute trajectory error (ATE), a metric defined in [27], to describe the difference between the ground-truth and the estimated trajectories.
ATE is calculated using the least-squares method to find a rigid-body transformation $T$ that maps the estimated trajectory $E_n$ onto the ground truth $G_n$. Then, the root-mean-square error of the translational components over all time indices is evaluated using the following expression:
$$\mathrm{ATE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left\|\operatorname{trans}\left(G_i^{-1}\,T\,E_i\right)\right\|^{2}},$$
where $E_i$ and $G_i$ are the estimated and ground-truth poses at time index $i$ and $\operatorname{trans}(\cdot)$ extracts the translational component.
Comparing the red accumulated error and the ATE value in Figure 8, especially in subfigures (b) and (c), we learn that because parts of the condensed map cannot be transferred smoothly between robots with limited bandwidth, the localization phase of CG_MRSLAM easily drifts from the ground truth. This phenomenon causes a considerable increase in the ATE. We find the ATE of Cloudroid and Rapyuta to be 2.24 and 1.28 times higher, respectively, than that of the local native setup. This indicates that with computation offloading alone, the message overhead is so high that it makes the performance even worse. These results also correspond with the analysis in Section 3.2. With our Cloudroid Swarm, the ATE decreases to 0.11, which nearly doubles the performance of the local native setup. Thus, cooperation offloading, rather than computation offloading alone, is suited to this task.
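For readers who want to reproduce the ATE metric on planar trajectories, the following sketch aligns an estimated 2D trajectory to the ground truth with a least-squares rigid-body transformation (rotation plus translation, no scaling) and then takes the root-mean-square of the residual position errors. It is a simplified 2D illustration, not the evaluation script used in the paper; the function name and the restriction to position-only (x, y) samples are assumptions.

```python
import math


def ate_rmse_2d(estimated, ground_truth):
    """RMSE of position error after least-squares rigid alignment in 2D."""
    n = len(estimated)
    if n == 0 or n != len(ground_truth):
        raise ValueError("trajectories must be non-empty and of equal length")
    # Centroids of both trajectories.
    ex_c = sum(p[0] for p in estimated) / n
    ey_c = sum(p[1] for p in estimated) / n
    gx_c = sum(p[0] for p in ground_truth) / n
    gy_c = sum(p[1] for p in ground_truth) / n
    # Closed-form optimal rotation angle for centered 2D point sets.
    s_cos = s_sin = 0.0
    for (ex, ey), (gx, gy) in zip(estimated, ground_truth):
        ex, ey, gx, gy = ex - ex_c, ey - ey_c, gx - gx_c, gy - gy_c
        s_cos += ex * gx + ey * gy
        s_sin += ex * gy - ey * gx
    theta = math.atan2(s_sin, s_cos)
    c, s = math.cos(theta), math.sin(theta)
    # Rotate the centered estimate, move it to the ground-truth centroid, accumulate error.
    sq_err = 0.0
    for (ex, ey), (gx, gy) in zip(estimated, ground_truth):
        ex, ey = ex - ex_c, ey - ey_c
        rx = c * ex - s * ey + gx_c
        ry = s * ex + c * ey + gy_c
        sq_err += (rx - gx) ** 2 + (ry - gy) ** 2
    return math.sqrt(sq_err / n)


square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
# Same square expressed in another frame: (x, y) -> (-y + 5, x - 2).
moved = [(5.0, -2.0), (5.0, -1.0), (4.0, -1.0), (4.0, -2.0)]
ate_aligned = ate_rmse_2d(moved, square)
ate_stretched = ate_rmse_2d([(0.0, 0.0), (2.0, 0.0)], [(0.0, 0.0), (1.0, 0.0)])
```

A trajectory that differs from the ground truth only by a rigid motion yields an ATE of (numerically) zero, while a stretched trajectory keeps a residual error because scale is not corrected.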
6.2. Evaluation Case 2: Multi-robot Exploration. Multirobot exploration, a task in which a robot group collaboratively explores the frontier of an unknown environment, is also evaluated in our environment. During this task, information about the frontier and border is transmitted to other robots to negotiate the explored areas.
The application suite we use is the collaborative exploration framework proposed by Alpen-Adria-Universität Klagenfurt [28], abbreviated as AAU. Each robot individually conducts frontier exploration and sends its local map and location to all the other robots for merging into a global map. The architecture of AAU has many similarities with that of CG_MRSLAM, and both use laser scans and odometry to sense the environment. Although both CG_MRSLAM and AAU broadcast the local map to peers, AAU sends the entire local map, whereas CG_MRSLAM sends only the condensed map. However, because exploration requires the robot to move freely, the timeliness of interrobot messages is a dominating factor in the accuracy of map merging, which in turn strongly affects the speed and accuracy of exploration. To evaluate the performance of the multirobot system quantitatively, we use the ROS Stage simulator, similar to [28]. In addition, AAU uses another communication mechanism in its local architecture: robots that are not sufficiently close to each other but want to exchange data can use a third robot as a relaying router for message forwarding. Although this is an effective optimization for local communication, in the setup of Cloudroid Swarm it becomes unnecessary and can even increase the latency. To accommodate this situation, we configure the internal ad hoc communication of each node to be migrated to the cloud side such that messages are transferred directly inside the cloud to boost performance.
The performance of a multirobot exploration task can be measured as the time used to cover the entire area of the place being explored. The communication efficiency has a strong influence on the overlapping area, which in turn affects the coverage speed of the entire robot group. Based on this fact, we measure the total exploration progress of the four robots during the task, as shown in Figure 9. To eliminate the effect of differences in the size of the total area, we normalize the explored size as a ratio of the total size. In the initial phase of the task, the overlapping area between robots is very small, and the progress in this phase increases rapidly. However, when the overlap begins to increase, the message transmission path optimized by Cloudroid Swarm starts to pay off, and Cloudroid Swarm becomes advantageous relative to the other three setups. Note that although the additional communication path in Rapyuta suppresses data exchange, Rapyuta still outperforms the local native setup because the computation is offloaded; i.e., inequality (3) can be satisfied in this configuration. The capability of our method on AAU is also demonstrated by the exploration overlap in Table 2, where Cloudroid Swarm has the smallest value. This is because the transmission of messages between robots is more efficient, so that 5% overlap is enough to match and join the local maps of all robots into a global map.
6.3. Evaluation Case 3: Collision Avoidance. This case evaluates the performance of Cloudroid Swarm on a multirobot application with a more complex architecture, running on real-world robots. This task requires each robot in the group to navigate to the specified target position while avoiding the obstacles in the maps and their robot peers. The application we use in the experiment is Collvoid [29], which is a multirobot collision avoidance system based on the velocity obstacle paradigm [30].
The architecture of Collvoid is based on the original local wireless network, and the pipeline of its algorithm is as follows:
(i) Each robot receives odometry information from the wheels and uses a laser scanner for a more precise localization procedure
(ii) The position information is then broadcast over the wireless network, and the peers receive these messages and integrate them to detect both the obstacles and their peers
(iii) Based on self-estimated localization and obstacle detection, the robot performs motion planning for further navigation and guides the wheels with velocity commands
The modules and message flows are shown in Figure 10. From the perspective of computation, the most computationally intensive operator is AMCL localization. When this operator is offloaded to the cloud for acceleration using computation offloading in Cloudroid and Rapyuta, the localization messages are still shared locally over the wireless network, whereas in Cloudroid Swarm, the sharing occurs over the cloud links. The real-world experiment is conducted in a closed indoor environment. As depicted in Figure 11, the four robots form a square, and each one navigates to the position occupied by its diagonally opposite peer while running Collvoid to avoid colliding with its peers. The inputs come directly from the TurtleBots' laser scanners and therefore contain more noise, so more efficient communication is required to complete the cooperative task.
In the evaluation, two crucial metrics are used to compare the performance of Collvoid. The first is whether the robots can complete the traveling tasks and reach the specified target locations, and the second is the smoothness of the traveled trajectory. A smoother trajectory indicates more flexible control of the robot. As we observe in Figure 12, all four robots reach their specified targets only in Figure 12(d). In the other three setups, because the message transmission is too inefficient to carry precise information about obstacles and locations, one or more of the robots fail to reach their targets. However, the many obstacle messages lower the smoothness of the yellow robot's trajectory relative to the others, especially with our framework. This is because, with cooperation offloading, robots receive more messages and thus behave more conservatively to avoid collisions. Since we do not focus on the algorithm itself, this result is in line with our expectations.

Scalability Evaluation.
In this section, we evaluate our framework with a large group of robots to show the scalability of Cloudroid Swarm. However, most existing multirobot applications, such as multirobot SLAM, have been validated only with small groups of robots and are hard to scale directly to dozens of robots. We therefore improve ORB-SLAM [31] for better scalability and propose a scalable and real-time multirobot visual SLAM framework [32]. This framework can effectively divide and schedule SLAM tasks inside a cluster using group-based parallelism and the map point multicut algorithm. It adopts a switchable messaging pattern to suit different transmission scenarios and reduce the data sharing latency between hosts. Map data consistency is improved by the designed linage feedback and timestamp versioning mechanisms.

Because we do not have hundreds of real-world robots available for evaluation, we use Docker containers to emulate the physical robots in the simulation. Each container is configured with one Xeon E5-2682 CPU and 2 GB RAM, and its network bandwidth is limited to 50 Mbps, which is the ceiling rate of the wireless and 4G cellular network devices on robots. The public data set used is the same as in Section 6.1. To adapt the data set to large-scale robot swarms, we divide each image data sequence into multiple segments, each 20 s long.
Since our method is fully distributed, its scalability is demonstrated from two aspects: the number of robots that our framework can support for performing cooperative SLAM in real time and the number of computation hosts on the cloud side to which our method can extend. We vary the number of computation hosts over 1, 2, 4, 8, and 16 and the number of simulated front-end robots over 16, 32, 64, 128, and 256 to gain insight into the effects of each combination.
The results are depicted in Figure 13, where the average metrics, including FPS, data transmission rates, and group sizes, are shown to characterize the performance of our method. Although FPS decreases as the number of robots increases, it improves as the number of hosts increases. In particular, with 16 hosts, the rate remains above 20 FPS even for 256 robots, which is enough for the real-time requirements of SLAM applications. The highest data transmission rate is 188 Mbps (256 robots on 16 hosts), which exceeds the local communication capabilities of the mobile robot but is still much less than the 1 Gbps network bandwidth on the cloud side. With more hosts deployed on the cloud side (from 4 to 8 and 16), the data transmission rate does not increase significantly, indicating that our framework can be effectively scaled out in the cluster.

Discussion
Our approach has some limitations. Considering the limited computing capabilities and real-time requirements of mobile robots, we simplify the cooperation offloading problem by assuming that either all tasks are performed locally or all of them are offloaded to servers. However, in computation offloading scenes, it is common for each robot to decide individually whether to offload. We believe that deep learning and reinforcement learning will play important roles in generating cooperation offloading decisions for each individual.
Security is one of the critical issues in mobile cloud computing [33] and mobile edge computing [21]. Since our framework is based on ROS, every user connecting to the ROS master could leak sensitive information (such as data from sensors or cameras) or even send commands to move robots, creating privacy and safety risks. The problem becomes serious if we extend ROS to the public Internet. To alleviate this problem, we restrict platform access to authorized users. More advanced encryption algorithms should be introduced to safeguard the robots during the cooperation offloading process and to deal with security threats (e.g., snooping and alteration).
MEC servers are much closer to mobile devices and thus have lower latency, while MCC servers can provide flexible and scalable computing capability to support complicated applications [34]. For simplicity, we do not explicitly differentiate between edge servers and remote cloud servers in our formulations. However, the distinguishing characteristics of edge computing include its dense geographical distribution, support for mobility, and proximity to end users [35]. Loghin et al. [36] demonstrate that MEC is more effective than MCC when the task has a higher input-to-output ratio and lower computation-to-communication ratio for uploading and processing the input on the cloud. Although our experiments so far have been conducted with MCC, we believe that cooperation offloading can be readily extended to MEC where the situation calls for it.

Conclusion
In this paper, we study the cloud-based offloading problem in multirobot cooperative scenes and propose an approach named cooperation offloading for robot swarms performing a cooperative task. We analyze the time cost and then propose offloading decisions by formalizing a general model for this problem. To apply this concept in a practical situation, we propose a set of network components and develop algorithms at both the local and global levels to optimize the network links. Next, we implement Cloudroid Swarm and use three representative multirobot applications to validate the framework in a constrained network environment. The results show the efficiency of our approach, which improves communication performance by more than a factor of two and task performance by more than a factor of four compared with the setup without offloading or with well-known computation offloading frameworks. Finally, we verify the feasibility of our framework in a real-world environment and its scalability to swarms of hundreds of robots.

Data Availability
The input public data set used by four robots in case 1, Cooperative SLAM, is available at https://vision.in.tum.de/data/datasets/rgbd-dataset/download.

Conflicts of Interest
The authors declare that they have no conflicts of interest.