Behavior Trees for Smart Robots Practical Guidelines for Robot Software Development

Behavior Trees are a promising approach to model the autonomous behaviour of robots in dynamic environments. Behavior Trees represent action selection decisions as a tree of decision nodes. The hierarchy of these decision nodes provides the planning of actions of the robot including its reactions on exceptions. Behavior Trees enable flexible planning and replanning of robot behavior while supporting better maintainable decision-making than traditional Finite State Machines. This paper presents an overview of lessons, which we have learned when applying Behavior Trees to various autonomous robots. We present these lessons as a sequence of steps that is meant to support robot software practitioners to develop their systems.


Introduction: Need for Embodied Intelligence
Robots are designed to get a particular task done in the real world such as manoeuvre along shop oors in warehouses, or to support caring sta in hospitals. ey are made to perceive the world using speci c sensors, such as camera's, laser scanners, pressure sensors, and to act in this world using speci c actuators like grippers, arms, and wheels [1]. In general, an autonomous robot has a control unit that takes sensor data as input, builds a model of the world, and selects appropriate actions for the situation it thinks it is in. To act in its world the robot has a set of built-in skills [2], such as the ability to move around and to pick and place objects. (see Figures1 and 2) e decision-making intelligence of an autonomous robot is in its action selection mechanism that determines which skill to activate for which state of the world. e action selection mechanism can be seen as the centre of the robot brains. It determines its behavior and with that its usefulness.
Designing a robust and maintainable action selection policy is not an easy task. Not only the primary task of the robot needs to be encoded as a sequence of actions but also all possible real-world states must also be taken into account. In general, a policy will comprise a combination of a deliberative plan of actions and a set of actions that are triggered only when the state of the world requires it.
As the real world is too complex to deal with in full detail, it is common practice for roboticists to make abstractions of it. Every detail that is not relevant for the task at hand is abstracted away. In this paper, we will use the following abstractions: (i) We represent the state space of the world using a collection of boolean conditions. Each condition is a predicate that the robot can check for its truth value.
In this way, we basically divide the complex and analogue state-space of the world into a limited amount of discrete states. (ii) Complex actuation algorithms ("skills") of the robot are abstracted as a collection of actions A that can bring the world (with high probability) from a precondition to a postcondition. (iii) We assume that the conditions can be checked instantaneously while actions may take some (unknown) time to complete.
(iv) e mapping of states S to actions A is represented as a so-called action selection policy π: S ⟶ A.
Action selection policies can be implemented in di erent ways. A traditional way to implement such a policy is by using imperative programming language constructs, such as statement sequences, if-then-else, and while constructs. e example in Figure 3 shows a procedural policy, speci ed using pseudocode, for an industrial robot that has to move objects. e example robot has four actions that can be selected: "move_to_object," "grab_object," "move_to_goal," and "relea-se_object." Something might go wrong, and the robot should "stop." is results in a fairly simple ow depicted in the left part of the gure. However, things get complicated when sensor inputs are taking into account as well, see right part of Figure 3.
In general, action selection policies are strongly in uenced by the environment in which the robots have to work. Modern, autonomous robots usually have to operate in unstructured environments and have lots of sensors to deal with. Coding an action selection policy in the traditional, procedural way leads to complex code that is hard to develop, maintain, and improve. erefore, we prefer to model robot behavior explicitly instead of implicitly coding it in some programming language. Our preferred way of modelling is using Behavior Trees (BTs) [3,4]. BTs generalize Sequential Behavior Compositions, the Subsumption Architecture, and Decision Trees [5]. BTs are an e cient as well as exible way of creating complex systems that are both modular and reactive.
ese properties are crucial in many applications, which have led to the spread of BTs from computer game programming to many branches of AI and Robotics [6]. Several authors have already shown that Behavior Trees are a promising approach to model autonomous behaviour in dynamic environments [6][7][8][9][10][11], [25,27].
A growing amount of literature discusses fundamental aspects and applications of behavior trees for robotics [3,6,7]. Several authors provide formal de nitions of Behavior Trees [3,4] or propose extensions [12]. Others provide algorithms to implement goal-directed task plans using behavior trees and provide convergence proofs [13][14][15][16][17]. e goal of this paper is to bridge the gap between fundamental research and practical application of Behavior Trees for action selection in autonomous robots. We provide guidelines how to construct BTs to implement reactive, robust, and maintainable robot behavior. We also provide advice how to build a exible BT execution infrastructure. Section 2 introduces the BT formalism and its main syntax. Section 3 provides an overview of advantages and disadvantages of using BTs. e remaining sections provide lessons we learned during BT application (Section 4) and are aimed to help robot software engineers in building a exible execution architecture for BTs (Section 5). Leaf nodes represent the world conditions that are to be checked and the actions that can be selected:

BT Syntax and Semantics
(i) Condition-Check the state of the world (model).
(ii) Action-Act upon the world using a particular skill.
In practice, most BT libraries have all kinds of variations of the above introduced ow control nodes and also allow to extend the basic node set with domain-speci c nodes [7]. Action and condition nodes are mostly domain and application speci c. Figure 4 shows an example of a simple, intuitive BT of a robot that has to pick an object, move it to a destination, and place the object at that location. e action selection policy of the robot is fully de ned by the structure and semantics of the ow control and leaf nodes of the BT. e hierarchy of decision nodes concisely speci es all possible sequences of actions of the robot, and how these sequences are in uenced by conditions. Action selection ('robot brain')

World model
Actions Skills   Journal of Robotics

Sensors Actuators
To be able to execute a BT, each node supports a "tick" method. When this method is called (by its parent node), the node is activated. If a node needs more time than one tick to execute, the tick method returns status "RUNNING" to its parent. When the node completed its task, its tick method returns "SUCCESS" if its outcome was positive or "FAIL-URE" otherwise.
A BT is executed by calling the tick method of the root node at a certain frequency. Ticking the BTneeds to continue while its root node returns "RUNNING" and stop when the root node returns SUCCESS or FAILURE.
Each ow control node passes the tick on to its child nodes, one by one, or in case of a parallel node all at the same time. Eventually the tick reaches a leaf node. e tick allows this leaf node to execute itself, i.e., to check a condition, or to apply a skill. A ow control node that receives a status "RUNNING" from a child will return "RUNNING" to its parent. How it reacts to receiving "SUCCESS" or "FAIL-URE" status from a child depends on its type.
It is important to note that Sequence and Fallback nodes tick their children one by one. is automatically implies a priority: the rst child (from the left) is run before the second child and so on. is can be used to give certain actions higher priority than others.
A Fallback node works like a logical "OR" function, but instead of checking the status of its children all at once, it ticks its children one by one, starting with the rst, left-most child. Fallback is often used for trying prioritized alternatives that, when executed, have the same e ect, i.e., would result in the same postcondition.
A Sequence node works like a logical "AND" function, but again checking its children one by one instead of all at the same time. A Sequence enforces an all or nothing transaction. It can be used to run a sequence of actions as a transaction, or to always check a precondition before starting an action.

BT Trade-Offs
Behavior Trees provide an alternative to Hierarchical State Machines (HSMs). Both are equally expressive. We perceived the following advantages when applying BTs [3]: (i) Modularity, Reusability-BTs are modular on all scales. Any subtree from a BT or even any complete BT can be reused as a subtree in another BT. (ii) Readability-eir tree structure and modularity makes BTs well readable and therefore better analysable and understandable by humans. (iii) Maintainability-e tree structure of BTs is better maintainable than the transitions in HSMs. Decisions in BTs are represented explicitly as nodes, and are not hidden in state transition conditions. (iv) Learnability-BTs can be learned by (machine) learning programs [3] or generated from a formalized plan [15].
On the negative side, we should mention: (i) Veri ability-BTs are not yet supported by formal veri cation tools, contrary to HSMs. is shortcoming can, however, be addressed by translating the BT into an HSM and then conducting the veri cation, see, e.g., the Carve project (https://carve-robmosys. github.io/results/).

A Practical Approach to Define BTs
is section aims at providing a structured approach for robot practitioners for constructing robust and reactive BTs. We will describe four steps. ese steps result from our lessons learned when applying BTs in robot research projects at Fontys University of Applied Sciences. e steps we will elaborate on are: (1) Describe context and purpose of the robot.
(2) Basic plan-describe a "good weather" BT.   Specific idioms, needed to go from step 2 to 3, are introduced as an intermezzo. We illustrate the steps by means of a running example. e example that we have chosen is a mobile robot that picks up objects and places them at some goal location.

4.1.
Step 1: Describe Context and Purpose of the Robot. Before even starting to create a BT, a context description of the robot application helps to identify the relevant aspects to be covered by your BT. inking about the system context first is a general good design principle, which is, e.g., advocated by the C4-model (https://c4model.com/). e following aspects are helpful when defining the context for a robot application: system goal, actions, scenarios, and states; inspired by Winikoff and Padgham [18].
e system goal describes what the robot should do. e goal for our robot is moving objects to a goal location. is should start the thinking process. What actions does the robot need to perform? Which world and robot states should we be able to check? What action to execute in which state? e actions describe which action skills are needed by the robot to achieve its system goal? In case of our example robot, obvious actions will be "move to object," "pick object," "move to the goal," and "place object." In reality, there are usually many more actions to consider.
A scenario describes the dynamics of the robot, by linking the actions into a sequence, like first moving to an object object, then picking it up, then moving to the goal location, and finally placing it there. Although writing down a scenario might look straightforward, in practice often problems pop up when you think about all situations that can interrupt the normal sequence of actions, e.g., when our robot finds itself at a closed door or when its battery is empty. Do always take non-functional requirements, like power or safety, into account when thinking about scenarios. Being aware of contingencies, and thinking in terms of "bad weather" behavior for the robot, does in general help to define a robust BT (see step 3). In general the "good weather" scenario is just one of the scenarios to consider.
States are related to the scenarios because they define when actions are eligible for execution. In our example, the most important state of the robot is its position in the world. Dependent on its position (at origin, at goal), appropriate actions can be conducted, like: picking at origin, placing at goal. In general, there are many more states to consider than just the robot position, such as the location of the objects, the state of the robot gripper, or the state of the robot battery. Being aware of which states are relevant helps to refining the BT (see step 2).

4.2.
Step 2: Basic Plan, Define a "Good Weather" BT. When the context and purpose of the robot are defined, it is time to draw a first BT to describe the task that the robot has to do as a sequence of actions. is procedure should result in a first, intuitive version of the BT which should work when there are no disturbances, a so-called "good weather" BT.
In our running example, our BT is identical to the BT of Figure 4, a simple sequence of actions: "MoveToGoal," "PickObject," "MoveToGoal," and "PlaceObject." is is the same sequence that we would write down as a programmer.
Nothing gained yet by using a BT. However, the "good weather" scenario is only valid under the following assumptions: (1) e state of the world at the start is as required by the scenario. (2) All actions do eventually complete successfully.
(3) No external agent does interfere by changing the world state, while the scenario is running.
In practice, it is unlikely that all these assumptions hold true; in our example: (1) ere might not be an object at the pickup location.
(3) e robot might lose the object it has picked. (4) e robot might run out of (battery) power. (5) An external agent, such as a human or another robot, might take the object away and puts it at the goal location.
is implies that many more possible scenarios should be incorporated in our BT. In general, a good BT should be a concise model that describes all the possible scenarios, i.e., all possible sequences of actions. In order to select between various scenarios, we need to add conditions to the BT, i.e., we need to check the actual state of the world (including the robot itself ) to decide which action is eligible for execution.

Intermezzo: Adding Checks (Idioms) for Robustness.
e BT that was derived in step 2 encodes a "good weather" scenario and assumes that all actions go well. It does not react on any disturbances and just gives up if one of the actions fails. In the previous step, the robot is modelled by ordering the actions as then . . . else, but not taking the if, the condition, really into account. ese ifs, the conditions or guards, are added to ensure the appropriate context to execute the action. We can do better than that by adding checks to enrich the BT with other possible ways to execute to make the BT more robust.
Several checks can be added. A precondition check can be added to the action (or subtree) using a Sequence node, as depicted in Figure 5 (left). A postcondition check can be added using a Fallback node as depicted in Figure 5 (right).
A specific pattern that is constructed from these idioms is the postcondition precondition action (PPA) pattern [3]. Several actions can have the same postcondition but with different preconditions. By combining these alternatives using a Fallback node, we can improve the robustness of the BT. e PPA pattern is a subtree which combines fallback to alternative actions with precondition and postcondition checks. (see Figure 6)

4.4.
Step 3: Create a Robust Plan. Now, the idioms-to achieve robustness-have been introduced, we can apply them.
Step 3 is about the creation of a robust BT, to execute a deliberate, goal-oriented plan in an e ective and e cient way. e way to generate such a plan in the form of a BT can be done manually as well as automatically by a deliberative planner.
A straight forward way to de ne a robust BT is by using the "Robust Logical-Dynamical Chain" pattern [17] Figure 7, which is also described as "Implicit Sequence" in [3]: (1) Reverse the order of the actions, de ning most downstream action rst (reading from the left of the tree). (2) Replace the Sequence node by a Fallback node in order to change from an explicit to an implicit sequence. (3) Add a precondition check to each action.
When we apply this pattern on our running example, we reverse the order of the actions, compared to how they were de ned in the naïve BT or basic plan BT, Figure 4. e result is shown in Figure 8.
If the precondition of an action fails, we fall back to the next (more upstream) action, and so on. e order of the actions is reversed because we are primarily interested in running the last, most downstream action, because running that action will give us the outcome we want.
Note that in this case a so-called "ReactiveFallback" node is used. is node always ticks all its children (from left to right), even those that have already returned "SUCCESS" during earlier ticks.
An alternative approach to create a deliberative plan is expanding the BT by using "backward chaining" [13,14,16]. e method here is to iteratively expand preconditions. We start by taking the last, most downstream action, and work our way backwards. e precondition of this action is replaced with the PPA subtree which postcondition is the same as this precondition and so on. (see Figure 9) When applying backward chaining to our running example, the result will be the following robust BT Figure 10.
Martín et al. [15] describe yet another approach to generate a deliberate BT. ey start by generating a plan using a PDDL-based planner [19] and then automatically convert this plan into a robust BT for execution.

Step 4: Add Reactivity to the Plan.
e deliberative plan for the robot ensures that the robot will do its primary task. However, there might be contingencies to deal with. A robot can only ful ll its task when it remains safe and in good working condition. While it is executing its deliberate plan, problems may occur, like an empty battery. We do not know if and when these problems will occur, but we should take care of them as soon as possible. Handling contingencies is of higher priority than executing a deliberate plan. e Fallback node already handles this priority. Its leftmost child is ticked rst and thus has the highest priority. From left to right, priority goes down. Contingencies therefore have to be added to the left of the existing deliberative subtree as depicted in Figure 11.
In our running example, we introduce reactivity to the deliberative plan by adding a check on battery power level of the robot. If the power level will drop below a minimum level, the robot is triggered to move to its charging station and starts charging itself. is functionality is added as rows 3 and 4 in Figure 12, which is a supplement on the left compared to the deliberate BT that was earlier created in Figure 10. e resulting Figure 12 presents a robust and reactive BT.

BT Execution Architecture
Now, the basic steps for creating a robust BTs have been introduced and the BT-models have to be executed by real code.
is section deals with the construction of a BT Precondition-specifies what must be true to execute a skill. It is the condition that applies to a sequence of skills. In pseudocode: if pre do action.
Post condition-specifies what will be true when a skill is executed. It applies to a fallback of skills: if the condition is not fulfilled another action should be done. In pseudocode: if not post do action. execution architecture for the successful application of BTs in robots. Robot software is typically divided into three tiers [20], comprising a deliberator (planner) tier, a task executor (sequencer) tier, and a controller (skills) tier. e deliberator produces high-level task plans. e executor is responsible for the execution of those plans, by sequencing primitive skills. In a reactive robot architecture [8,21,22], the executor is in charge of driving the overall system by requesting new plans from the deliberator and dispatching subtasks to speci c robotic controller (skill) modules. In this case, the executor considers the deliberator as one of the available skills.
e three-tier architecture can be applied at multiple abstraction levels. Some skills might be quite complex and will require a three-tier structure themselves. For instance, a "NavigateTo" skill needs a motion planner to calculate a path in the map and a controller to follow that path while avoiding collisions and a recovery skill that gets the robot out of navigation problems. A BT-based executor could be used to coordinate these subskills [8].
A deliberator is essential in applications where plans need to be constructed at run time. Runtime planning takes time but does also provide optimized and e cient behavior. For a lot of applications, a good enough plan can be constructed at design time, either manually or using an automatic planner. is holds in particular for reactive, behaviorbased robotic applications [23]. Design time generated plans have the advantage that they can be validated or veri ed before deployment. Flexibility can be added to such a plan by runtime switching between prede ned subplans.
Skills [2,12,24] are basic building blocks of task plans. A robotic manipulator arm typically has skills like "Pick" and "Place" to be able to pick-up an object and place it somewhere. A mobile robot typically has skills like "NavigateTo" to plan a path and follow that path while avoiding objects. Basic skills related to object detection, navigation, and arm motion are common to a lot of robots. Added to those common skills, speci c application domains like manufacturing [24] or robot soccer [25] require domain speci c skills. e purpose of a skill is changing the state of the world what the robot lives in. By using its sensors, the robot can discover which objects are present in its world and where these objects are probably located and what their status is. Using sensor fusion and prediction, the robot can derive a WorldModel out of these observations, containing all beliefs that the robot has about its world. Skills can be implemented in a reusable manner by parameterization. Objects and locations can for instance be passed to skills as names of objects or locations that are attached to physical objects and locations by a lookup in the WorldModel.
For task planning and execution, it is important to note that a skill has a precondition to be successful and an e ect, i.e., a prediction of the postcondition after its execution. For the use of automatic planners, skills with their precondition and e ect are usually formally de ned using, e.g., PDDL [19]. e executor (sequencer) is the software module that coordinates the execution of a task plan by sequencing skills. In our case, the executor (sequencer) is a BT engine, i.e., a software module that instantiates a BT in memory and ticks it. e structure of the BT may be hardcoded in the source code but may also be derived dynamically from a specication le (in XML or JSON format). We prefer the latter.
A common platform for robot software development is the Robot Operation System (ROS (https://www.ros.org/)) [26]. ROS is a set of software libraries and tools for building robot applications. ROS2 [27], the next generation of ROS, is well suited for industrial applications. A successful BT execution architecture should be able to t in this infrastructure.
is open source library is currently being used in essential ROS2 packages, such as the Nav2 (https://navigation.ros.org) navigation stack [8]. It is a C++ based library that reads and runs behavior trees speci ed in XML. ese XML les can be created using any text editor but also a nice graphical editor is provided, called Groot (https://github.com/BehaviorTree/ Groot). is editor will not only be applicable to edit a Behavior Tree but does also monitor a BT-execution in real-time or enables replay of a trace of its execution.
When applying the BehaviorTree.CPP for your BTimplementation, all of the required BT control-, decorator-, condition-and action nodes need to be implemented in C++. A couple of frequently used BT nodes are provided. When other application or domain speci c nodes are needed, they must be implemented as C++ plugins. Various base-classes are provided to make it easier to construct such new behavior tree plugins. A good practice is to make these plugins as thin as possible and make them delegate the work to specialized servers.
BT engines do usually support some kind of a blackboard. is way parameters can be set by one node and read by another. BehaviorTree.CPP action and condition nodes can exchange data via named and typed ports on a blackboard. All input and output ports, including their name and type, must be declared in the source code (header le) of a node.
In ROS, an application is a computation graph of interacting ROS nodes. ROS nodes interact by using any of the following interface concepts: topics, services, and actions. Topics are meant to be used for publish/subscribe communication of continuous data streams (e.g., sensor data, robot state). Services are meant for remote procedure calls that terminate quickly, e.g., for querying or setting the state of a node or for calling a quick calculation. Actions should be used for activating long running activities such as making a plan or moving a robot to goal pose. An action can be preempted, i.e., cancelled before completion (see Figure 13).
A BT sequencer that is implemented in ROS should be a node that runs the BT engine. It enables the BT engine to read the BT structure from an XML le and instantiates the nodes and it makes it possible to tick the root node of the BT with a xed frequency, which will be somewhere between 10 and 1000 Hz. e BT action and condition plugins would act as action clients of ROS Action Servers or service clients of ROS Service Servers. Action and Service Servers would be implemented in other nodes that could have been written in C++ but could be implemented in Python as well. is is exactly how the ROS2 Nav2 navigation stack works [8].

Summary
Behavior Trees (BTs) are very well suited to model and implement the robot behavior. eir modularity is a big advantage because it enables robot engineers to quickly develop, adapt, and extend the behavior of a robot. Furthermore, the visual representation of the BTs makes them easier to read and to be modi ed compared to (larger) hierarchical state machines (HSM); in particular, when supported by a graphical editor. is paper provides a hands-on approach for robot software developers to apply BTs in their robot designs by: (1) a way of working for the stepwise construction of robust and reactive BTs and (2) the de nition of a exible architecture to execute BTs for ROS(2) based robots.
Data Availability e data used to support the ndings of the study can be obtained from the corresponding author upon request.

Conflicts of Interest
e authors declare that they have no con icts of interest.