On the Completeness of Pruning Techniques for Planning with Conditional Effects

Pruning techniques and heuristics are two keys to heuristic search-based planning. The helpful actions pruning (HAP) strategy and relaxed-plan-based heuristics are two representatives of these methods and are still popular in state-of-the-art planners. Here, we present new analyses of the properties of HAP. Specifically, we show new reasons for which HAP can cause incompleteness of a search procedure. We prove that, in general, HAP is incomplete for planning with conditional effects if factored expansions of actions are used. To preserve completeness, we propose a pruning strategy that is based on relevance analysis and confrontation, and we show that both relevance analysis and confrontation are necessary. We call it the confrontation and goal relevant actions pruning (CGRAP) strategy. However, CGRAP is computationally hard to compute exactly. Therefore, we suggest practical approximations from the literature.


Introduction
The research of AI planning has advanced to a new level. Pioneers from the AI planning community have developed various practical methods that can solve much larger problems than the toy problems of the early days. Two well-known methods are SAT-based planning [1] and heuristic search-based planning [2]. SAT-based planning translates planning problems into propositional satisfiability problems or more general constraint satisfaction problems (CSPs) [3]. An obvious advantage of the method is that it can exploit the power of fast algorithms from the CSP literature [4][5][6]. On the other hand, the "planning as search" community has been pursuing more informative heuristics to make the search fast [7,8]. While recent studies show that planning-specific heuristics can make SAT-based planning methods more competitive [9,10], the planning as search method has shown its potential in many kinds of planning problems, including classical planning [11,12], conformant planning [13], contingent planning [14], and probabilistic planning [15]. Recent International Planning Competitions (IPCs, http://ipc.icaps-conference.org/) have witnessed the success of heuristic search-based planners: the winners Fast-Forward (FF) [11], LPG [16], SGPlan [17], and Fast Downward [12] and its successor LAMA [18] all employ heuristic search. Two enabling techniques underlying heuristic search-based planning are heuristic functions and pruning techniques. A heuristic function estimates the distance from a state to the goal, while pruning techniques eliminate branches that are safe to ignore. Here, we focus on the pruning techniques.
Helpful actions pruning (HAP) is a pruning strategy developed in FF with the idea of making the search process goal directed [19]. Though it was initially a byproduct of the relaxed-plan heuristic, its notion has been popular and important in the design of top-performing planners, such as Fast Downward [12]. However, the HAP strategy does not guarantee completeness; that is, it may cut branches that can reach the goal. Some of these cases were explained by Hoffmann and Nebel [11]. Nevertheless, here, we uncover a new case in which HAP can cause incompleteness. We study the conditions under which the new case occurs, the way to remedy the HAP strategy, and the cost of doing so. Based on our work, one can gain more insight into why Fast Downward, which employs the helpful transitions strategy, is powerful.
The rest of the paper is organized as follows. In the next section, we introduce some background. Then, we show the incompleteness of HAP and extend it to a more general strategy called goal relevant actions pruning (GRAP), which is complete only for STRIPS planning. In Section 4, we propose our confrontation and goal relevant actions pruning (CGRAP), which is complete both for STRIPS planning and for planning with conditional effects. In Section 5, we discuss some pruning techniques in the literature that can be seen as approximations of CGRAP. Finally, we conclude the paper and discuss some future work.

Background
We will first introduce notations from state space search-based planning, then methods for handling conditional effects, and finally the heuristic function and the HAP strategy in FF.
A planning task is a quadruple T = (P, A, I, G), where P is the set of atoms, A is the set of actions, I ⊆ P is a set of atoms called the initial state, and G ⊆ P is the goal condition that each goal state must fulfill. States are denoted by sets of atoms. We adopt the "closed world" assumption; that is, an atom that is not in a state is false in the state. So, if P = {p, q, r} and a state s = {p, q}, then s is logically equal to p ∧ q ∧ ¬r. An action a is a pair ⟨pre(a), E(a)⟩, where pre(a) is a set of atoms denoting the preconditions of a and E(a) is the set of conditional effects of a. Each conditional effect e ∈ E(a) has the form ⟨con(e), add(e), del(e)⟩, where con(e), add(e), and del(e) are the conditions, add effects, and delete effects of e, respectively. For an action a and a state s, if pre(a) ⊆ s, then we say a is applicable in s. We use App(s) = {a | pre(a) ⊆ s} to denote all the actions that are applicable in s. The execution of a on s, denoted by a(s), results in a state s′, where s′ = s − ⋃_{e: con(e) ⊆ s} del(e) + ⋃_{e: con(e) ⊆ s} add(e) if a is applicable in s, and s′ = s otherwise. A state s is called a goal state if G ⊆ s. A plan for a planning task is an action sequence π = ⟨a_0, a_1, ..., a_{n−1}⟩ that transforms the initial state I into a goal state. We use |π| to denote its length, which is the number of actions in it. Here, we assume that a plan does not have redundant actions; that is, when some action is removed from π, it will no longer be a plan.
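The definitions above can be sketched in a few lines of Python. This is a minimal illustration, not code from any planner: the names `Effect`, `Action`, `fs`, `applicable`, `execute`, and `is_plan` are assumptions introduced here, and `dele` stands for del(e) because `del` is a reserved word in Python.

```python
from typing import FrozenSet, Iterable, NamedTuple, Tuple

Atom = str
State = FrozenSet[Atom]


class Effect(NamedTuple):
    con: FrozenSet[Atom]   # con(e): condition of the effect
    add: FrozenSet[Atom]   # add(e): atoms made true
    dele: FrozenSet[Atom]  # del(e): atoms made false


class Action(NamedTuple):
    name: str
    pre: FrozenSet[Atom]         # pre(a)
    effects: Tuple[Effect, ...]  # E(a)


def fs(*atoms: Atom) -> FrozenSet[Atom]:
    """Hypothetical helper: build a frozenset of atoms."""
    return frozenset(atoms)


def applicable(a: Action, s: State) -> bool:
    """a is applicable in s iff pre(a) is a subset of s."""
    return a.pre <= s


def execute(a: Action, s: State) -> State:
    """Return a(s); an inapplicable action leaves the state unchanged."""
    if not applicable(a, s):
        return s
    fired = [e for e in a.effects if e.con <= s]  # effects whose condition holds
    deleted = frozenset().union(*(e.dele for e in fired))
    added = frozenset().union(*(e.add for e in fired))
    return (s - deleted) | added


def is_plan(plan: Iterable[Action], init: State, goal: FrozenSet[Atom]) -> bool:
    """Check that executing the sequence from init reaches a goal state."""
    s = init
    for a in plan:
        s = execute(a, s)
    return goal <= s
```

For instance, an action with precondition {p}, an unconditional effect adding q, and an effect that deletes p when r holds maps the state {p} to {p, q} and the state {p, r} to {q, r}.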
Actions with conditional effects were introduced in the planning problem description language "Action Description Language" (ADL) [20]. Conditional effects of actions are expressed with the keyword "when". Figure 1 (left) shows an action with two conditional effects: the first effect happens unconditionally, and the second, a negative effect, happens only if a conjunction of two atoms holds in the state where the action is executed. There are mainly three ways of handling conditional effects: full expansion [21], IPP's method [22], and factored expansion [23]. Here, we focus on IPP's method, which is used in FF. The method translates the action in Figure 1 (left) into the form shown in Figure 1 (right). From now on, we refer to planning with actions that have conditional effects as ADL planning.
FF employs a forward state space search framework. Three key techniques of FF are the relaxed-plan-based heuristic (RP), HAP, and the enforced hill-climbing (EHC) algorithm. Here, we focus on RP and HAP. A relaxed plan is extracted from a relaxed version of a planning task where the delete effects of actions are ignored. Specifically, the relaxed version of a planning task T = (P, A, I, G) is T⁺ = (P, A⁺, I, G), where A⁺ = {⟨pre(a), E⁺(a)⟩ | a ∈ A} and E⁺(a) = {⟨con(e), add(e), ∅⟩ | ⟨con(e), add(e), del(e)⟩ ∈ E(a)}. An action sequence π = ⟨a_0, a_1, ..., a_{n−1}⟩ is called a relaxed plan for T if it is a plan for T⁺. For a state s, its heuristic value is the length of a relaxed plan for the planning task T_s = (P, A, s, G). Note that the relaxed plan for T_s is not unique and FF finds one using the Graphplan [24] algorithm. During the process of computing a relaxed plan, FF keeps track of the subgoals generated in the second propositional level of a planning graph, which are saved in a set G_1(s). Helpful actions are the set of actions H(s) = {a | pre(a) ⊆ s ∧ ∃e ∈ E(a) : (con(e) ⊆ s) ∧ (add(e) ∩ G_1(s) ≠ ∅)}. For a state s, the EHC algorithm only considers actions in H(s) and ignores others. This search strategy is called helpful actions pruning. We say that a pruning strategy is complete if using it does not make a complete search algorithm eliminate branches that lead to goal states. As shown in [11], HAP is incomplete for STRIPS planning.
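The delete relaxation and the helpful-actions set can be illustrated as follows. This is a sketch under stated assumptions rather than FF's implementation: the subgoal set G_1(s) is passed in as the argument `g1` instead of being extracted from a planning graph, and the names `Effect`, `Action`, `relax`, and `helpful_actions` are hypothetical.

```python
from typing import FrozenSet, List, NamedTuple, Sequence, Tuple


class Effect(NamedTuple):
    con: FrozenSet[str]   # con(e)
    add: FrozenSet[str]   # add(e)
    dele: FrozenSet[str]  # del(e)


class Action(NamedTuple):
    name: str
    pre: FrozenSet[str]          # pre(a)
    effects: Tuple[Effect, ...]  # E(a)


def relax(a: Action) -> Action:
    """Delete relaxation: keep con(e) and add(e), replace del(e) by the empty set."""
    return Action(a.name, a.pre,
                  tuple(Effect(e.con, e.add, frozenset()) for e in a.effects))


def helpful_actions(actions: Sequence[Action], s: FrozenSet[str],
                    g1: FrozenSet[str]) -> List[Action]:
    """Applicable actions with an effect that fires in s and adds an atom of g1.

    g1 plays the role of the subgoal set G_1(s); computing it is out of
    scope here and it is simply assumed to be given.
    """
    return [a for a in actions
            if a.pre <= s
            and any(e.con <= s and bool(e.add & g1) for e in a.effects)]


# Hypothetical two-action task: only "adder" contributes to the subgoal q.
adder = Action("adder", frozenset({"p"}),
               (Effect(frozenset(), frozenset({"q"}), frozenset()),))
deleter = Action("deleter", frozenset({"p"}),
                 (Effect(frozenset(), frozenset(), frozenset({"r"})),))
helpful = helpful_actions([adder, deleter], frozenset({"p", "r"}), frozenset({"q"}))
```

Note that `deleter` can never be helpful under this definition, whatever `g1` is, because it has no add effects. This observation is what the later sections build on.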

Complete Pruning Strategy for STRIPS
In this section, we will extend HAP to a complete strategy for STRIPS that we call goal relevant actions pruning (GRAP). We will then prove the completeness of GRAP for STRIPS and show that GRAP is incomplete for ADL planning. As a result, we will extend GRAP to a complete strategy for ADL planning in the next section.

Goal Relevant Actions.
Helpful actions for a state s are actions in App(s) that are relevant for adding the subgoals in G_1(s). To obtain completeness, our goal relevant actions for s are actions in App(s) that are relevant for adding every (sub)goal generated by the Graphplan algorithm.
Definition 1 (Dependency among Facts). For two facts l, l′ ∈ P and a set of actions A, l is dependent on l′ with respect to A (denoted as
Definition 2 (Dependency between Facts and Actions). For an atom l ∈ P and an action a ∈ A, l is dependent on a with respect to A (denoted as
We note that Definitions 1 and 2 capture the facts and actions relevant to a goal. Specifically, if we are going to reach a goal l, then the actions on which l is dependent are relevant, and further, actions that add facts on which l is dependent are also relevant. Note that in the previous definitions we use "dependent," instead of "relevant," to indicate a directional relation. Now, we are ready to introduce the notion of goal relevant actions (GRA). These are the actions that a search algorithm could explore for reaching some goal state. Actions that are not relevant are to be ignored.
We propose the following pruning strategy based on GRA. For any search algorithm and any state s, we only consider actions in REL^{G,⊲}_A(s) and ignore those in App(s) − REL^{G,⊲}_A(s). We call the strategy GRA pruning (GRAP). We will prove that GRAP is a generalization of HAP and is complete for STRIPS planning.
Proposition 4. For any planning task T = (P, A, I, G) and any state s of T, H(s) ⊆ REL^{G,⊲}_A(s).

As the directional relation ⊲
Next, we will prove that GRAP is complete for STRIPS planning.

Proposition 5. GRAP is a complete pruning strategy for STRIPS planning.
Proof. Let T = (P, A, s, G) be a STRIPS planning task, and let π = ⟨a_0, a_1, ..., a_{n−1}⟩ be one of the plans for T. Note that π is not redundant. As we restrict to STRIPS planning, each action a ∈ A has only one conditional effect, which is denoted by e(a). We will prove that a_0 ∈ REL^{G,⊲}_A(s). We will compute DEP^⊲_A(G) with i = n − 1, ..., 0. Initially, DEP^⊲_A(G) = G. For the action a_{n−1}, if its effect e(a_{n−1}) does not add a fact in G, then π with a_{n−1} removed will still be a plan. This contradicts the assumption that π is not redundant. Following Definition 1, DEP^⊲_A(G) = DEP^⊲_A(G) ∪ pre(a_{n−1}). Similarly, for a_i (0 < i < n − 1), a_i must add a fact in DEP^⊲_A(G) ∪ pre(a_{n−1}) ∪ ⋯ ∪ pre(a_{i+1}); otherwise, π with a_i removed is still a plan, which contradicts the assumption that π is not redundant. Therefore, DEP^⊲_A(G) = DEP^⊲_A(G) ∪ pre(a_{n−1}) ∪ ⋯ ∪ pre(a_{i+1}). For a_0, as π is not redundant, it must hold that e(a_0) ∩ (G ∪ pre(a_{n−1}) ∪ ⋯ ∪ pre(a_1)) ≠ ∅. Following Definition 2, a_0 is in DEP^⊲_A(G). And according to Definition 3, a_0 ∈ REL^{G,⊲}_A(s) holds. As π is an arbitrary plan, we finish the proof.
Note that the above proof cannot be adapted to prove the completeness of HAP, as helpful actions are defined with respect to a specific relaxed plan. In other words, the arbitrariness of plans is not guaranteed.
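The backward argument of the proof suggests a simple fixed-point computation, sketched below. It rests on an assumed reading of the dependency relation: an atom depends on an action that adds it, and on the preconditions of such an action, transitively. The names `StripsAction`, `dep_actions`, and `goal_relevant_actions` are hypothetical.

```python
from typing import FrozenSet, NamedTuple, Sequence, Set


class StripsAction(NamedTuple):
    name: str
    pre: FrozenSet[str]
    add: FrozenSet[str]
    dele: FrozenSet[str]


def dep_actions(actions: Sequence[StripsAction], goal: FrozenSet[str]) -> Set[str]:
    """Names of actions on which some goal atom depends (assumed reading).

    Start from the goal atoms; whenever an action adds a relevant atom,
    mark the action relevant and make its preconditions relevant too.
    """
    relevant_facts = set(goal)
    relevant: Set[str] = set()
    changed = True
    while changed:
        changed = False
        for a in actions:
            if a.name not in relevant and a.add & relevant_facts:
                relevant.add(a.name)
                relevant_facts |= a.pre
                changed = True
    return relevant


def goal_relevant_actions(actions: Sequence[StripsAction], s: FrozenSet[str],
                          goal: FrozenSet[str]) -> Set[str]:
    """Relevant actions that are also applicable in s."""
    dep = dep_actions(actions, goal)
    return {a.name for a in actions if a.name in dep and a.pre <= s}


# Hypothetical task: a1 then a2 reaches g; a3 and a4 are irrelevant to g.
acts = [
    StripsAction("a1", frozenset({"p"}), frozenset({"q"}), frozenset()),
    StripsAction("a2", frozenset({"q"}), frozenset({"g"}), frozenset()),
    StripsAction("a3", frozenset({"p"}), frozenset({"x"}), frozenset()),
    StripsAction("a4", frozenset({"p"}), frozenset(), frozenset({"r"})),
]
dep = dep_actions(acts, frozenset({"g"}))
rel = goal_relevant_actions(acts, frozenset({"p"}), frozenset({"g"}))
```

In this sketch the action `a4`, which only deletes an atom, is never marked relevant. That is harmless for STRIPS, but it is exactly where this kind of add-based relevance fails once conditional effects are allowed.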

GRAP is Incomplete for ADL Planning.
Hoffmann and Nebel [11] pointed out that HAP is incomplete because the Graphplan algorithm is greedy in computing shorter relaxed plans. Therefore, this source of incompleteness could be eliminated if we used an algorithm other than Graphplan to compute relaxed plans. The method proposed by Hoffmann and Nebel [11] works in the following way: for a state s and a relaxed planning task T⁺ = (P, A⁺, s, G), they expand the planning graph to the level |A⁺| and collect subgoals backward from the level |A⁺| to level 1. Specifically, let G_i be the subgoals at level i. Following the method, G_1 is the union of the subgoals of every relaxed plan for T⁺ at level 1. We call actions in {a | a ∈ App(s), ∃e ∈ E(a) : add(e) ∩ G_1 ≠ ∅} "full helpful actions" (FHAs). Intuitively, the FHAs are equivalent to our definition REL^{G,⊲}_A(s). However, the former is more procedural, and ours is more formal. We will use our definition to develop a new pruning strategy for ADL planning. Before that, we will show, by example, that both the FHA pruning (FHAP) strategy and GRAP are generally incomplete for ADL planning.
Example 6. Consider a planning task with five atoms, an initial state that contains three of them, a goal that consists of two atoms, and two actions a_1 and a_2. The action a_1 has one precondition and two conditional effects: the first (denoted as e_0) has no condition and only adds one atom, and the second (denoted as e_1) has one condition, adds one atom, and deletes one atom. The action a_2 has one precondition and a single unconditional effect that has no add effects and falsifies one atom.
Next, let us consider the plan for Example 6. The difference between the goal and the initial state is that one goal atom does not hold in the initial state. To make it true, we would use action a_1. However, executing a_1 on the initial state results in a state s′ in which an atom required by the goal no longer holds. After that, there is no action that can transform s′ into a goal state. One could notice that this dead end is due to the fact that e_0 and e_1 both happened and e_1 destroyed that atom. If we could prevent e_1 from happening, then we would succeed in finding a plan. This is the idea proposed by Weld [25], which is called "confrontation." It is easy to see that with "confrontation" as a choice, we can find a plan: first apply a_2 to falsify the condition of e_1, and then apply a_1.
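The dead end and its repair by confrontation can be reproduced on a small instance with the same shape. The instance below is hypothetical: the atoms p, q, g1, g2 and the exact effects are illustrative choices, not those of Example 6, and the filter `adds_something` is only a crude stand-in for add-based relevance.

```python
from collections import deque
from typing import Callable, FrozenSet, NamedTuple, Optional, Sequence, Tuple


class Effect(NamedTuple):
    con: FrozenSet[str]
    add: FrozenSet[str]
    dele: FrozenSet[str]


class Action(NamedTuple):
    name: str
    pre: FrozenSet[str]
    effects: Tuple[Effect, ...]


def execute(a: Action, s: FrozenSet[str]) -> FrozenSet[str]:
    """Apply every effect of a whose condition holds in s."""
    fired = [e for e in a.effects if e.con <= s]
    deleted = frozenset().union(*(e.dele for e in fired))
    added = frozenset().union(*(e.add for e in fired))
    return (s - deleted) | added


def search(init: FrozenSet[str], goal: FrozenSet[str], actions: Sequence[Action],
           keep: Callable[[Action], bool]) -> Optional[Tuple[str, ...]]:
    """Breadth-first search that only expands applicable actions accepted by keep."""
    frontier = deque([(init, ())])
    seen = {init}
    while frontier:
        s, plan = frontier.popleft()
        if goal <= s:
            return plan
        for a in actions:
            if a.pre <= s and keep(a):
                t = execute(a, s)
                if t not in seen:
                    seen.add(t)
                    frontier.append((t, plan + (a.name,)))
    return None


E = frozenset()
a1 = Action("a1", frozenset({"p"}),
            (Effect(E, frozenset({"g2"}), E),                 # e0: always adds g2
             Effect(frozenset({"q"}), E, frozenset({"g1"}))))  # e1: if q, deletes g1
a2 = Action("a2", frozenset({"p"}), (Effect(E, E, frozenset({"q"})),))  # deletes q

init, goal = frozenset({"p", "q", "g1"}), frozenset({"g1", "g2"})


def adds_something(a: Action) -> bool:
    """Stand-in for add-based relevance: a2 has no add effects, so it is pruned."""
    return any(e.add for e in a.effects)


pruned = search(init, goal, [a1, a2], adds_something)  # None: a1 alone destroys g1
full = search(init, goal, [a1, a2], lambda a: True)    # ("a2", "a1")
```

With the add-based filter the search space contains no goal state, whereas the unpruned search finds the plan that first falsifies the condition of e1 and then applies a1.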
In Example 6, the action that performs the confrontation is relevant for reaching a goal state. However, the pruning strategies HAP, GRAP, and FHAP all ignore it. Generalizing this observation, we have the following result.

Proposition 7. Let T = (P, A, s, G) be an ADL planning task and let the set of plans for T be Π(T). If every plan π ∈ Π(T) contains an action a of the form (pre(a), {(con(e_0), ∅, del(e_0)), ..., (con(e_m), ∅, del(e_m))}), then HAP, FHAP, and GRAP are incomplete for T.

Proposition 7 holds because an action that does not have any add effects will never be considered "helpful" or "relevant." As a result, actions of this kind will be mistakenly ignored.
From Example 6, we can see that actions that make "confrontations" are also "helpful" and "relevant." Following this direction, we extend GRAP to a new pruning strategy that is complete for both STRIPS and ADL planning.

Complete Pruning Strategy for ADL Planning
We first introduce the notion of "confrontation and goal relevant actions" and then prove that its corresponding pruning strategy CGRAP is complete for ADL planning.

Definition 8 (Confrontational Dependency among Facts). For two atoms l, l′ ∈ P and a set of actions A, l is confrontationally dependent on l′ with respect to A (denoted as
According to the previous two definitions, one could notice that actions that add or delete an atom l are considered relevant to l.

Definition 10 (Confrontation and Goal Relevant Actions). Given a planning task T = (P, A, I, G), the actions that are confrontationally relevant to l ∈ P are DEP^⊴_A(l) = {a | a ∈ A, l ⊴_A a}, and the actions that are confrontationally relevant to G ⊆ P are DEP^⊴_A(G) = ⋃_{l ∈ G} DEP^⊴_A(l). Given a state s, the "confrontation and goal relevant actions" for s are REL^{G,⊴}_A(s) = DEP^⊴_A(G) ∩ App(s).
The pruning strategy that considers only the actions in REL^{G,⊴}_A(s), that is, ignores App(s) − REL^{G,⊴}_A(s), is called confrontation and goal relevant actions pruning (CGRAP). In the following proposition, we will prove that CGRAP is complete for ADL planning.
Proposition 11. The CGRAP strategy is complete for ADL planning.
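A rough way to picture the confrontational relevance closure is the fixed point below. It is a sketch under an assumed reading of the dependency relation, namely that an effect is relevant as soon as it adds or deletes a relevant atom, and that the precondition of its action and its own condition then become relevant as well. The names `confrontational_dep` and `cgra`, and the four-atom instance, are hypothetical.

```python
from typing import FrozenSet, NamedTuple, Sequence, Set, Tuple


class Effect(NamedTuple):
    con: FrozenSet[str]
    add: FrozenSet[str]
    dele: FrozenSet[str]


class Action(NamedTuple):
    name: str
    pre: FrozenSet[str]
    effects: Tuple[Effect, ...]


def confrontational_dep(actions: Sequence[Action], goal: FrozenSet[str]) -> Set[str]:
    """Names of actions that add or delete a (transitively) relevant atom."""
    relevant_atoms = set(goal)
    relevant: Set[str] = set()
    changed = True
    while changed:
        changed = False
        for a in actions:
            for e in a.effects:
                if (e.add | e.dele) & relevant_atoms:
                    new_atoms = (a.pre | e.con) - relevant_atoms
                    if a.name not in relevant or new_atoms:
                        relevant.add(a.name)
                        relevant_atoms |= new_atoms
                        changed = True
    return relevant


def cgra(actions: Sequence[Action], s: FrozenSet[str], goal: FrozenSet[str]) -> Set[str]:
    """Confrontationally relevant actions that are applicable in s."""
    dep = confrontational_dep(actions, goal)
    return {a.name for a in actions if a.name in dep and a.pre <= s}


E = frozenset()
a1 = Action("a1", frozenset({"p"}),
            (Effect(E, frozenset({"g2"}), E),
             Effect(frozenset({"q"}), E, frozenset({"g1"}))))
a2 = Action("a2", frozenset({"p"}), (Effect(E, E, frozenset({"q"})),))
a3 = Action("a3", frozenset({"p"}), (Effect(E, frozenset({"x"}), E),))

kept = cgra([a1, a2, a3], frozenset({"p", "q", "g1"}), frozenset({"g1", "g2"}))
```

Here `a2` is kept because it deletes q, the condition of the effect that threatens g1, while `a3` is still pruned. The closure is an over-approximation and says nothing about the cost of computing the exact set.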
The completeness of CGRAP for ADL planning comes at a cost. One reason is that the pruning power of CGRAP is weak. In other words, CGRAP may cut only a rather limited number of branches of a search space. In addition, computing CGRAP is PSPACE-hard, as deciding irrelevant actions for a planning task is PSPACE-hard [26]. Therefore, it is practical to collect confrontation and goal relevant actions in an approximate way. In the next section, we will discuss some methods from the literature that fall into this scope.

Discussion
We will first review the helpful transition notion developed in Fast Downward [12] and then the delayed partly reasoning procedure [27].
Fast Downward is a representative planner that uses the SAS+ planning formalism [28], which supports multivalued variables. It translates a propositional planning problem into an SAS+ planning problem by utilizing an invariants analysis procedure [12]. After that, Fast Downward builds a causal graph that involves all the variables and a domain transition graph for each variable. Dependencies among variables are reasoned through the causal graph, and dependencies among values of a variable are reasoned through the corresponding domain transition graph. For details of the two kinds of graphs, please refer to [12]. The goal distance of a state s is the sum of the goal distances of the variables. For each variable, its goal distance is computed by solving a shortest path problem formulated on the corresponding domain transition graph. In the problem, the source node is the value the variable currently takes, and the target node is the value that the goal conditions require. When such a path is obtained, the transition associated with the first edge is labeled as a "helpful transition." Note that transitions are conditional effects of actions. So, we can collect "helpful actions" based on "helpful transitions." Here we note that "helpful transitions" consider both goal relevant actions and actions that are for confrontations. The ability to collect actions that are helpful for confrontations originates from the multivalued variable representation. In the representation, the change from one value to another models both the add and the delete effects of an action on a variable. As a result, actions have only one kind of effect, which is considered by Fast Downward to collect "helpful transitions." In contrast, propositional planners, such as FF, consider only the add effects of actions for collecting "helpful actions." Therefore, the "helpful transitions" strategy can be considered as an approximation of CGRAP.
The "delayed partly reasoning procedure" is proposed by Cai et al. [7]. This procedure is implemented on top of the propositional planning formalism. In the first action level of a planning graph, the procedure tracks harmful inducements with respect to an order of conditional effects and collects actions that confront the inducements. An inducement means that one conditional effect e induces another conditional effect e′. It is harmful if e′ deletes some atoms previously added by other conditional effects. As the procedure only operates on the first action level and works with a predefined order, it is an approximation of CGRAP, and its computational cost is therefore not high.
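The labeling of a helpful transition can be pictured with a small shortest-path sketch. It is a simplification and not Fast Downward's implementation: the domain transition graph is assumed to be a plain dictionary from a value to its outgoing `(next_value, transition_label)` pairs, all transitions have unit cost, and conditions on other variables are ignored. The function name `helpful_transition` and the example values are hypothetical.

```python
from collections import deque
from typing import Dict, List, Optional, Tuple

# value -> list of (next_value, transition_label); an assumed toy encoding
DTG = Dict[str, List[Tuple[str, str]]]


def helpful_transition(dtg: DTG, current: str, target: str) -> Optional[str]:
    """Label of the first edge on a shortest path from current to target.

    Returns None if the variable already has the target value or if the
    target value is unreachable.
    """
    if current == target:
        return None
    parent: Dict[str, Optional[Tuple[str, str]]] = {current: None}
    queue = deque([current])
    while queue:
        v = queue.popleft()
        if v == target:
            break
        for nxt, label in dtg.get(v, []):
            if nxt not in parent:
                parent[nxt] = (v, label)
                queue.append(nxt)
    if target not in parent:
        return None
    # Walk back from the target until the edge that leaves `current`.
    v = target
    while parent[v][0] != current:
        v = parent[v][0]
    return parent[v][1]


# Toy variable "location of a truck" with three values.
dtg: DTG = {
    "at-a": [("at-b", "move-a-b")],
    "at-b": [("at-c", "move-b-c"), ("at-a", "move-b-a")],
}
first = helpful_transition(dtg, "at-a", "at-c")
```

Because a transition replaces one value by another, the same edge accounts for what a propositional encoding would split into an add effect and a delete effect, which is the point made in the paragraph above.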

Conclusions and Future Work
In this work, we analyzed some well-known pruning techniques, which are currently utilized by state-of-the-art planners. In particular, we showed that the helpful actions pruning strategy is incomplete for ADL planning and extended it to a complete strategy called confrontation and goal relevant actions pruning. Though our proposed strategy is computationally hard, we discussed methods from the AI planning literature that can be seen as approximations of it. In addition, we believe that this work will help us gain more insight into why the planner Fast Downward is powerful.
This work was done on pruning techniques in search-based planning. Future directions may consider pruning techniques in SAT-based ADL planning and conformant planning. As IPP's method for handling conditional effects does not lead to a high increase in problem size, it is suitable for SAT-based planning. Therefore, developing adaptations or approximations of our proposed strategy CGRAP in those settings could be interesting.

Figure 1: An action with conditional effects (a) and its compiling result by IPP (b).

Definition 3 (Goal Relevant Actions, GRA). Given a planning task T = (P, A, I, G), the actions that are relevant to l ∈ P are DEP^⊲_A(l) = {a | a ∈ A, l ⊲_A a}, and the actions that are relevant to G ⊆ P are DEP^⊲_A(G) = ⋃_{l ∈ G} DEP^⊲_A(l). Given a state s, the "goal relevant actions" for s are REL^{G,⊲}_A(s) = DEP^⊲_A(G) ∩ App(s).