Next: Conclusion and Future Work Up: Macro-FF: Improving AI Planning Previous: Evaluating the Effects of

Related Work

The related work described in this section falls into two categories. We first review approaches that make use of the domain structure to reduce the complexity of planning, and next consider previous work on macro-operators.

An automatic method that discovers and exploits domain structure has been explored by Knoblock [16]. In this work, a hierarchy of abstractions is built starting from the initial low-level problem description. A new abstract level is obtained by dropping literals from the problem definition at the previous abstraction level. Planning first produces an abstract solution and then iteratively refines it to a low-level representation. The hierarchy is built in such a way that, if a refinement of an abstract solution exists, no backtracking across abstraction levels is necessary during the refinement process. Backtracking is performed only when an abstract plan has no refinement. Such situations can be arbitrarily frequent, with negative effects on the system's performance.

Bacchus and Yang [3] define a theoretical probabilistic framework to planning in hierarchical models. Abstract solutions of a problem at different abstraction levels are hierarchically represented as nodes in a tree structure. A tree edge indicates that the target node is a refinement of the start node. An abstract solution can be refined to the previous level with a given probability. Hierarchical search in this model is analytically evaluated. The analytical model is further used to enhace Knoblock's abstraction algorithm. The enhancement refers to using estimations of the refinement probabilities for abstract solutions.

More recently, implicit causal structure of a domain has been used to design a domain-independent heuristic for state evaluation [9]. These methods either statically infer information about the structure of a domain, or dynamically discover the structure for each problem instance. In contrast, we propose an adaptive technique that learns from previous experience in a domain.

Two successful approaches that use hand-crafted information about the domain structure are hierarchical task networks and temporal logic control rules. Hierarchical task networks (HTNs) guide and restrict planning by using a hierarchical representation of a domain. Human experts design hierarchies of tasks that show how the initial problem can be broken down to the level of regular actions. The idea was introduced by Sacerdoti [27] and Tate [28], and has widely been used in real-life planning applications [31]. SHOP2 [26] is a well-known heuristic search planner where search is guided by HTNs.

In planning with temporal logic control rules, a formula is associated with each state in the problem space. The formula of the initial state is provided with the domain description. The formula of any other state is obtained based on its successor's formula. When the formula associated with a state can be proven false, that state's subtree is pruned. The best known planners of this kind are TLPlan [2] and TALPlanner [18]. While efficient, these approaches also rely heavily on human knowledge, which might be expensive or impossible to obtain.

Early contributions to macro-operators in AI planning includes the work of Fikes [8]. Macros are extracted after a problem was solved and the solution became available. Minton [23] advances this work by introducing techniques that filter the set of learned macro-operators. In his approach, two types of macro-operators are preferred: S-macros, which occur with high frequency in problem solutions, and T-macros, which can be useful but have low priority in the original search algorithm. Iba [13] introduces the so-called peak-to-peak heuristic to generate macro-operators at run-time. A macro is a move sequence between two peaks of the heuristic state evaluation. Such a macro traverses a ``valley'' in the search space, and using this later can correct errors in the heuristic evaluation. Mooney [25] considers whole plans as macros and introduces partial ordering of operators based on their causal interactions.

Veloso and Carboness [29] and Kambhampati [15] explore how planning can reuse solutions of previously solved problems. Solutions annotated with additional relevant information are stored for later use. This additional information contains either explanations of successful or failed search decisions [29], or the causal structure of solution plans [15]. Several similarity metrics for planning problems are introduced. When a new problem is fed to the planner, the annotated solutions of similar problems are used to guide the current planning process.

McCluskey and Porteous [21] focus on constructing planning domains starting from a natural language description. The approach combines human expertise and automatic tools, and addresses both correctness and efficiency of the obtained formulation. Using macro-operators is a major technique that the authors propose for efficiency improvement. In this work, a state in a domain is composed of local states of several variables called dynamic objects. Macros model transitions between the local states of a variable.

The planner MARVIN [7] generates macros both online (as plateau-escaping sequences) and offline (from a reduced version of the problem to be solved). No macros are cached from one problem instance to another.

Macro-moves were successfully used in single-agent search problems such as puzzles or path-finding in commercial computer games, usually in a domain-specific implementation. The sliding-tile puzzle was among the first testbeds for this idea [17,13]. Two of the most effective concepts used in the Sokoban solver ROLLING STONE, tunnel and goal macros, are applications of this idea [14]. More recent work in Sokoban includes an approach that decomposes a maze into a set of rooms connected by tunnels [4]. Search is performed at the higher level of abstract move sequences that rearrange the stones inside a room so that a stone can be transferred from one room to another. Using macro-moves for solving Rubik's Cube puzzles is proposed by Hernádvölgyi [10]. A method proposed by Botea, Müller, and Schaeffer [5] automatically decomposes a navigation map into a set of clusters, possibly on several abstraction levels. For each cluster, an internal optimal path is pre-computed between any two entrances of that cluster. Path-finding is performed at an abstract level, where a macro-move crosses a cluster from one entrance to another in one step.

Methods that exploit at search time the relaxed graphplan associated with a problem state [12] include helpful action pruning [12] and look-ahead policies [30]. Helpful action pruning considers for node expansion only actions that occur in the relaxed plan and can be applied to the current state. Helpful macro pruning applies the same pruning idea for the macro-actions applicable to a state, with the noticeable difference that helpful macro pruning does not give up completeness of the search algorithm. A lookahead policy executes parts of the relaxed plan in the real world, as this often provides a path towards a goal state with no search and few states evaluated. The actions in the relaxed plan are iteratively applied as long as this is possible in a heuristically computed order. When the lookahead procedure cannot be continued with actions from the relaxed plan, a plan-repair method selects a new action to be applied.

Next: Conclusion and Future Work Up: Macro-FF: Improving AI Planning Previous: Evaluating the Effects of

Adi Botea 2005-08-01