
7.      Related Work

This section briefly presents other domain-independent heuristic state-space planning systems, emphasizing their similarities to and differences from Grt in terms of how they construct their heuristic and the direction in which they traverse the state space. We omit related work on specific pre-processing techniques implemented in Grt, such as the elimination of irrelevant objects, since it has already been presented in previous sections.

The recent development of domain-independent heuristic planning started with the work of Drew McDermott (1996, 1999) on Unpop (UN-Partial Order Planner, where UN- stands for non-). McDermott's planner is not restricted to pure Strips representations; it supports the more expressive language Adl (Pednault, 1989). The planner proceeds forward in the state space. Distance estimates between states are based on the so-called regression graph, which is built from the goals using actions that are not fully instantiated. Unpop does not consider subgoal interactions and reconstructs the regression graph from scratch for each intermediate state. Although this planner is not competitive with the subsequent heuristic planners, it was the fastest one at the time of its appearance. We note, however, that Unpop was developed in Lisp, whereas the other heuristic planners are highly optimized C or C++ programs.

Although Unpop was the first domain-independent heuristic planner, the area was pushed forward by the Asp (Action Selection Planner, Bonet, Loerincs & Geffner, 1997) and Hsp (Heuristic Search Planner, Bonet & Geffner, 1998) planners. The attractive feature of these planners is the simple way in which the heuristic is constructed, presented in Section 2.1. Asp used a best-first strategy with a limited agenda, while Hsp uses a hill-climbing strategy with limited plateau search and restarts (an in-depth presentation of state-space search algorithms is given by Zhang, 1999).

Both Asp and Hsp reconstruct their heuristic from scratch for each intermediate state. A variation, called Hspr (r stands for regression), constructs the heuristic only once (Bonet & Geffner, 1999). This approach resembles Grt, although Hspr constructs the heuristic forward and searches backward. Both approaches face the problem of incomplete goal states; however, it arises in different phases of the planning process. Grt faces this problem in the pre-processing phase and handles it by enhancing the goals, as described in Section 3. In Hspr, the problem arises in the search phase, in the form of invalid states in the regression state space. To cope with it, Hspr computes mutual exclusion relations and checks each state in the regression state space for any violation of these relations. The disadvantage of this approach is that it is considerably more time-consuming than the Grt approach, since Hspr has to check each visited state.
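The per-state check described above can be illustrated with a minimal Python sketch. It assumes that states are sets of ground facts and that the mutual exclusion relations are stored as a set of unordered fact pairs; the function name, the fact strings and the `MUTEXES` table are hypothetical and are not taken from Hspr.

```python
from itertools import combinations

def violates_mutex(state, mutexes):
    """Return True if the state contains a pair of facts marked as mutually exclusive.

    state   -- iterable of ground facts (hypothetical string encoding)
    mutexes -- set of frozensets, each holding two mutually exclusive facts
    """
    # Every pair of facts is examined, so the cost grows quadratically
    # with the number of facts in the state.
    return any(frozenset(pair) in mutexes for pair in combinations(state, 2))

# Hypothetical blocks-world relation: b cannot be clear while a is on it.
MUTEXES = {frozenset({"on(a,b)", "clear(b)"})}
INVALID_STATE = {"on(a,b)", "clear(b)", "clear(a)"}
VALID_STATE = {"on(a,b)", "clear(a)"}
```

Because such a test runs once for every state generated during regression, its cost is paid throughout the search, whereas enhancing the goals is a one-off pre-processing step.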

A variation of Hsp, named Hsp-2, changed the hill-climbing strategy to a best-first one, thus preserving completeness and producing better plans (Bonet & Geffner, 2001). Moreover, Hsp-2 uses a weighted A* algorithm (WA*) (Pearl, 1983) of the form f(S)=g(S)+W h(S), where S is an intermediate state, g(S) is the accumulated cost from the initial state, h(S) is the estimated cost to reach the goals and W is a parameter. For W=0 the search algorithm behaves as breadth-first search, for W=1 it behaves as the typical A*, and for W→∞ it behaves as best-first search. For the h(S) function, Hsp-2 supports several heuristic functions apart from the one presented in Section 2.1.
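The role of W can be illustrated with a short Python sketch of a search ordered by f(S)=g(S)+W h(S). This is a generic textbook formulation, not the Hsp-2 implementation; the function signature, the example graph `GRAPH` and the estimates `H` are assumptions made for illustration.

```python
import heapq
from itertools import count

def weighted_a_star(start, is_goal, successors, h, W=1.0):
    """Search ordered by f(S) = g(S) + W * h(S); returns a state path or None.

    successors(S) yields (next_state, step_cost) pairs and h(S) estimates
    the remaining cost. All names are illustrative assumptions.
    """
    tie = count()  # tie-breaker so that the heap never compares states
    frontier = [(W * h(start), next(tie), 0, start, [start])]
    best_g = {start: 0}
    while frontier:
        _, _, g, state, path = heapq.heappop(frontier)
        if is_goal(state):
            return path
        if g > best_g.get(state, float("inf")):
            continue  # stale entry: a cheaper route to this state was found
        for nxt, cost in successors(state):
            g2 = g + cost
            if g2 < best_g.get(nxt, float("inf")):
                best_g[nxt] = g2
                heapq.heappush(
                    frontier, (g2 + W * h(nxt), next(tie), g2, nxt, path + [nxt])
                )
    return None

# Hypothetical four-state problem with hand-picked estimates.
GRAPH = {"A": [("B", 1), ("C", 4)], "B": [("C", 1), ("D", 5)], "C": [("D", 1)], "D": []}
H = {"A": 2, "B": 2, "C": 1, "D": 0}

def solve(W):
    return weighted_a_star("A", lambda s: s == "D", lambda s: GRAPH[s], H.get, W)
```

With W=0 the queue is ordered by g(S) alone, and with large W the ordering is dominated by h(S), which matches the limiting cases described above.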

Recently, two new planners appeared, Ff and Altalt, which use a Graphplan-based approach to estimate distances between the intermediate states and the goals. Altalt (A Little of This, A Little of That) is a regression planner based on Hspr, and it faces the same problems with invalid states as Hspr (Nigenda, Nguyen & Kambhampati, 2000). Altalt creates a planning graph in a pre-processing phase and uses several techniques to extract heuristic estimates of the distances between the intermediate states and the initial state. For example, one of them returns the first level of the planning graph at which all the facts of the intermediate state appear with no mutual exclusion relations among them.

Ff (Fast Forward) is a forward heuristic planner (Hoffmann & Nebel, 2001). In order to estimate the distance between an intermediate state and the goals, Ff creates a planning graph from the state to the goals, using relaxed actions, i.e., actions whose delete effects are ignored. Since there are no delete effects, there are no mutual exclusion relations in the planning graph. From this graph, Ff extracts a relaxed plan, the length of which is the distance estimate. Note that, since there are no mutual exclusion relations, no backtracking occurs during the extraction of the relaxed plan, so the extraction is fast. The Ff heuristic resembles that of Grt in that both aim at obtaining underestimates, but they adopt different approaches. The relaxation that Ff performs is stronger, since it completely ignores the delete effects. Consequently, the Ff estimates are usually smaller than those of Grt and are in most cases underestimates, whereas Grt not infrequently produces overestimates.
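The relaxed-plan estimate can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the Ff implementation: actions are hypothetical (name, pre, add, delete) tuples, each fact remembers only the first action that achieved it, and the extraction simply follows these achievers backward from the goals.

```python
from collections import namedtuple

# Hypothetical Strips-like action; the delete list is kept only to show
# that the relaxation never reads it.
Action = namedtuple("Action", "name pre add delete")

def relaxed_plan_length(state, goals, actions):
    """Number of actions in a relaxed plan from state to goals, or None."""
    level = {f: 0 for f in state}  # first layer at which each fact appears
    achiever = {}                  # fact -> first action found that adds it
    depth = 0
    while not all(g in level for g in goals):
        depth += 1
        new = {}
        for a in actions:
            if all(p in level for p in a.pre):  # applicable in previous layer
                for f in a.add:
                    if f not in level and f not in new:
                        new[f] = a
        if not new:
            return None  # fixpoint reached and some goal is still missing
        for f, a in new.items():
            level[f] = depth
            achiever[f] = a
    # Backward extraction: no mutexes exist, so no backtracking is needed.
    selected = set()
    agenda = [g for g in goals if level[g] > 0]
    while agenda:
        a = achiever[agenda.pop()]
        if a.name not in selected:
            selected.add(a.name)
            agenda.extend(p for p in a.pre if level[p] > 0)
    return len(selected)

# Hypothetical two-step domain: move from A to B, then from B to C.
MOVES = [
    Action("move-A-B", {"at-A"}, {"at-B"}, {"at-A"}),
    Action("move-B-C", {"at-B"}, {"at-C"}, {"at-B"}),
]
```

Since facts are never removed from the graph, every action is needed at most once in the relaxed plan, which is why the sketch counts distinct action names.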

Ff adopts a variation of the hill-climbing strategy, called enforced hill-climbing, in which the planner always seeks to move to a state that is closer to the goals according to its heuristic. Ff achieves this by performing a bounded breadth-first search from the current state, with a maximum depth defined by the user, so the improving state does not have to be a direct successor of the current state. Once an improving state is found, the new actions are appended to the current plan and the hill-climbing search continues from the new state. If the bounded breadth-first search does not find an improving state, Ff restarts the search from the initial state, adopting a best-first search strategy.
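The strategy just described can be summarized in a minimal Python sketch. It is an illustration under stated assumptions rather than the Ff code: the signature, the `max_depth` parameter name and the toy problem are hypothetical, and the best-first fallback is left to the caller, which is signalled by returning None.

```python
from collections import deque

def enforced_hill_climbing(start, is_goal, h, successors, max_depth=None):
    """Return a list of actions leading to a goal state, or None on failure.

    successors(S) yields (action, next_state) pairs; h(S) is the heuristic.
    """
    state, plan, best = start, [], h(start)
    while not is_goal(state):
        queue, seen, improved = deque([(state, [])]), {state}, False
        while queue and not improved:
            s, path = queue.popleft()
            if max_depth is not None and len(path) >= max_depth:
                continue  # depth bound reached, do not expand further
            for action, nxt in successors(s):
                if nxt in seen:
                    continue
                seen.add(nxt)
                value = h(nxt)
                if value < best:
                    # Strictly better state: commit the actions that lead to it.
                    state, best = nxt, value
                    plan += path + [action]
                    improved = True
                    break
                queue.append((nxt, path + [action]))
        if not improved:
            return None  # the caller would fall back to best-first search
    return plan

# Hypothetical problem on the integers 0..4 with a plateau in the heuristic.
H_TABLE = {0: 2, 1: 2, 2: 1, 3: 1, 4: 0}

def line_successors(n):
    if n < 4:
        yield "inc", n + 1
    if n > 0:
        yield "dec", n - 1

def climb(max_depth=None):
    return enforced_hill_climbing(
        0, lambda n: n == 4, H_TABLE.get, line_successors, max_depth
    )
```

In the toy problem the direct successor of the start state is no better than the start state itself, so the breadth-first search has to look two steps ahead; with a depth bound of 1 it fails and returns None.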

Ff exhibited distinguished performance at the Aips-00 planning competition. One of the features of Ff that contributes to its good performance is that it does not compute all the applicable actions for each intermediate state. Instead, Ff gives priority to the first-level actions of the relaxed plan. Once an action that produces a better state is found, it is applied and the next state is processed. Moreover, most of the time no new relaxed plan has to be constructed, since it suffices to remove the last applied action from the beginning of the previous relaxed plan. In this way, Ff drastically reduces the cost of processing each intermediate state, at the cost, however, of losing completeness.

The bottleneck that occurs while determining the applicable actions for each intermediate state has also been identified by Vrakas et al. (1999, 2000). In that work, the process of finding and applying the applicable actions was parallelized, resulting in almost linear speedup. Parallelizing the process of finding the applicable actions, instead of ignoring most of them as Ff does, has the advantage of preserving completeness; the cost, however, is that a parallel machine is required.

We close the discussion of other heuristic state-space planners with the Stan planning system (Fox & Long, 1998; Long & Fox, 1999). Stan is not a heuristic state-space planner, at least in its basic architecture, but a graph-based planner, which uses several pre-processing techniques to extract useful domain information that is exploited for more efficient graph construction and solution extraction. However, in the Aips-00 competition a hybrid architecture was used (Long & Fox, 2000; Fox & Long, 2001), in which a heuristic state-space planning module was employed to solve specific identified sub-problems. In this way Stan improved its performance, especially in transportation domains.

Concerning problem decomposition, work has been done on goal ordering (Cheng & Irani, 1989; Drummond & Currie, 1989). Recently a similar approach has been proposed by Koehler (1998) and has been extended by Koehler and Hoffmann (2000). This approach automatically derives an ordering relation between the goal facts, which can be used by any planner to search for increasing sets of subgoals. The advantage of this approach is that no extra information is needed, except for the usual domain definition, while the disadvantage, with respect to the XOR-constraints approach, is that only the goal facts are taken into account in the intermediate states that are constructed. This approach has been adopted by the Ff planning system.

 


Ioannis Refanidis

14-8-2001