This section briefly presents other
domain-independent heuristic state-space planning systems, emphasizing their
similarities to and differences from Grt,
in terms of the way in which they construct their heuristic and the direction
in which they traverse the state-space. We omit certain pieces of related work that
concern specific pre-processing techniques implemented in Grt, such as the elimination of
irrelevant objects, since they have already been presented in previous
sections.
The recent development
of domain-independent heuristic planning started with the work of Drew
McDermott (1996, 1999) on Unpop
(UN-Partial Order Planner, UN- stands for non-). McDermott's planner is not
restricted to pure Strips
representations, supporting the more expressive language Adl (Pednault, 1989). The planner
proceeds forward in the state-space. Distance estimates between states are
based on the so-called regression graph, which is built from the goals using
actions that are not fully instantiated. Unpop
does not consider subgoal interactions and reconstructs the regression graph
from scratch for each intermediate state. Although this planner is not
competitive with the subsequent heuristic planners, it was the
fastest one at the time of its appearance. We have to note, however, that Unpop was developed in LISP,
whereas the other heuristic planners are highly optimized C or C++ programs.
Although Unpop was the first domain-independent
heuristic planner, the area has been pushed forward by the Asp (Action Selection Planner,
Bonet, Loerincs & Geffner, 1997) and Hsp
(Heuristic Search Planner, Bonet & Geffner, 1998) planners. The
attractive feature of these planners is the simple way the heuristic is
constructed, presented in Section 2.1. Asp
used a best-first strategy with limited agenda, while Hsp uses a hill-climbing one with limited plateau search and
restarts (an in-depth presentation of the state-space search algorithms is
given by Zhang, 1999).
Both Asp and Hsp reconstruct their heuristic from scratch for each
intermediate state. A variation, called Hspr
(r stands for regression), constructs the heuristic only once
(Bonet & Geffner, 1999). This approach resembles Grt, although Hspr
constructs the heuristic forward and searches backwards. Both approaches face
the problem of incomplete goal states; however, it arises in different phases of
the planning process. Grt faces
this problem in the pre-processing phase, by enhancing the goals, as
described in Section 3. In Hspr,
the problem arises in the search phase, in the form of invalid states in the
regression state space. To cope with the problem, Hspr computes mutual exclusion relations and checks each state
in the regression state space for any possible violation of these relations.
The disadvantage of this approach is that it is considerably more
time-consuming than the Grt approach,
since Hspr has to check each
visited state.
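The per-state cost of such a check can be illustrated with a minimal sketch. It assumes that states are sets of ground facts and that the mutual exclusion relations are available as a set of unordered fact pairs; the function names and the blocks-world facts are hypothetical and are not taken from Hspr.

```python
from itertools import combinations


def violates_mutex(state, mutex_pairs):
    """Return True if two facts of the state are marked as mutually exclusive."""
    # Every pair of facts is examined, so the check is quadratic in the state size.
    return any(frozenset(pair) in mutex_pairs for pair in combinations(state, 2))


def prune_invalid(states, mutex_pairs):
    """Keep only the regression states that violate no mutual exclusion relation."""
    return [s for s in states if not violates_mutex(s, mutex_pairs)]


# Hypothetical blocks-world mutual exclusion relations.
mutex_pairs = {
    frozenset({"on(a,b)", "on(b,a)"}),
    frozenset({"holding(a)", "handempty"}),
}
valid_state = frozenset({"on(a,b)", "handempty"})
invalid_state = frozenset({"on(a,b)", "on(b,a)"})
```

Since the check has to be repeated for every state generated during the regression, its cost accumulates over the whole search, whereas a pre-processing step is paid only once.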
A variation of Hsp, named Hsp-2, changed the hill-climbing strategy to a best-first
one, thus preserving completeness and producing better plans (Bonet &
Geffner, 2001). Moreover, Hsp-2
uses a weighted A* algorithm (WA*) (Pearl, 1983) of the form f(S)=g(S)+W h(S), where S is an intermediate state, g(S)
is the accumulated cost from the initial state, h(S) is the
estimated cost to reach the Goals and W is a parameter. For W=0,
the search algorithm behaves as a breadth-first one, for W=1 it behaves
as the typical A* and for W→∞ it behaves as best-first. For the h(S)
function, Hsp-2 supports several
heuristic functions, apart from the one presented in Section 2.1.
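The role of the parameter W can be illustrated with a short sketch of a weighted A* search. This is a generic textbook formulation under stated assumptions, not the Hsp-2 implementation: the function names, the `successors` interface and the toy one-dimensional domain are all hypothetical.

```python
import heapq
from itertools import count


def weighted_a_star(start, is_goal, successors, h, w=1.0):
    """Search ordered by f(S) = g(S) + w * h(S).

    successors(S) yields (next_state, step_cost) pairs; states must be hashable.
    Returns (path, cost), or None if no goal state is reachable.
    """
    tie = count()  # tie-breaker so that states themselves are never compared
    frontier = [(w * h(start), next(tie), 0, start, [start])]
    best_g = {start: 0}
    while frontier:
        _, _, g, state, path = heapq.heappop(frontier)
        if g > best_g.get(state, float("inf")):
            continue  # stale queue entry, a cheaper path was found later
        if is_goal(state):
            return path, g
        for nxt, cost in successors(state):
            new_g = g + cost
            if new_g < best_g.get(nxt, float("inf")):
                best_g[nxt] = new_g
                entry = (new_g + w * h(nxt), next(tie), new_g, nxt, path + [nxt])
                heapq.heappush(frontier, entry)
    return None


# Toy domain (an assumption for illustration): positions 0..10, goal at 5.
def line_successors(position):
    for nxt in (position + 1, position - 1):
        if 0 <= nxt <= 10:
            yield nxt, 1


def distance_to_goal(position):
    return abs(5 - position)


path, cost = weighted_a_star(0, lambda s: s == 5, line_successors, distance_to_goal, w=5)
```

With w=0 the queue is ordered by g(S) alone, which with unit action costs amounts to breadth-first search; with w=1 the ordering is that of A*; large values of w let h(S) dominate, approaching best-first behavior.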
Recently, two new
planners, Ff and Altalt, appeared, which use a Graphplan-based approach to estimate
distances between the intermediate states and the goals. Altalt (A Little of This, A
Little of That) is a regression planner based on Hspr, which faces the same problems with invalid states as Hspr (Nigenda, Nguyen & Kambhampati,
2000). Altalt creates a planning
graph in a pre-processing phase and uses several techniques to extract
heuristic estimates of the distances between the intermediate states and the
initial state. For example, one of them returns the first level of the planning
graph at which all the facts of the intermediate state appear without any mutual
exclusion relations among them.
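A level-based estimate of this kind can be sketched as follows. The sketch assumes that the planning graph has already been built and is given as a list of levels, each holding the facts present at that level and the mutual exclusion pairs among them; this representation and the function name are assumptions for illustration, not Altalt's data structures.

```python
from itertools import combinations


def set_level(state, levels):
    """Index of the first level that contains all facts of `state`, none of them
    pairwise mutex; None if no such level exists in the given graph."""
    for index, (facts, mutexes) in enumerate(levels):
        all_present = state <= facts
        if all_present and not any(
            frozenset(pair) in mutexes for pair in combinations(state, 2)
        ):
            return index
    return None


# Hypothetical three-level graph: q and r both appear at level 1,
# but they stop being mutually exclusive only at level 2.
levels = [
    (frozenset({"p"}), set()),
    (frozenset({"p", "q", "r"}), {frozenset({"q", "r"})}),
    (frozenset({"p", "q", "r"}), set()),
]
```

Because the graph is grown from the initial state, the returned index serves as an estimate of the distance between the initial state and the regression state being evaluated.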
Ff (Fast
Forward) is a forward heuristic planner (Hoffmann & Nebel, 2001). In
order to estimate the distance between an intermediate state and the goals, Ff creates a planning graph from the
state to the goals, using relaxed actions. Since there are no delete effects,
there are no mutual exclusion relations in the planning graph. From this graph,
Ff extracts a relaxed plan,
the length of which is the distance estimate. Note that, since there are no
mutual exclusion relations, no backtracking occurs during the extraction of the
relaxed plan, so the extraction is fast. The Ff heuristic resembles the Grt one, in that both aim at obtaining
under-estimates, but they adopt different approaches. The relaxations that Ff performs are stronger, since it
completely ignores the delete effects. So the Ff
estimates are usually smaller than those of Grt
and are underestimates most of the time, whereas Grt not infrequently produces overestimates.
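The idea of estimating distances through a relaxed plan can be sketched as follows. This is a simplified illustration under stated assumptions, not the Ff implementation: actions are ground Strips actions, each fact is supported by the first action found to add it, and the `Action` type, the function name and the toy transportation problem are hypothetical.

```python
from typing import FrozenSet, NamedTuple


class Action(NamedTuple):
    name: str
    pre: FrozenSet[str]
    add: FrozenSet[str]
    delete: FrozenSet[str] = frozenset()  # never read: the relaxation ignores it


def relaxed_plan(state, goals, actions):
    """Return the action names of a relaxed plan, or None if the goals are
    unreachable even when delete effects are ignored."""
    fact_level = {fact: 0 for fact in state}
    action_level, achiever, level = {}, {}, 0
    # Forward phase: grow fact levels until all goals have appeared.
    while not all(goal in fact_level for goal in goals):
        level += 1
        new_facts = {}
        for action in actions:
            if action.name in action_level:
                continue
            if all(p in fact_level for p in action.pre):
                action_level[action.name] = level
                for fact in action.add:
                    if fact not in fact_level:
                        new_facts.setdefault(fact, action)
        if not new_facts:
            return None  # fixpoint reached without the goals
        for fact, action in new_facts.items():
            fact_level[fact] = level
            achiever[fact] = action
    # Backward phase: every fact has a recorded achiever, so no backtracking.
    selected, pending = {}, list(goals)
    while pending:
        fact = pending.pop()
        if fact_level[fact] == 0:
            continue  # already true in the evaluated state
        action = achiever[fact]
        if action.name not in selected:
            selected[action.name] = action
            pending.extend(action.pre)
    ordered = sorted(selected.values(), key=lambda a: (action_level[a.name], a.name))
    return [a.name for a in ordered]


# Hypothetical transportation problem: move a package from A to B.
state = frozenset({"at(truck,A)", "at(pkg,A)"})
actions = [
    Action("load(pkg,A)", frozenset({"at(truck,A)", "at(pkg,A)"}),
           frozenset({"in(pkg)"}), frozenset({"at(pkg,A)"})),
    Action("drive(A,B)", frozenset({"at(truck,A)"}),
           frozenset({"at(truck,B)"}), frozenset({"at(truck,A)"})),
    Action("unload(pkg,B)", frozenset({"in(pkg)", "at(truck,B)"}),
           frozenset({"at(pkg,B)"}), frozenset({"in(pkg)"})),
]
plan = relaxed_plan(state, frozenset({"at(pkg,B)"}), actions)
estimate = len(plan)
```

In this sketch the length of the returned list plays the role of the distance estimate. The backward phase never has to revise a choice, which reflects the absence of mutual exclusion relations in the relaxed graph.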
Ff adopts a
variation of the hill-climbing strategy, called enforced hill climbing,
according to which the planner always seeks to move to a state closer to the
goals, according to its heuristic. Ff
achieves that by performing a bounded breadth-first search from the current
state, with a maximum depth defined by the user; so the improving state does
not have to be a direct successor of the current state. Once an improving
state is found, the new actions are added to the end of the current plan and
the hill-climbing search continues from the new state. In the case where the
bounded breadth-first search does not find an improving state, Ff restarts the search from the initial
state, adopting a best-first search strategy.
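The strategy described above can be sketched as follows. This is a minimal illustration, not the Ff code: the `successors` interface, the function names and the toy domain with a heuristic plateau are assumptions, and the fallback to best-first search is left to the caller, signalled by a return value of None.

```python
from collections import deque


def improving_path(state, best, successors, h, max_depth=None):
    """Breadth-first search for a state whose heuristic value is below `best`.

    Returns (actions, new_state, new_value), or None if none is found
    within max_depth steps (unbounded when max_depth is None).
    """
    queue = deque([(state, [])])
    visited = {state}
    while queue:
        current, actions = queue.popleft()
        if max_depth is not None and len(actions) >= max_depth:
            continue  # do not expand beyond the depth bound
        for action, nxt in successors(current):
            if nxt in visited:
                continue
            visited.add(nxt)
            path = actions + [action]
            value = h(nxt)
            if value < best:
                return path, nxt, value
            queue.append((nxt, path))
    return None


def enforced_hill_climbing(start, is_goal, successors, h, max_depth=None):
    """Return a list of actions, or None when no improving state is found."""
    plan, state, best = [], start, h(start)
    while not is_goal(state):
        found = improving_path(state, best, successors, h, max_depth)
        if found is None:
            return None  # the caller would restart with a complete strategy
        actions, state, best = found
        plan.extend(actions)  # append the new actions to the current plan
    return plan


# Toy domain (an assumption): positions 0..6, goal 6, with plateaus in h.
H_VALUES = {0: 3, 1: 3, 2: 3, 3: 2, 4: 1, 5: 1, 6: 0}


def line_successors(position):
    for action, nxt in (("inc", position + 1), ("dec", position - 1)):
        if nxt in H_VALUES:
            yield action, nxt


plan = enforced_hill_climbing(0, lambda s: s == 6, line_successors, H_VALUES.get)
```

On this toy domain the first improving state lies three steps away from the start, so a depth bound of 2 makes the search give up, while the unbounded search crosses the plateau.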
Ff exhibited
distinguished performance at the Aips-00
planning competition. One of the features of Ff
resulting in its good performance is that it does not compute the applicable
actions for each intermediate state. Actually, Ff gives priority to the first level actions of the relaxed
plan. Once an action that produces a better state is found, it is applied
and the next state is processed. Moreover, most of the time, no new relaxed
plan has to be constructed, since it suffices to remove the most recently applied
action from the beginning of the previous relaxed plan. So, Ff succeeds in drastically reducing the
cost of processing each intermediate state, at the cost, however, of losing
completeness.
The bottleneck that
occurs while determining the applicable actions for each intermediate state has
also been identified by Vrakas et al. (1999, 2000). In this work, the process
of finding and applying the applicable actions has been parallelized, resulting
in almost linear speedup. Parallelizing the process of finding the applicable
actions, instead of ignoring most of them, as Ff
does, presents the advantage of preserving completeness; however, the cost is
that a parallel machine is required.
We
conclude the review of other heuristic state-space planners with the Stan planning system (Fox & Long,
1998; Long & Fox, 1999). Stan
is not a heuristic state-space planner, at least in its basic architecture, but
a graph-based planner, which uses several pre-processing techniques for
extracting useful domain information that is exploited for more efficient graph
construction and solution extraction. However, in the Aips-00 competition a hybrid architecture was used (Long
& Fox, 2000; Fox & Long, 2001), where a heuristic state-space planning
module was employed to solve specific identified sub-problems. Thus Stan succeeded in improving its
performance, especially in transportation domains.
Concerning problem
decomposition, work has been done on goal ordering (Cheng & Irani, 1989;
Drummond & Currie, 1989). Recently a similar approach has been proposed by
Koehler (1998) and has been extended by Koehler and Hoffmann (2000). This approach
automatically derives an ordering relation between the goal facts, which can be
used by any planner to search for increasing sets of subgoals. The advantage of
this approach is that no extra information is needed, except for the usual
domain definition, while the disadvantage, with respect to the XOR-constraints
approach, is that only the goal facts are taken into account in the
intermediate states that are constructed. This approach has been adopted by the
Ff planning system.
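Searching for increasing sets of subgoals, given an ordering of the goal facts, can be sketched as follows. The sketch assumes that the ordering has already been derived and that some planner is available through a callback `solve(state, goals)` returning the actions found and the resulting state; this interface, the function names and the stub planner are hypothetical.

```python
def plan_with_goal_agenda(state, ordered_goals, solve):
    """Plan for increasing goal sets {g1}, {g1, g2}, ... in the given order.

    solve(state, goals) is an assumed planner callback that returns
    (actions, resulting_state), or None if it fails.
    """
    plan, target = [], set()
    for goal in ordered_goals:
        target.add(goal)  # earlier goals stay in the target set
        result = solve(state, frozenset(target))
        if result is None:
            return None
        actions, state = result  # the next sub-problem starts from the new state
        plan.extend(actions)
    return plan


def toy_solve(state, goals):
    """Stub planner for illustration: pretends each missing goal takes one action."""
    missing = sorted(goals - state)
    return [f"achieve({goal})" for goal in missing], state | goals


plan = plan_with_goal_agenda(frozenset(), ["on(b,c)", "on(a,b)"], toy_solve)
```

Only the goal facts constrain each sub-problem in this scheme, which is the limitation noted above with respect to the XOR-constraints approach.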
Ioannis
Refanidis
14-8-2001