In this section we illustrate how Grt exploits XOR-constraints in the pre-processing phase in order to avoid locally optimal states. Specifically, Grt uses them to establish new ordered subgoals that have to be achieved before the original goals. These subgoals are grouped into ordered intermediate states, so the original difficult problem is decomposed into a sequence of easier subproblems that have to be solved sequentially.
We present the steps of the problem decomposition process through the example of Figure 3, a 4x4 grid problem with two keys (K1 and K2) and two robots (R1 and R2).
Initial State                   Goal State
3 |    |    |    | K2 |         3 | R2 | K2 |    |    |
2 |    |    | R2 |    |         2 |    |    |    |    |
1 |    |    |    |    |         1 |    | K1 |    |    |
0 |    | R1 |    | K1 |         0 | R1 |    |    |    |
  | 0  | 1  | 2  | 3  |           | 0  | 1  | 2  | 3  |
Figure 3: A 4x4 grid problem.
For this domain the following XOR-constraints can be defined:
( ( xor ( at ?Robot * ) ) ( robot ?Robot ) )
( ( xor ( at ?Key * ) ( holding ?Key ) ) ( key ?Key ) )
The above definitions have four ground instantiations, one for each robot and one for each key. Henceforth the notation XOROBJ will refer to the ground XOR-constraint concerning object OBJ (for example, XORR1 for robot R1).
The first piece of information that can be extracted is the set of pairs of facts, one from the initial state and one from the goals, that belong to the same ground XOR-constraint. For the problem of Figure 3 the following pairs can be identified:
XORR1: (at R1 n1_0) - (at R1 n0_0)
XORR2: (at R2 n2_2) - (at R2 n0_3)
XORK1: (at K1 n3_0) - (at K1 n1_1)
XORK2: (at K2 n3_3) - (at K2 n1_3)
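The pairing step can be sketched in a few lines of Python. This is only an illustration under assumed names: the tuple encoding of facts and the helpers `xor_member` and `xor_pairs` are hypothetical and are not part of Grt.

```python
# Facts are encoded as tuples, e.g. ("at", "R1", "n1_0").
# This encoding is an assumption made for the sketch, not Grt's own.
INITIAL = {("at", "R1", "n1_0"), ("at", "R2", "n2_2"),
           ("at", "K1", "n3_0"), ("at", "K2", "n3_3")}
GOAL = {("at", "R1", "n0_0"), ("at", "R2", "n0_3"),
        ("at", "K1", "n1_1"), ("at", "K2", "n1_3")}

# Predicates covered by the XOR-constraint of each object type.
XOR_PREDICATES = {"robot": {"at"}, "key": {"at", "holding"}}
OBJECT_TYPES = {"R1": "robot", "R2": "robot", "K1": "key", "K2": "key"}


def xor_member(fact, obj):
    """Return True if `fact` belongs to the ground XOR-constraint of `obj`."""
    predicate, subject = fact[0], fact[1]
    return subject == obj and predicate in XOR_PREDICATES[OBJECT_TYPES[obj]]


def xor_pairs(initial, goal):
    """Map each object to its (initial fact, goal fact) pair, when both exist."""
    pairs = {}
    for obj in OBJECT_TYPES:
        init = [f for f in initial if xor_member(f, obj)]
        fin = [f for f in goal if xor_member(f, obj)]
        if len(init) == 1 and len(fin) == 1:
            pairs[obj] = (init[0], fin[0])
    return pairs


pairs = xor_pairs(INITIAL, GOAL)
for obj, (start, end) in sorted(pairs.items()):
    print(obj, start, "-", end)
```

Objects whose XOR-constraint has no fact in the goals are simply left out of the result, since no pair can be formed for them.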
The original Grt planner did not store information about the inverted actions that achieved the various facts in the heuristic construction phase. However, in order to exploit the XOR-constraints, this information has to be stored. By storing these actions, the table structure used by the Grt heuristic is transformed into a directed acyclic graph. We call this structure the Greedy Regression Graph, or simply GRG.
The nodes of this graph are labeled with the facts of the problem. Each node also stores the estimated distance between its fact and the goals, the corresponding related facts, and the name of the inverted action that achieved its fact. The arcs that point to a node originate from the nodes of the preconditions of the inverted action that achieved the node's fact. Figure 4 shows part of the GRG structure for the 4x4 grid problem (the related facts are omitted).
Based on the GRG, for every ground XOR-constraint we can derive a sequence of actions that transforms the initial state fact into the corresponding goal state fact. We are interested only in the actions that change the XOR-constraint's facts, not in actions that provide auxiliary preconditions. For the problem of Figure 3, the action sequences are shown in Table 4:
XOR   | Initial State | Goal State   | Sequence of actions
XORR1 | (at R1 n1_0)  | (at R1 n0_0) | (move R1 n1_0 n0_0)
XORR2 | (at R2 n2_2)  | (at R2 n0_3) | (move R2 n2_2 n2_3) (move R2 n2_3 n1_3)
XORK1 | (at K1 n3_0)  | (at K1 n1_1) | (get R1 K1 n3_0) (leave R1 K1 n1_1)
XORK2 | (at K2 n3_3)  | (at K2 n1_3) | (get R2 K2 n3_3) (leave R2 K2 n1_3)
Table 4: Sequences of actions that transform the initial state facts to the corresponding goal facts.
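As a rough illustration of how such a sequence could be read off the GRG, the sketch below builds the K1 part of the graph by hand and walks it from the initial state fact to the goal fact. The class `GRGNode`, the helpers `in_key_xor` and `xor_sequence`, and the distance values are hypothetical placeholders chosen for this example; they are not taken from the Grt implementation.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

Fact = Tuple[str, ...]


@dataclass
class GRGNode:
    fact: Fact
    distance: float                   # estimated distance to the goals (placeholder values below)
    achiever: Optional[str] = None    # inverted action that achieved the fact
    parents: List["GRGNode"] = field(default_factory=list)  # precondition nodes (arc sources)


def in_key_xor(fact: Fact, key: str) -> bool:
    """Membership test for the ground XOR-constraint of a key."""
    return fact[0] in ("at", "holding") and fact[1] == key


def xor_sequence(start: GRGNode, key: str) -> List[str]:
    """Walk from the initial state fact towards the goal fact, keeping only
    the actions that change a fact of the key's XOR-constraint."""
    actions, node = [], start
    while node.achiever is not None:
        actions.append(node.achiever)
        # Continue with the precondition that lies in the same XOR-constraint;
        # foreign preconditions such as (at R1 n3_0) are skipped.
        node = next(p for p in node.parents if in_key_xor(p.fact, key))
    return actions


# Hand-built fragment of the GRG for key K1.
goal_k1 = GRGNode(("at", "K1", "n1_1"), 0)
r1_at_11 = GRGNode(("at", "R1", "n1_1"), 1)
holding = GRGNode(("holding", "K1"), 1, "leave R1 K1 n1_1", [goal_k1, r1_at_11])
r1_at_30 = GRGNode(("at", "R1", "n3_0"), 3)
init_k1 = GRGNode(("at", "K1", "n3_0"), 2, "get R1 K1 n3_0", [holding, r1_at_30])

print(xor_sequence(init_k1, "K1"))
```

The walk assumes that every achieved fact has exactly one precondition in its own XOR-constraint, which holds for this fragment but is not guaranteed in general.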
Checking the preconditions of the above actions, we can find facts that are members of foreign XOR-constraints, that is, of XOR-constraints other than the one the sequence belongs to. These facts are subgoals that have to be temporarily established in the forward search phase, before the original goals are achieved. In Table 4, the actions (get R1 K1 n3_0) and (leave R1 K1 n1_1) of the XORK1 sequence have (at R1 n3_0) and (at R1 n1_1) as preconditions respectively, which are members of the XORR1 relation. Similarly, the actions (get R2 K2 n3_3) and (leave R2 K2 n1_3) of the XORK2 sequence have (at R2 n3_3) and (at R2 n1_3) as preconditions respectively, which are members of the XORR2 relation.
There are two types of subgoals. These are the XOR-constrained facts that are either:
(I) preconditions of a ground action in a foreign XOR sequence, or
(II) add-effects of an action, in their own XOR sequence, which has a foreign precondition.
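The two subgoal types can be collected with a single pass over the action sequences. The following is a minimal sketch under stated assumptions: the tables `ACTIONS` and `SEQUENCES`, and the helpers `owner` and `find_subgoals`, are hypothetical, and the preconditions and add-effects listed are the ones implied by the grid example rather than an actual domain file.

```python
# Hypothetical ground-action table: name -> (preconditions, add-effects).
ACTIONS = {
    "move R1 n1_0 n0_0": ({("at", "R1", "n1_0")}, {("at", "R1", "n0_0")}),
    "get R1 K1 n3_0": ({("at", "K1", "n3_0"), ("at", "R1", "n3_0")},
                       {("holding", "K1")}),
    "leave R1 K1 n1_1": ({("holding", "K1"), ("at", "R1", "n1_1")},
                         {("at", "K1", "n1_1")}),
}
# Action sequence of each ground XOR-constraint, keyed by its object.
SEQUENCES = {
    "R1": ["move R1 n1_0 n0_0"],
    "K1": ["get R1 K1 n3_0", "leave R1 K1 n1_1"],
}


def owner(fact):
    """Object whose ground XOR-constraint contains the fact (assumed to be
    the first argument of the fact in this encoding)."""
    return fact[1]


def find_subgoals(sequences, actions):
    """Return (type I, type II) subgoals, each paired with its action."""
    type_i, type_ii = [], []
    for obj, sequence in sequences.items():
        for name in sequence:
            preconditions, add_effects = actions[name]
            foreign = sorted(f for f in preconditions if owner(f) != obj)
            if foreign:
                # Type (I): foreign preconditions of this action.
                type_i.extend((f, name) for f in foreign)
                # Type (II): own add-effects of an action with a foreign precondition.
                type_ii.extend((f, name) for f in sorted(add_effects) if owner(f) == obj)
    return type_i, type_ii


type_i, type_ii = find_subgoals(SEQUENCES, ACTIONS)
print("type I: ", type_i)
print("type II:", type_ii)
```

Keeping the action name next to each subgoal records which type (I) and type (II) subgoals come from the same action, which is the information the ordering rules rely on.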
From the identified subgoals we can construct a graph, connecting the new subgoals with arcs that denote ordering constraints, using the following rules:
1. All the subgoals are ordered after their initial state fact and before their goal fact (if any).
2. Subgoals of type (II) that are members of the same XOR-constraint are ordered according to the ordering of their actions.
3. Subgoals of type (I) are ordered together with the corresponding subgoals of type (II) that result from the same action.
4. For a specific XOR-constraint, subgoals of type (I) are ordered before the subgoals of type (II).
We call the resulting graph the ordering graph of the problem, since it denotes the order in which the subgoals have to be achieved. Figure 5 shows the ordering graph for the problem of Figure 3. Lines with arcs denote ordering constraints. Double lines without arcs denote that the two facts are ordered together.
Proposition 8. The ordering graph is an acyclic graph.
Proof sketch: The proof can be based on the order in which facts are achieved in the Pre-Processing Algorithm (Section 2.4). Facts are achieved in a specific time order (when a fact has been re-achieved with smaller cost, we consider the last time it was achieved). We define the ordering relation < between facts, denoting that one fact was achieved before another in the Pre-Processing Algorithm. Similarly we define the ≤ relation.
Ordering relations between the subgoals originate in two ways. First, subgoals of type (II) of the same XOR-constraint are ordered explicitly with respect to each other, according to the time they were achieved (in Figure 5 these ordering relations are denoted with non-dashed lines with arcs). Second, each subgoal of type (I) is ordered before, or at least at the same time as, the predecessor of its corresponding type (II) subgoal (in Figure 5 these ordering relations are denoted with dashed lines with arcs). Using the above equivalences, we can transform the ordering graph into an equivalent time-ordering graph. Since a time-ordering relation cannot include cycles, neither can the ordering graph. ∎
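The acyclicity claim can be checked mechanically on a small instance. The sketch below encodes the R1 and K1 part of the example by hand, merges facts that are ordered together into one node, and asks Python's standard `graphlib` module (Python 3.9 or later) for a topological order. The helper `collapse` and the constant names are hypothetical, and the arcs are our reading of rules 1 to 3, not output produced by Grt.

```python
from graphlib import TopologicalSorter


def collapse(before, together):
    """Merge facts that are ordered together into one group and return
    {group: set of predecessor groups} for the ordering graph."""
    group = {}
    for a, b in together:
        merged = group.get(a, frozenset([a])) | group.get(b, frozenset([b]))
        for fact in merged:
            group[fact] = merged
    graph = {}
    for earlier, later in before:
        g_earlier = group.get(earlier, frozenset([earlier]))
        g_later = group.get(later, frozenset([later]))
        graph.setdefault(g_earlier, set())
        if g_earlier != g_later:
            graph.setdefault(g_later, set()).add(g_earlier)
    return graph


R1_INIT, R1_GOAL = ("at", "R1", "n1_0"), ("at", "R1", "n0_0")
R1_A, R1_B = ("at", "R1", "n3_0"), ("at", "R1", "n1_1")        # type (I) subgoals
K1_INIT = ("at", "K1", "n3_0")
K1_HELD, K1_GOAL = ("holding", "K1"), ("at", "K1", "n1_1")      # type (II) subgoals

before = [
    (R1_INIT, R1_A), (R1_INIT, R1_B), (R1_A, R1_GOAL), (R1_B, R1_GOAL),  # rule 1
    (K1_INIT, K1_HELD), (K1_INIT, K1_GOAL),                                # rule 1
    (K1_HELD, K1_GOAL),                                                    # rule 2
]
together = [(R1_A, K1_HELD), (R1_B, K1_GOAL)]                              # rule 3

graph = collapse(before, together)
# static_order() raises graphlib.CycleError if the graph contains a cycle.
order = list(TopologicalSorter(graph).static_order())
for step in order:
    print(sorted(step))
```

A successful call to `static_order()` only shows that this particular instance is acyclic; the general statement still rests on the time-ordering argument of the proof sketch.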
The ordering graph makes it possible to construct intermediate, possibly incomplete, states, which have to be achieved sequentially. Starting from the initial state, Grt attempts to insert one subgoal from each XOR-constraint in each intermediate state. This subgoal must have the following properties:
- it has not been inserted in a previous intermediate state,
- it is not ordered after some other fact of the same XOR-constraint that has not yet been inserted in a previous intermediate state, and finally
- it is not ordered together with a fact of another XOR-constraint that cannot be inserted in the current intermediate state.
If more than one fact with the above properties exists for a single XOR-constraint, one of them is selected arbitrarily. Finally, if no fact with the above properties exists for an XOR-constraint, the intermediate state is left incomplete.
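The selection procedure can be sketched as a simple layering loop. The version below is a simplification under stated assumptions: the subgoals of each XOR-constraint are given as an already totally ordered list (so the arbitrary choice among several candidates never arises), and the function `intermediate_states` and its arguments are hypothetical names, not Grt's.

```python
def intermediate_states(chains, together):
    """Build intermediate states from per-constraint subgoal chains.

    chains:   {object: [subgoal facts in the order they must be achieved]}
    together: pairs of facts that must enter the same intermediate state
    """
    partners = {}
    for a, b in together:
        partners.setdefault(a, set()).add(b)
        partners.setdefault(b, set()).add(a)

    position = {obj: 0 for obj in chains}
    states = []
    while any(position[o] < len(chains[o]) for o in chains):
        # Properties 1 and 2: the next fact of each chain that is not yet inserted.
        candidates = {o: chains[o][position[o]]
                      for o in chains if position[o] < len(chains[o])}
        # Property 3: drop a candidate whose partner cannot enter this state.
        changed = True
        while changed:
            changed = False
            chosen = set(candidates.values())
            for obj, fact in list(candidates.items()):
                if not partners.get(fact, set()) <= chosen:
                    del candidates[obj]
                    changed = True
        if not candidates:
            raise ValueError("ordering constraints are cyclic or inconsistent")
        for obj in candidates:
            position[obj] += 1
        states.append(set(candidates.values()))
    return states


# R1 and K1 part of the grid example. (holding K1) is assumed to correspond
# to the fact written (in K1 R1) in the intermediate states listed in the text.
chains = {
    "R1": [("at", "R1", "n3_0"), ("at", "R1", "n1_1"), ("at", "R1", "n0_0")],
    "K1": [("holding", "K1"), ("at", "K1", "n1_1")],
}
together = [
    (("at", "R1", "n3_0"), ("holding", "K1")),
    (("at", "R1", "n1_1"), ("at", "K1", "n1_1")),
]

states = intermediate_states(chains, together)
for number, state in enumerate(states, start=1):
    print(number, sorted(state))
```

In this sketch the third state contains only the fact of R1, because the chain of K1 is exhausted; this corresponds to an intermediate state that is left incomplete for that XOR-constraint.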
Corollary 4. It is always possible to construct the intermediate states.
Corollary 4 follows from Proposition 8. Since the ordering graph is a directed acyclic graph, it is always possible to find at least one subgoal to be included in the next intermediate state. The number of subgoals is an upper bound for the number of the intermediate states that will be constructed.
From the ordering graph of Figure 5, the following intermediate states can be extracted:
Intermediate state 1: ( (at R1 n3_0) (at R2 n3_3) (in K1 R1) (in K2 R2) )
Intermediate state 2: ( (at R1 n1_1) (at R2 n1_3) (at K1 n1_1) (at K2 n1_3) )
Intermediate state 3: ( (at R1 n0_0) (at R2 n0_3) (at K1 n1_1) (at K2 n1_3) )
where the last state is the goal state.
After the construction
of the intermediate states, the planner has to solve three sub-problems, which
are easier than the original one; thus, the overall time to solve them is
shorter than the time needed to solve the original problem. Note, however, that
this decomposition may lead to loss of completeness. In domains where no deadlock
exists, some solutions may be pruned. In domains where deadlocks do exist, the
decomposition may produce unsolvable sub-problems. In order to maintain
completeness, the algorithm should backtrack to all the possible inverted
actions that could achieve the facts in the Pre-Processing Algorithm, even
those with large application costs. However, due to the combinatorial explosion
problem, this approach is not adopted by Grt.
The sub-problems often need further decomposition. This situation arises in two cases. The first is when two objects need each other to achieve their goals, as with the keys and the robots in the grid domain. The second is when there is a sequential interaction between three or more objects. In these cases, the ordering graph of the initial problem encodes one aspect of the interaction, while the ordering graphs of the sub-problems encode other aspects. However, in order to avoid infinite decompositions, a cutoff level is defined.
Ioannis
Refanidis
14-8-2001