Planner Assumption 1: Is the Latest Version the Best?

In this study, we compared the performance of multiple versions of four planners (labeled in this section W, X, Y and Z, with larger version numbers indicating later versions). We considered two criteria for improvement: the outcome of planning and the computation time for solved problems. The outcome of planning is one of solved, failed or timed-out. On each criterion, we statistically analyzed the data to determine whether one of the versions performed better. The outcome results for all the planners are summarized in Table 7. As the table shows, a new version rarely solves more problems than its predecessor. Only Z increased the number of our test problems solved in later versions, and even Z solved slightly fewer in the transition from version 2 to version 3.

Table 7: Version performance: counts of outcome and change in number solved.

Planner	Version	Solved	Failed	Timeout	$\Delta$ Solved?
W	1	286	664	533
W	2	255	1082	147	$\Downarrow$
X	1	502	973	3
X	2	441	940	103	$\Downarrow$
Y	1	387	750	339
Y	2	382	771	329	$\Downarrow$
Z	1	240	1043	201
Z	2	276	959	248	$\Uparrow$
Z	3	268	963	252	$\Downarrow$
Z	4	421	878	184	$\Uparrow$

To check whether the differences in outcome are significant, we ran 2x3 $\chi ^2$ tests with planner version as the independent variable and outcome as the dependent variable. Table 8 summarizes the results of the $\chi ^2$ analysis. For Z, we compared each version only to its immediate successor. The differences are significant except for Y and for the transition from version 2 to version 3 of Z (the latter was expected because these two versions were extremely similar).

Table 8: $\chi ^2$ results comparing versions of the same planner.

Planner	Old Version	New Version	$\chi ^2$	P
W	1	2	320.96	.0001
X	1	2	98.84	.0001
Y	1	2	.46	.79
Z	1	2	10.96	.004
Z	2	3	.158	.924
Z	3	4	48.50	.0001
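
The 2x3 test described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the analysis code used in the study: the names `OUTCOMES`, `chi_square`, `p_value_df2` and `compare` are hypothetical, the counts are copied from Table 7, and the P value relies on the fact that a 2x3 table has 2 degrees of freedom, for which the upper tail of the $\chi ^2$ distribution is exactly $e^{-\chi ^2/2}$.

```python
import math

# Outcome counts (solved, failed, timed-out) copied from Table 7.
OUTCOMES = {
    ("W", 1): (286, 664, 533),
    ("W", 2): (255, 1082, 147),
    ("X", 1): (502, 973, 3),
    ("X", 2): (441, 940, 103),
    ("Y", 1): (387, 750, 339),
    ("Y", 2): (382, 771, 329),
    ("Z", 1): (240, 1043, 201),
    ("Z", 2): (276, 959, 248),
    ("Z", 3): (268, 963, 252),
    ("Z", 4): (421, 878, 184),
}


def chi_square(old, new):
    """Pearson chi-square statistic for a 2x3 table (two versions, three outcomes)."""
    rows = [old, new]
    row_totals = [sum(row) for row in rows]
    col_totals = [sum(col) for col in zip(*rows)]
    total = sum(row_totals)
    stat = 0.0
    for r, row in enumerate(rows):
        for c, observed in enumerate(row):
            # Expected count if outcome were independent of version.
            expected = row_totals[r] * col_totals[c] / total
            stat += (observed - expected) ** 2 / expected
    return stat


def p_value_df2(stat):
    """Upper-tail probability for 2 degrees of freedom (valid only for df = 2)."""
    return math.exp(-stat / 2.0)


def compare(planner, old_version, new_version):
    """Return (chi-square statistic, P value) for two versions of one planner."""
    stat = chi_square(OUTCOMES[(planner, old_version)],
                      OUTCOMES[(planner, new_version)])
    return stat, p_value_df2(stat)


if __name__ == "__main__":
    pairs = [("W", 1, 2), ("X", 1, 2), ("Y", 1, 2),
             ("Z", 1, 2), ("Z", 2, 3), ("Z", 3, 4)]
    for planner, old, new in pairs:
        stat, p = compare(planner, old, new)
        print(f"{planner} {old}->{new}: chi2 = {stat:.2f}, P = {p:.4f}")
```

Applied to the Y counts, for example, this sketch gives a statistic of about .46 and a P value of about .79, in line with the corresponding row of Table 8.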

The second performance metric we evaluated was speed of solution. For this analysis, we limited the comparison to those problems that were solved by both versions of the planner. We then classified each problem by whether the later version solved it faster, slower, or in the same time as the preceding version. From the results in Table 9, we see that every planner improved in speed of solution in its later versions, solving more of the shared problems faster than slower, with the exception of Z in the transition from version 1 to version 2. However, Z did increase the number of problems solved between those two versions.

Table 9: Improvements in execution speed across versions. The Faster column counts the number of cases in which the new version solved the problem faster; Slower specifies those cases in which the new version took longer to solve a given problem.

Planner	Old	New	Faster	Slower	Same	Total
W	1	2	161	61	30	252
X	1	2	295	126	0	421
Y	1	2	222	82	53	357
Z	1	2	84	121	30	235
Z	2	3	131	84	53	268
Z	3	4	115	92	21	228
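
The classification behind Table 9 can be illustrated with a short Python sketch. It is a minimal example under stated assumptions, not the procedure used in the study: the function name `classify_speed`, the dictionary layout (problem identifier mapped to solution time for solved problems only) and the `tolerance` parameter are hypothetical, since the text does not say how two times were judged to be the same, and the sample times are made up for illustration.

```python
def classify_speed(old_times, new_times, tolerance=0.0):
    """Count problems the new version solved faster, slower, or in the same time.

    old_times and new_times map a problem identifier to the solution time of
    that version, and contain only problems the version actually solved.
    tolerance is a hypothetical threshold below which two times count as equal.
    """
    counts = {"faster": 0, "slower": 0, "same": 0, "total": 0}
    # Restrict the comparison to problems solved by both versions.
    shared = old_times.keys() & new_times.keys()
    for problem in shared:
        diff = new_times[problem] - old_times[problem]
        if abs(diff) <= tolerance:
            counts["same"] += 1
        elif diff < 0:
            counts["faster"] += 1
        else:
            counts["slower"] += 1
    counts["total"] = len(shared)
    return counts


if __name__ == "__main__":
    # Made-up times in seconds, for illustration only.
    old = {"p1": 2.0, "p2": 5.0, "p3": 1.0, "p4": 3.0}
    new = {"p1": 1.0, "p2": 6.0, "p3": 1.0, "p5": 0.5}
    print(classify_speed(old, new))
```

Problems solved by only one version (`p4` and `p5` in the made-up data) are excluded, which is why the Total column of Table 9 can be smaller than the Solved counts in Table 7.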
