
Metric Assumption 1: Does Performance Vary between Planners When Run on Different Hardware Platforms?

Often when a planner is run at a competition or in someone else's lab, the hardware and software platforms differ from the platform used during development. Clearly, a slower processor should slow down planning, requiring higher time cut-offs. A reduction in memory may well change the set of problems that can be solved or increase processing time through increased swapping. Changing the hardware configuration may change the way memory is cached and organized, favoring some planners' internal representations over others. Changing compilers could also affect the amount and type of optimization applied to the code. The exact effects are probably unknown. The assumption is that such changes affect all planners more or less equally.

To test this, we ran the planners on a less powerful, lower-memory machine and compared the results on the two platforms: the base Sun Ultrasparc 10/440 with 256 MB of memory and an Ultrasparc 1/170 with 128 MB of memory. The operating system and compilers were the same versions on both machines, and the same problems were run on both platforms. We followed much the same methodology as in the comparison of planner versions, comparing on both number of problems solved and time to solution. Table 11 shows the results as measured by problems solved, failed or timed-out for each planner on the two platforms.


Table 11: Number of problems solved, failed and timed-out for each planner on the two hardware platforms. The last column is the percentage reduction in the number solved from the faster to the slower platform.
Planner  Platform  Solved  Failed  Timed-Out  $\chi^2$  p     % Reduction
A        Ultra 1       94     383         27
         Ultra 10      95     389         20      1.09  .58   1
B        Ultra 1      121     346         37
         Ultra 10     121     353         30      0.80  .67   0
C        Ultra 1      354       7        143
         Ultra 10     367       7        130      0.85  .65   4
D        Ultra 1      218      59        227
         Ultra 10     217      59        228      0.01  .998  -.4
E        Ultra 1      280     145         79
         Ultra 10     284     150         70      0.66  .72   1
F        Ultra 1      277     155         72
         Ultra 10     284     154         66      0.35  .84   2
G        Ultra 1      120     347         37
         Ultra 10     121     352         31      0.57  .75   1
H        Ultra 1      116     350         38
         Ultra 10     122     338         44      0.80  .67   7
I        Ultra 1      265     201         38
         Ultra 10     274     201         29      1.36  .51   3
J        Ultra 1      280     220          4
         Ultra 10     285     217          2      0.73  .69   2
K        Ultra 1      108     370         26
         Ultra 10     108     368         28      0.08  .96   0
L        Ultra 1      149     339         16
         Ultra 10     150     341         13      0.32  .85   1
M        Ultra 1      250      65        189
         Ultra 10     258      66        180      0.35  .84   3
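
The $\chi^2$ and p values in Table 11 can be checked from the counts. The following is a minimal sketch, assuming a standard Pearson chi-square test of independence on each planner's 2x3 table (platform by outcome); this assumption reproduces the reported values for planners A and B, but the function name is illustrative and not taken from the study. With 2 degrees of freedom the upper-tail probability has the closed form exp(-$\chi^2$/2), so no statistics library is needed.

```python
import math

def chi_square_2x3(row_a, row_b):
    """Pearson chi-square test of independence for a 2x3 contingency table.

    Returns (statistic, p_value). Degrees of freedom = (2-1)*(3-1) = 2,
    for which the chi-square upper-tail probability is exactly exp(-x / 2).
    Assumes every column total is nonzero (true for all rows of Table 11).
    """
    rows = [row_a, row_b]
    row_totals = [sum(r) for r in rows]
    col_totals = [a + b for a, b in zip(row_a, row_b)]
    grand_total = sum(row_totals)

    stat = 0.0
    for r, row in enumerate(rows):
        for c, observed in enumerate(row):
            expected = row_totals[r] * col_totals[c] / grand_total
            stat += (observed - expected) ** 2 / expected
    return stat, math.exp(-stat / 2.0)

# Planner A counts from Table 11: (solved, failed, timed-out)
ultra1 = (94, 383, 27)
ultra10 = (95, 389, 20)
stat, p = chi_square_2x3(ultra1, ultra10)
print(f"chi2 = {stat:.2f}, p = {p:.2f}")  # chi2 = 1.09, p = 0.58
```

None of the p values in Table 11 approaches a conventional significance threshold such as .05, which is why the outcome counts are read as not differing between platforms.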


As before, we also looked at the change in time to solution. Table 12 shows how the time to solution changes for each planner. Not surprisingly, a faster processor and more memory nearly always lead to better performance. Somewhat surprisingly, the difference is far less than the doubling that might be expected; the mean differences are much less than the mean times on the faster processor (see Table 10 for the mean solution times).


Table 12: Improvements in execution speed moving from the slower to the faster platform. Counts include only problems that were solved on both platforms. For faster and slower, the mean and standard deviation (Sd) of the difference are also provided.
Planner  Faster                           Slower                           Same  Total
         #    Mean $\Delta$  Sd $\Delta$  #    Mean $\Delta$  Sd $\Delta$
A        92   5.18           30.76        1                                1     94
B        120  4.02           10.01        0                                1     121
C        294  31.89          101.71       60   0.29           0.14         0     354
D        177  11.02          82.82        39   0.23           0.14         1     217
E        275  2.68           12.27        1                                4     280
F        271  14.86          72.44        0                                6     277
G        117  5.02           17.17        1                                2     120
H        115  6.86           25.24        0                                1     116
I        261  25.73          119.97       0                                4     265
J        280  42.24          138.16       0                                0     280
K        107  15.26          75.42        0                                1     108
L        148  16.81          98.54        1                                0     149
M        194  32.72          139.73       56   0.30           0.18         0     250
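
The bookkeeping behind Table 12 can be sketched as follows. This is an illustration only, not the authors' analysis script: the function name, the dictionary layout and the sample timings are all hypothetical, and the use of the sample standard deviation is an assumption, since the text does not say which form of Sd was computed. Only problems solved on both platforms are compared.

```python
from statistics import mean, stdev

def compare_platforms(times_slow, times_fast):
    """Classify problems solved on both platforms by which run was quicker.

    times_slow / times_fast map problem name -> seconds on the slower /
    faster machine; a problem missing from either dict was not solved there.
    """
    gains, losses, same = [], [], 0
    for problem in times_slow.keys() & times_fast.keys():
        delta = times_slow[problem] - times_fast[problem]
        if delta > 0:
            gains.append(delta)       # quicker on the faster machine
        elif delta < 0:
            losses.append(-delta)     # quicker on the slower machine
        else:
            same += 1

    def summarize(deltas):
        return {
            "count": len(deltas),
            "mean": mean(deltas) if deltas else None,
            # stdev needs at least two data points
            "sd": stdev(deltas) if len(deltas) > 1 else None,
        }

    return {
        "faster": summarize(gains),
        "slower": summarize(losses),
        "same": same,
        "total": len(gains) + len(losses) + same,
    }

# Hypothetical timings in seconds (not data from the study)
slow = {"p1": 12.0, "p2": 3.5, "p3": 0.8, "p4": 40.0, "p5": 9.0}
fast = {"p1": 7.0, "p2": 2.5, "p3": 1.0, "p4": 40.0}
result = compare_platforms(slow, fast)
print(result)
```

In this made-up example, p5 is excluded because it was solved on only one machine, so the total covers four problems, matching the way each row's Total in Table 12 equals Faster + Slower + Same.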


Also, the effect seems to vary between the planners. Based on the counts, the Lisp-based planners appear to be less susceptible to this trend (the only ones that sometimes were faster on the slower platform). However, the advantages are very small, affecting primarily the smaller problems. We think that this effect is due to the need to load in a Lisp image at startup from a centralized server; thus, computation time for small problems will be dominated by any network delay. Older versions of planners appear to be less sensitive to the switch in platform.

In this study, the platforms make little difference to the results, even though the faster platform has more than double the processor speed and double the memory. However, both platforms are underpowered when compared to the development platforms for some of the planners. We chose these platforms because they differed in only a few characteristics (processor speed and memory amount) and because we had access to 20 identically configured machines. To really observe a difference, 1 GB of memory or more may be needed.

Recent trends in planning technology have exploited cheap memory: translations to propositional representations, compilation of the problems, and built-in caching and memory management techniques. Thus, some planners are designed to trade off memory for time; these planners will understandably be affected by memory limitations for some problems. Given the results of this study, we considered performing a more careful study of memory by artificially limiting memory for the planners. We did not do so for two reasons: we did not have access to enough machines large enough to be likely to make a difference, and we could not devise a scheme for limiting memory fairly across all the planners (which are implemented in different languages and require different software run-time environments).

Another important factor may be memory architecture/management. Some planners include their own memory managers, which map better to some hardware platforms than to others (e.g., HSP uses a linear organization that appears to fit well with Intel's memory architecture).


©2002 AI Access Foundation and Morgan Kaufmann Publishers. All rights reserved.