Introduction

Constraint Satisfaction Problems (CSPs) appear in many real-life applications such as scheduling, resource allocation, timetabling, vehicle routing, frequency allocation, etc. Most CSPs can be naturally and efficiently modelled using non-binary (or n-ary) constraints that may involve an arbitrary number of variables. It is well known that any non-binary CSP can be converted into an equivalent binary one. The most well-known translations are the dual encoding [Dechter PearlDechter Pearl1989] and the hidden variable encoding [Rossi, Petrie, DharRossi et al.1990]. The ability to translate any non-binary CSP into binary has been often used in the past as a justification for restricting attention to binary CSPs. Implicitly, the assumption had been that when faced with a non-binary CSP we can simply convert it into a binary one, and then apply well-known generic techniques for solving the binary equivalent. In this paper we will show that this assumption is flawed because generic techniques for binary CSPs are not suitable for binary encodings of non-binary problems.

In the past few years, there have been theoretical and empirical studies on the efficiency of binary encodings and comparisons between binary encodings and the non-binary representation [Bacchus van BeekBacchus van Beek1998,Stergiou WalshStergiou Walsh1999,Mamoulis StergiouMamoulis Stergiou2001,SmithSmith2002,Bacchus, Chen, van Beek, WalshBacchus et al.2002]. Theoretical results have showed that converting non-binary CSPs into binary equivalents is a potentially efficient way to solve certain classes of non-binary problems. However, in the (limited) empirical studies there are very few cases where this appears to be true, with Conway's ``game of Life'' [SmithSmith2002] being the most notable exception. There are various reasons for this. In many cases, the extensive space requirements of the binary encodings make them infeasible. Also, in many non-binary problems we can utilize efficient specialized propagators for certain constraints, such as the algorithm developed by Régin Regin94 for the all-different constraint. Converting such constraints into binary is clearly impractical. Another reason, which has been overlooked, is that most (if not all) experimental studies use well-known generic local consistency and search algorithms in the encodings. In this way they fail to exploit the structure of the constraints in the encodings, ending up with inefficient algorithms. To make the binary encodings a realistic choice of modelling and solving non-binary CSPs, we need algorithms that can utilize their structural properties. Finally, it is important to point out that the use of a binary encoding does not necessarily mean that we have to convert all the non-binary constraints in a problem into binary, as it is commonly perceived. If we are selective in the constraints we encode, based on properties such as arity and tightness, then we can get efficient hybrid models.

To address these issues, we show that the use of specialized arc consistency and search algorithms for binary encodings of non-binary CSPs can lead to efficient models. We consider three encodings; the dual, the hidden variable, and the double encoding. The latter, which is basically the conjunction of the other two encodings, has received little attention but may well turn out to be the most significant in practice. The aim of this study is twofold. First, to present efficient algorithms for the binary encodings and analyze them theoretically and experimentally. Second, and more importantly, to investigate if and when the use of such algorithms can help solve non-binary problems efficiently. Towards these aims, we make the following contributions:

We describe a simple algorithm that enforces arc consistency on the hidden variable encoding of an arbitrary non-binary CSP with time complexity, where is the number of constraints, the maximum arity of the constraints, and the maximum domain size. This gives an improvement compared to the asymptotic complexity of a generic arc consistency algorithm. The improved complexity is now the same as the complexity of an optimal generalized arc consistency algorithm in the non-binary representation of the problem. We also identify a property of the arc consistency algorithm for the hidden variable encoding that can make it run faster, on arc inconsistent problems, than the generalized arc consistency algorithm.
We then consider search algorithms that maintain local consistencies during search in the hidden variable encoding. We show that, like maintaining arc consistency, all the generalizations of forward checking to non-binary CSPs can be emulated by corresponding forward checking algorithms that run in the hidden variable encoding and only instantiate original variables (i.e. the variables of the initial non-binary problem). We show that each such algorithm and its corresponding algorithm for non-binary constraints have the following relationships: 1) they visit the same number of search tree nodes, and 2) the asymptotic cost of each of them is within a polynomial bound of the other.
We describe a specialized algorithm for the dual encoding that achieves arc consistency with worst-case time complexity. This is significantly lower than the $O(e^2d^{2k})$ complexity of a generic arc consistency algorithm. The improvement in the complexity bound stems from the observation that constraints in the dual encoding have a specific structure; namely they are piecewise functional [Van Hentenryck, Deville, TengVan Hentenryck et al.1992]. Apart from applying arc consistency in the dual encoding of a non-binary CSP, this algorithm can also be used as a specialized filtering algorithm for certain classes of non-binary constraints.
We adapt various search algorithms to run in the double encoding and compare them theoretically to similar algorithms for the hidden variable encoding and the non-binary representation. Search algorithms that operate in the double encoding can exploit the advantages of both the hidden variable and the dual encoding. For example, we show that, under certain conditions, the asymptotic cost of the maintaining arc consistency algorithm in the double encoding can only be polynomially worse than the asymptotic cost of the corresponding algorithm in the non-binary representation (and the hidden variable encoding), while it can be exponentially better.
Finally, we make an extensive empirical study on various domains. We consider random problems as well as structured ones, like crossword puzzle generation, configuration, and frequency assignment. This study consists of two parts. In the first part, we give experimental results that demonstrate the advantages of the specialized algorithms for binary encodings compared to generic algorithms. For example, the specialized arc consistency algorithm for the dual encoding can be orders of magnitude faster than a generic arc consistency algorithm. In the second part we show that the use of binary encodings can offer significant benefits when solving certain classes of non-binary CSPs. For example, solving the dual encoding of some configuration problems can be orders of magnitudes more efficient than solving the non-binary representation. Also, empirical results from frequency assignment - like problems demonstrate that a binary encoding can be beneficial even for non-binary constraints that are intentionally specified.

This paper is structured as follows. In Section 2 we give the necessary definitions and background. In Section 3 we describe a specialized arc consistency algorithm for the hidden variable encoding. We also demonstrate that all the extensions of forward checking to non-binary CSPs can be emulated by binary forward checking algorithms that run in the hidden variable encoding. In Section 4 we explain how the complexity of arc consistency in the dual encoding can be improved and describe a specialized arc consistency algorithm. Section 5 discusses algorithms for the double encoding. In Section 6 we present experimental results on random and structured problems that demonstrate the usefulness of the proposed algorithms. We also draw some conclusions regarding the applicability of the encodings, based on theoretical and experimental results. Section 7 discusses related work. Finally, in Section 8 we conclude.

Nikolaos Samaras 2005-11-09