next up previous
Next: Restricted PDAGs and Completed Up: Searching for Bayesian Network Previous: DAGs and Equivalence Classes


Restricted Acyclic Partially Directed Graphs

The scheme of representation that we will use is slightly different from the formalism of completed PDAGs. It is not necessary for each configuration of our search space (which we call restricted PDAG or RPDAG) to correspond to a different equivalence class; two different RPDAGs may correspond to the same equivalence class. The main reason for this is efficiency: by allowing an equivalence class to be represented (only in some cases) by different RPDAGs, we will gain in efficiency to explore the space. Before explaining this in greater detail, let us define the concept of RPDAG:

Definition 1 (restricted PDAG)   A PDAG $G$ is a restricted PDAG (RPDAG) if and only if it satisfies the following conditions:
1
$\forall y\in {\cal U}$, $Pa_G(y)\neq\emptyset   \Rightarrow   Ne_G(y)=\emptyset$.
2
$G$ does not contain any directed cycle.
3
$G$ does not contain any completely undirected cycle, i.e. a cycle containing only links.
4
If $x\rightarrow y$ exists in $G$ then either $\vert Pa_G(y)\vert\geq 2$ or $Pa_G(x)\neq\emptyset$.

This condition states that an arc $x\rightarrow y$ exists in $G$ only if it is either part of an h-h pattern or there is another arc (originated by an h-h pattern) going to $x$.

As an RPDAG is a PDAG, it could be considered to be a representation of a set of (equivalent) DAGs. We therefore must define which the set of DAGs is represented by a given RPDAG $G$, i.e. how direction may be given to the links in $G$ in order to extend it to a DAG. The following definition formalizes this idea.

Definition 2 (Extension of a PDAG)   Given a PDAG $G$, we say that a DAG $H$ is an extension of $G$ if and only if:
1
$G$ and $H$ have the same skeleton.
2
If $x\rightarrow y$ is an arc in $G$ then $x\rightarrow y$ is also an arc in $H$ (no arc is redirected).
3
$G$ and $H$ have the same h-h patterns (i.e. the process of directing the links in $G$ in order to produce $H$ does not generate new h-h patterns).

We will use $Ext(G)$ to denote the set of DAGs that are extensions of a given PDAG $G$.

Proposition 1   Let $G$ be an RPDAG. Then:
(a)
$Ext(G)\neq\emptyset$ ($G$ can be extended to obtain a DAG, i.e. the extension of an RPDAG is well-defined).
(b)
$\forall H,H'\in Ext(G) \;\; H\simeq H'$ (i.e. all the different DAGs that can be obtained from $G$ by extending it are equivalent).

Proof:
(a) As $G$ has no directed cycle (condition 2 in Definition 1), then either $G$ is already a DAG or it has some links. Let us consider an arbitrary link $x\mbox{-}y$. Using condition 1 in Definition 1, neither $x$ nor $y$ can have a parent. We can then direct the link $x\mbox{-}y$ in either direction without creating an h-h pattern. If we direct the link $x\mbox{-}y$ as $x\rightarrow y$ and $y$ is part of another link $y\mbox{-}z$, then we direct it as $y\rightarrow z$ (in order to avoid a new h-h pattern). We can continue directing the links in a chain in this way, and this process cannot generate a directed cycle because, according to condition 3 in Definition 1, $G$ has no completely undirected cycle.
(b) The extension process of $G$ does not modify the skeleton and does not create new h-h patterns. Therefore, all the extensions of $G$ have the same skeleton and the same v-structures (a v-structure is a particular case of h-h pattern), hence they are equivalent. ${\vcenter{\vbox{ \hrule height 0.4pt\hbox{\vrule width 0.4pt height 5pt \kern5pt\vrule width 0.4pt}\hrule height 0.4pt}}}$



It should be noted that condition 4 in Definition 1 is not necessary to prove the results in Proposition 1. This condition is included to ensure that the type of PDAG used to represent subsets of equivalent DAGs is as general as possible. In other words, condition 4 guarantees that an RPDAG is a representation of the greatest number of equivalent DAGs, subject to the restrictions imposed by conditions 1-3 in Definition 1. As we will see in the next proposition, this is achieved by directing the minimum number of edges. For example, $z\mbox{-}x\rightarrow y\rightarrow u$ would not be a valid RPDAG. The RPDAG that we would use in this case is $z\mbox{-}x\mbox{-}y\mbox{-}u$.

Proposition 2   Let $G$ be a PDAG verifying the conditions 1-3 in Definition 1. There is then a single RPDAG $R$ such that $Ext(G)\subseteq Ext(R)$.

Proof:
The proof is constructive. We shall build the RPDAG $R$ as follows: the skeleton and the h-h patterns of $R$ are the same as those in $G$. An arc $x\rightarrow y$ in $G$ shall now be considered such that $Pa_G(x)=\emptyset$ and $Pa_G(y)=\{x\}$ (if such an arc does not exist, then $G$ itself would be an RPDAG): we convert the arc $x\rightarrow y$ into the link $x\mbox{-}y$. This process is then repeated. Obviously, the PDAG $R$ obtained in this way has no directed cycle and verifies condition 4 in Definition 1. Moreover, we cannot obtain a configuration $z\rightarrow x\mbox{-}y$ as a subgraph of $R$ because $Pa_G(x)=\emptyset$ (we only remove the direction of arcs whose initial nodes have no parent). In addition, $R$ cannot have any completely undirected cycle because either the arc $x\rightarrow y$ is not part of any cycle in $G$ or it is part of a cycle in $G$ that must contain at least one h-h pattern (and the directions of the arcs in this pattern will never be removed). $R$ is therefore an RPDAG.

Let us now prove that $Ext(G)\subseteq Ext(R)$: if $H\in Ext(G)$ then $H$ and $G$ have the same skeleton and h-h patterns, hence $H$ and $R$ also have the same skeleton and h-h patterns. Moreover, as all the arcs in $R$ are also arcs in $G$, if $x\rightarrow y\in R$ then $x\rightarrow y\in G$, which in turn implies that $x\rightarrow y\in H$. Therefore, according to Definition 2, $H\in Ext(G)$.

Finally, let us prove the uniqueness of $R$: we already know that any other RPDAG $R^{\prime}$ verifying that $Ext(G)\subseteq Ext(R^{\prime})$ has the same skeleton and h-h patterns as $R$. According to condition 1 in Definition 1, the edges that are not part of any of these h-h patterns but are incident to the middle node $y$ in any h-h pattern $x\rightarrow y\leftarrow z$ must be directed away from $y$ (in order to avoid new h-h patterns). The remaining edges that are not part of any h-h pattern must be undirected, in order to satisfy condition 4 in Definition 1. There is therefore only one RPDAG that matches a given skeleton and a set of h-h patterns, so $R$ is the only RPDAG verifying that $Ext(G)\subseteq Ext(R)$. Figure 6 shows an example of the construction process. ${\vcenter{\vbox{ \hrule height 0.4pt\hbox{\vrule width 0.4pt height 5pt \kern5pt\vrule width 0.4pt}\hrule height 0.4pt}}}$

Figure 6: Illustrating the construction process in Proposition 2: (a) PDAG $G$; (b) undirecting the arc $a\rightarrow y$; (c) undirecting the arc $y\rightarrow x$; (d) undirecting the arc $x\rightarrow e$, thus obtaining the RPDAG $R$
\begin{figure} % latex2html id marker 227 \centerline{\psfig{figure=./figuras/miejprop2bis.eps,height=.32\textwidth}}\end{figure}

The following proposition ensures that the concept of RPDAG allows us to define a partition in the space of DAGs.

Proposition 3   Let $G$ and $G^{\prime}$ be two different RPDAGs. Then $Ext(G)\cap Ext(G^{\prime})=\emptyset$.

Proof:
Let $H$ be any DAG. Then $H$ itself is a PDAG and obviously $H=Ext(H)$. By applying the result in Proposition 2, we can assert that there is a single RPDAG $G$ such that $H\subseteq Ext(G)$. ${\vcenter{\vbox{ \hrule height 0.4pt\hbox{\vrule width 0.4pt height 5pt \kern5pt\vrule width 0.4pt}\hrule height 0.4pt}}}$



In the proposition below, we show the properties which are common to all the DAGs belonging to the same extension of an RPDAG.

Proposition 4   Two DAGs belong to the extension of the same RPDAG if and only if they have the same skeleton and the same h-h patterns.

Proof:
The necessary condition is obvious. Let us prove the sufficient condition: let $H$ and $H^{\prime}$ be two DAGs with common skeleton and h-h patterns. We shall construct a PDAG $G$ as follows: the skeleton and the h-h patterns of $G$ are the same as those in $H$ and $H^{\prime}$; the edges that have the same orientation in $H$ and $H^{\prime}$ are directed in $G$ in the same way; the other edges in $G$ remain undirected. From Definition 2, it is clear that $H,H^{\prime}\in Ext(G)$.

$G$ has no directed cycles because $H$ and $H^{\prime}$ are DAGs. $G$ has no completely undirected cycles, since all the cycles in $H$ and $H^{\prime}$ share at least the h-h patterns. In addition, $x\rightarrow y\mbox{-}z$ cannot be a subgraph of $G$ because this would imply the existence of the subgraphs $x\rightarrow y\leftarrow z$ and $x\rightarrow y\rightarrow z$ in $H$ and $H^{\prime}$, respectively, and therefore these two DAGs would not have the same h-h patterns.

Therefore, the PDAG $G$ satisfies conditions 1-3 in Definition 1. By applying Proposition 2, we can then build a single RPDAG $R$ such that $Ext(G)\subseteq Ext(R)$, hence $H,H^{\prime}\in Ext(R)$. ${\vcenter{\vbox{ \hrule height 0.4pt\hbox{\vrule width 0.4pt height 5pt \kern5pt\vrule width 0.4pt}\hrule height 0.4pt}}}$



A characterization of the extension of an RPDAG that will be useful later is:

Proposition 5   Given an RPDAG $G$ and a DAG $H$, then $H$ is an extension of $G$ if and only if the following conditions hold:
1
$G$ and $H$ have the same skeleton.
2
$\forall y\in {\mathcal{U}}$, if $Pa_G(y)\neq\emptyset$ then $Pa_H(y)=Pa_G(y)$.
3
$\forall y\in {\mathcal{U}}$, if $Pa_G(y)=\emptyset$ and $Pa_H(y)\neq\emptyset$ then $\vert Pa_H(y)\vert=1$.

Proof:
$\bullet$ Necessary condition:

- $Pa_G(y)\neq\emptyset$: Let $x\in Pa_G(y)$, i.e., $x\rightarrow y\in G$. Then, from condition 2 in Definition 2, $x\rightarrow y\in H$, i.e., $x\in Pa_H(y)$. Moreover, $z\in Pa_G(y) \Leftrightarrow$ $x\rightarrow y\leftarrow z$ is an h-h pattern in $G$. From condition 3 in Definition 2, this occurs if and only if $x\rightarrow y\leftarrow z$ is an h-h pattern in $H$, which is equivalent to $z\in Pa_H(y)$. Therefore, $Pa_H(y)=Pa_G(y)$.

- $Pa_G(y)=\emptyset$ and $Pa_H(y)\neq\emptyset$: Let $x\in Pa_H(y)$. If there is another node $z\in Pa_H(y)$, then $x\rightarrow y\leftarrow z$ is an h-h pattern in $H$ and therefore it is also an h-h pattern in $G$, which contradicts the fact that $Pa_G(y)=\emptyset$. So, $y$ cannot have more than one parent in $H$, hence $\vert Pa_H(y)\vert=1$.



$\bullet$ Sufficient condition:

- If $x\rightarrow y\in G$ then $Pa_G(y)\neq\emptyset$. From condition 2 we have $Pa_H(y)=Pa_G(y)$, hence $x\rightarrow y\in H$.

- If $x\rightarrow y\leftarrow z$ is an h-h pattern in $G$, once again from condition 2, $Pa_H(y)=Pa_G(y)$ and therefore $x\rightarrow y\leftarrow z$ is an h-h pattern in $H$.

- If $x\rightarrow y\leftarrow z$ is an h-h pattern in $H$, then $\vert Pa_H(y)\vert\neq1$ and $Pa_H(y)\neq\emptyset$. So, from condition 3, we obtain $Pa_G(y)\neq\emptyset$ and, from condition 2, $Pa_G(y)=Pa_H(y)$. Therefore $x\rightarrow y\leftarrow z$ is an h-h pattern in $G$. ${\vcenter{\vbox{ \hrule height 0.4pt\hbox{\vrule width 0.4pt height 5pt \kern5pt\vrule width 0.4pt}\hrule height 0.4pt}}}$



Subsections
next up previous
Next: Restricted PDAGs and Completed Up: Searching for Bayesian Network Previous: DAGs and Equivalence Classes
Luis Miguel de Campos Ib��ez 2003-05-30