Lower Bound Transformations

Mimicking the case of upper bounds, we replace the individual conditional probabilities of the findings with lower-bounding transformations, resulting in a lower-bounding expression $P(f \mid d, q)$. Taking the product with P(d) and marginalizing over d yields a lower bound on the likelihood:

$$P(f) \;\geq\; \sum_d P(f \mid d, q)\, P(d) \;\equiv\; P(f \mid q).$$

We wish to maximize $P(f \mid q)$ with respect to the variational parameters q to obtain the tightest possible bound.
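As a rough numerical illustration of the bound, the sketch below enumerates all disease configurations for a toy model with one positive finding. It assumes a noisy-OR conditional 1 - exp(-theta0 - sum_j theta_j d_j), independent disease priors, and a Jensen-type transformation built from f(x) = log(1 - exp(-x)) with q on the simplex. These modelling choices, and every name and number in the code, are assumptions made for illustration and are not taken from this section.

```python
import itertools
import math


def f(x):
    """Assumed transformation function f(x) = log(1 - exp(-x)), for x > 0."""
    return math.log(-math.expm1(-x))


def exact_conditional(d, theta0, theta):
    """Assumed noisy-OR probability of a positive finding given diseases d."""
    return -math.expm1(-(theta0 + sum(t * d_j for t, d_j in zip(theta, d))))


def lower_bound_conditional(d, q, theta0, theta):
    """Jensen-type lower bound on the conditional; q must sum to one."""
    exponent = f(theta0)
    for q_j, t_j, d_j in zip(q, theta, d):
        exponent += q_j * d_j * (f(theta0 + t_j / q_j) - f(theta0))
    return math.exp(exponent)


def marginal(conditional, prior):
    """Sum conditional(d) * P(d) over all binary disease vectors d."""
    total = 0.0
    for d in itertools.product((0, 1), repeat=len(prior)):
        p_d = math.prod(p if d_j else 1.0 - p for p, d_j in zip(prior, d))
        total += conditional(d) * p_d
    return total


# Hypothetical toy parameters: leak term, disease weights, priors, and q.
theta0, theta = 0.5, [1.0, 2.0]
prior = [0.1, 0.2]
q = [0.4, 0.6]

exact = marginal(lambda d: exact_conditional(d, theta0, theta), prior)
bound = marginal(lambda d: lower_bound_conditional(d, q, theta0, theta), prior)
print(f"exact likelihood = {exact:.6f}, lower bound = {bound:.6f}")
```

Changing `q` (while keeping it on the simplex) changes how tight the bound is, which is what the optimization over the variational parameters exploits.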

Our problem can be mapped onto a standard optimization problem in statistics. In particular, treating d as a latent variable, f as an observed variable, and q as a parameter vector, the optimization of $P(f \mid q)$ (or its logarithm) can be viewed as a standard maximum likelihood estimation problem for a latent variable model. It can be solved using the EM algorithm (Dempster, Laird, & Rubin, 1977). The algorithm yields a sequence of variational parameters that monotonically increase the objective function $\log P(f \mid q)$. Within the EM framework, we obtain an update of the variational parameters by maximizing the expected complete log-likelihood:

$$E\{\log P(f \mid d, q)\, P(d)\} = \sum_i E\{\log P(f_i \mid d, q_{\cdot \mid i})\} + \text{constant},$$

where $q^{\text{old}}$ denotes the vector of variational parameters before the update, where the constant term is independent of the variational parameters q, and where the expectation is with respect to the posterior distribution $P(d \mid f, q^{\text{old}})$. Since the variational parameters associated with the conditional probabilities $P(f_i \mid d, q_{\cdot \mid i})$ are independent of one another, we can maximize each term in the above sum separately. Recalling the form of the variational transformation (see Eq. (24)), we have:

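$$E\{\log P(f_i \mid d, q_{\cdot \mid i})\} = \sum_j q_{j \mid i}\, E\{d_j\} \left[ f\!\left(\theta_{i0} + \frac{\theta_{ij}}{q_{j \mid i}}\right) - f(\theta_{i0}) \right] + f(\theta_{i0}),$$

which we are to maximize with respect to $q_{j \mid i}$ while keeping the expectations $E\{d_j\}$ fixed. This optimization problem can be solved iteratively and monotonically by performing the following synchronous updates with normalization: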

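$$q_{j \mid i} \leftarrow E\{d_j\}\, q_{j \mid i} \left[ f\!\left(\theta_{i0} + \frac{\theta_{ij}}{q_{j \mid i}}\right) - f(\theta_{i0}) - \frac{\theta_{ij}}{q_{j \mid i}}\, f'\!\left(\theta_{i0} + \frac{\theta_{ij}}{q_{j \mid i}}\right) \right],$$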

where f' denotes the derivative of f. (The update is guaranteed to be non-negative.)
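The synchronous update with normalization can be sketched for a single finding as follows. The sketch assumes f(x) = log(1 - exp(-x)), a strictly positive leak term theta0, and the update form q_j <- E{d_j} q_j [f(theta0 + theta_j/q_j) - f(theta0) - (theta_j/q_j) f'(theta0 + theta_j/q_j)] followed by renormalization. The function names, the uniform initialization, and the numerical values are hypothetical, and the expectations E{d_j} are simply supplied as fixed inputs rather than computed from a posterior.

```python
import math


def f(x):
    """Assumed transformation function f(x) = log(1 - exp(-x)), for x > 0."""
    return math.log(-math.expm1(-x))


def f_prime(x):
    """Derivative of f, written as exp(-x) / (1 - exp(-x)) to avoid overflow."""
    return math.exp(-x) / -math.expm1(-x)


def objective(q, e_d, theta0, theta):
    """Per-finding objective: sum_j q_j E{d_j} [f(.) - f(theta0)] + f(theta0)."""
    total = f(theta0)
    for q_j, e_j, t_j in zip(q, e_d, theta):
        if q_j > 0.0:
            total += q_j * e_j * (f(theta0 + t_j / q_j) - f(theta0))
    return total


def update_q(q, e_d, theta0, theta):
    """One synchronous update of all q_j, followed by normalization."""
    new_q = []
    for q_j, e_j, t_j in zip(q, e_d, theta):
        if q_j <= 0.0:
            new_q.append(0.0)  # a zero entry stays at zero
            continue
        x = theta0 + t_j / q_j
        bracket = f(x) - f(theta0) - (t_j / q_j) * f_prime(x)
        new_q.append(e_j * q_j * bracket)
    total = sum(new_q)
    return [v / total for v in new_q]


def optimize_q(e_d, theta0, theta, n_iter=50):
    """Iterate the update from a uniform start, recording the objective."""
    q = [1.0 / len(theta)] * len(theta)
    history = [objective(q, e_d, theta0, theta)]
    for _ in range(n_iter):
        q = update_q(q, e_d, theta0, theta)
        history.append(objective(q, e_d, theta0, theta))
    return q, history


# Hypothetical inputs: expectations E{d_j}, leak term, and disease weights.
q, history = optimize_q(e_d=[0.3, 0.6], theta0=0.5, theta=[1.0, 2.0])
print("q =", q)
print("objective: start =", history[0], "end =", history[-1])
```

Recording the objective after every step makes it possible to inspect the monotone behaviour described above. In a full EM loop the expectations would be recomputed from the posterior after each update of the variational parameters.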

This algorithm extends easily to the case where not all the positive findings have been transformed. The only new feature is that some of the conditional probabilities in the products $P(f \mid d, q)$ and $P(f \mid d, q^{\text{old}})$ have been left intact, i.e., not transformed; the optimization with respect to the variational parameters corresponding to the transformed conditionals proceeds as before.

Michael Jordan
Sun May 9 16:22:01 PDT 1999