Abstract
We propose a split-merge Markov chain algorithm to address the
problem of inefficient sampling for conjugate Dirichlet process mixture
models. Traditional Markov chain Monte Carlo methods for Bayesian mixture
models, such as Gibbs sampling, can become trapped in isolated modes
corresponding to an inappropriate clustering of data points. This article
describes a Metropolis-Hastings procedure that can escape such local modes
by splitting or merging mixture components. Our Metropolis-Hastings
algorithm employs a new technique in which an appropriate proposal for
splitting or merging components is obtained by using a restricted Gibbs
sampling scan. We demonstrate empirically that our method outperforms the
Gibbs sampler in situations where two or more components are similar in
structure.
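
As a sketch of the acceptance step, and not necessarily in the notation of the article itself, a Metropolis-Hastings move from the current clustering to a proposed clustering (one component split into two, or two components merged into one) is accepted with the standard probability shown below. The symbols are assumptions introduced here for illustration: \gamma is the current clustering, \gamma^{*} the proposed one, q the proposal distribution, P the Dirichlet process prior over clusterings, and L the likelihood of the data y with the component parameters integrated out, which conjugacy makes available in closed form.

\[
a(\gamma^{*}, \gamma) = \min\left[\, 1,\;
  \frac{q(\gamma \mid \gamma^{*})}{q(\gamma^{*} \mid \gamma)} \,
  \frac{P(\gamma^{*})}{P(\gamma)} \,
  \frac{L(\gamma^{*} \mid y)}{L(\gamma \mid y)} \,\right]
\]

Under this reading, the restricted Gibbs sampling scan is what determines q for a split: the proposal probability would be the product of the conditional probabilities with which the scan assigns each affected observation to one of the two new components, while the reverse merge, having only one possible outcome, would have proposal probability 1.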