Sparse models are desirable for many applications across diverse domains as
they can perform automatic variable selection, aid interpretability, and
provide regularization. When fitting sparse models in a Bayesian framework,
however, analytically obtaining a posterior distribution over the parameters of
interest is intractable for all but the simplest cases. As a result
practitioners must rely on either sampling algorithms such as Markov chain
Monte Carlo or variational methods to obtain an approximate posterior. Mean
field variational inference is a particularly simple and popular framework that
is often amenable to analytically deriving closed-form parameter updates. When
all distributions in the model are members of exponential families and are
conditionally conjugate, optimization schemes can often be derived by hand.
Yet, I show that using standard mean field variational inference can fail to
produce sensible results for models with sparsity-inducing priors, such as the
spike-and-slab. Fortunately, such pathological behavior can be remedied as I
show that mixtures of exponential family distributions with non-overlapping
support form an exponential family. In particular, any mixture of a diffuse
exponential family and a point mass at zero to model sparsity forms an
exponential family. Furthermore, specific choices of these distributions
maintain conditional conjugacy. I use two applications to motivate these
results: one from statistical genetics that has connections to generalized
least squares with a spike-and-slab prior on the regression coefficients; and
sparse probabilistic principal component analysis. The theoretical results
presented here are broadly applicable beyond these two examples.