Disentangled representation learning has undoubtedly benefited from objective
function surgery. However, a delicate balancing act of tuning is still required
to trade off reconstruction fidelity against disentanglement. Building
on previous successes of penalizing the total correlation in the latent
variables, we propose TCWAE (Total Correlation Wasserstein Autoencoder).
Working in the WAE paradigm naturally enables the separation of the
total-correlation term, thus providing disentanglement control over the learned
representation, while offering more flexibility in the choice of reconstruction
cost. We propose two variants using different KL estimators and perform
extensive quantitative comparisons on data sets with known generative factors,
showing competitive results relative to state-of-the-art techniques. We further
study the trade-off between disentanglement and reconstruction on more
difficult data sets with unknown generative factors, where the flexibility
of the WAE paradigm in the reconstruction term improves reconstructions.
Benoit Gaujac, Ilya Feige, David Barber