*In Entropy (Basel, Switzerland)*

The two assumed properties of the data *X* and *Y* and their latent representation *T* take the form of two Markov chains, T − X − Y and X − T − Y. Requiring both to hold during optimisation restricts the set of admissible joint distributions P(X, Y, T). We therefore show how to circumvent this limitation by optimising a lower bound on the mutual information between *T* and *Y*, I(T; Y), for which only the latter Markov chain has to be satisfied. The mutual information I(T; Y) can be split into two non-negative parts. The first part is the lower bound on I(T; Y), which is what the deep variational information bottleneck (DVIB) and cognate models optimise in practice. The second part consists of two terms that measure how strongly the former requirement, T − X − Y, is violated. Finally, we propose interpreting the family of information bottleneck models as directed graphical models, and show that in this framework the original and deep information bottlenecks are special cases of a fundamental IB model.
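To make the lower-bound claim concrete, the following sketch illustrates the standard variational bound on I(T; Y) used in DVIB, I(T; Y) ≥ H(Y) + E log q(y|t), on a small discrete toy distribution. The joint table `p_ty` and the decoder choices are illustrative assumptions, not taken from the paper, which works with continuous variables and neural-network decoders; the bound is tight when q(y|t) equals the true conditional p(y|t) and loosens for any other decoder.

```python
import numpy as np

# Illustrative joint distribution p(t, y) over a discrete T (3 states)
# and Y (2 states); rows index t, columns index y.
p_ty = np.array([[0.30, 0.05],
                 [0.05, 0.30],
                 [0.10, 0.20]])
p_t = p_ty.sum(axis=1)  # marginal p(t)
p_y = p_ty.sum(axis=0)  # marginal p(y)

# True mutual information: I(T;Y) = sum_{t,y} p(t,y) log[ p(t,y) / (p(t) p(y)) ]
I_ty = np.sum(p_ty * np.log(p_ty / np.outer(p_t, p_y)))

# Entropy of Y: H(Y) = -sum_y p(y) log p(y)
H_y = -np.sum(p_y * np.log(p_y))

def variational_bound(q_y_given_t):
    """Variational lower bound on I(T;Y) with decoder q(y|t):
    I(T;Y) >= H(Y) + E_{p(t,y)} log q(y|t)."""
    return H_y + np.sum(p_ty * np.log(q_y_given_t))

# With the true conditional p(y|t), the bound is tight ...
p_y_given_t = p_ty / p_t[:, None]
tight = variational_bound(p_y_given_t)

# ... while a mismatched decoder (here: uniform) gives a smaller value.
q_uniform = np.full_like(p_ty, 0.5)
loose = variational_bound(q_uniform)

print(I_ty, tight, loose)
```

Only the X − T − Y chain is needed for this bound to hold, which is the observation the abstract builds on.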

*Aleksander Wieczorek, Volker Roth*

*2020-Jan-22*

**Markov assumption, Markov chain, conditional independence, deep variational information bottleneck, information bottleneck, mutual information**