In IEEE Transactions on Pattern Analysis and Machine Intelligence; h5-index 127.0
Many learning algorithms, such as kernel machines, nearest neighbors, clustering, and anomaly detection, are based on distances or similarities. Before similarities are used to train an actual machine learning model, we would like to verify that they are tied to meaningful patterns in the data. In this paper, we propose to make similarities interpretable by augmenting them with an explanation. We develop BiLRP, a scalable and theoretically founded method that systematically decomposes the output of an already trained deep similarity model onto pairs of input features. Our method can be expressed as a composition of LRP explanations, which previous work has shown to scale to highly nonlinear models. Through an extensive set of experiments, we demonstrate that BiLRP robustly explains complex similarity models, e.g., those built on VGG-16 deep neural network features. Additionally, we apply our method to an open problem in digital humanities: the detailed assessment of similarity between historical documents such as astronomical tables. Here again, BiLRP provides insight and brings verifiability to a highly engineered and problem-specific similarity model.
Eberle Oliver, Buttner Jochen, Krautli Florian, Mueller Klaus-Robert, Valleriani Matteo, Montavon Gregoire
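
The idea of decomposing a similarity score onto pairs of input features can be illustrated with a toy sketch. It is not the authors' implementation. It assumes a dot-product similarity y(x, x') = <phi(x), phi(x')> on a purely linear, bias-free feature map phi(x) = Wx, so that the per-feature explanation reduces to W[m][i] * x[i]. It further assumes that the "composition of LRP explanations" is a sum over features of outer products of these per-feature explanations, which the abstract does not spell out. All names (`feature_relevance`, `pairwise_relevance`, `W`) are hypothetical.

```python
# Toy sketch (assumption: linear feature map phi(x) = W x, no bias, no deep network).

def feature_relevance(W, x):
    """Contributions R[m][i] = W[m][i] * x[i]; row m sums to phi_m(x)."""
    return [[w_mi * x_i for w_mi, x_i in zip(row, x)] for row in W]

def similarity(W, x, x_prime):
    """Dot-product similarity y = <W x, W x'>."""
    phi = [sum(row) for row in feature_relevance(W, x)]
    phi_prime = [sum(row) for row in feature_relevance(W, x_prime)]
    return sum(a * b for a, b in zip(phi, phi_prime))

def pairwise_relevance(W, x, x_prime):
    """Scores[i][j] = sum over features m of R_m(x)[i] * R_m(x')[j]."""
    R = feature_relevance(W, x)
    R_prime = feature_relevance(W, x_prime)
    return [
        [sum(R[m][i] * R_prime[m][j] for m in range(len(W)))
         for j in range(len(x_prime))]
        for i in range(len(x))
    ]

# Hypothetical example values.
W = [[1.0, -2.0, 0.5],
     [0.0, 1.0, 3.0]]
x = [1.0, 2.0, -1.0]
x_prime = [0.5, 0.0, 2.0]

scores = pairwise_relevance(W, x, x_prime)
total = sum(sum(row) for row in scores)

# Conservation check: the pairwise scores add up to the similarity itself.
print(total, similarity(W, x, x_prime))
```

In this linear case the pairwise scores sum exactly to the similarity, so each entry `scores[i][j]` can be read as the share of the similarity attributed to feature i of the first input interacting with feature j of the second. For a deep model, the per-feature explanations would come from LRP passes through the network rather than from this closed form.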