Due to the nature of deep learning approaches, it is inherently difficult to
understand which aspects of a molecular graph drive the predictions of the
network. As a mitigation strategy, we constrain certain weights in a multi-task
graph convolutional neural network according to the Gini index to maximize the
"inequality" of the learned representations. We show that this constraint does
not degrade evaluation metrics for some targets, and allows us to combine the
outputs of the graph convolutional operation in a visually interpretable way.
We then perform a proof-of-concept experiment with quantum chemistry targets on
the public QM9 dataset, and a larger experiment with ADMET targets on proprietary
drug-like molecules. Because explainability is difficult to benchmark in the
latter case, we informally surveyed medicinal chemists within our organization
to check whether the regions of a molecule they identified as relevant to the
properties in question agreed with those identified by the model.
Ryan Henderson, Djork-Arné Clevert, Floriane Montanari
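
To make the notion of weight "inequality" concrete, the sketch below computes the standard Gini index of a weight vector. It is an illustration only: the abstract does not state which weights are constrained or how the constraint enters training, so the function name gini_index and the choice to take absolute values of the weights are assumptions, not the authors' implementation.

```python
import numpy as np


def gini_index(weights):
    """Gini index of the absolute values of a weight vector.

    Assumption: absolute values are used so that negative weights are
    handled; the abstract does not specify this detail.
    Returns 0.0 for perfectly equal weights and approaches 1.0 as the
    total magnitude concentrates in a single weight.
    """
    w = np.sort(np.abs(np.asarray(weights, dtype=float)).ravel())
    n = w.size
    total = w.sum()
    if n == 0 or total == 0.0:
        # Undefined for empty or all-zero input; report no inequality.
        return 0.0
    ranks = np.arange(1, n + 1)
    # Closed form for ascending-sorted values:
    # G = 2 * sum(i * w_i) / (n * sum(w)) - (n + 1) / n
    return float(2.0 * np.sum(ranks * w) / (n * total) - (n + 1) / n)


uniform = gini_index([1.0, 1.0, 1.0, 1.0])  # all weights equal
sparse = gini_index([0.0, 0.0, 0.0, 1.0])   # one weight carries everything
```

Under these assumptions, equal weights give an index of 0, while a vector with a single nonzero entry among n gives (n - 1) / n. Maximizing the index therefore pushes the representation toward a few dominant weights.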