In Bioinformatics (Oxford, England)
MOTIVATION: Identifying and discovering phenotypes from high content screening (HCS) images is a challenging task. Earlier work relies on image analysis pipelines to extract biological features, on supervised training, or on features generated by neural networks pretrained on non-cellular images. We introduce a novel unsupervised deep learning algorithm that clusters cellular images with similar mode of action (MOA) using only the images' pixel intensity values as input, and that corrects for batch effects during training. Importantly, our method does not require the extraction of cell candidates and works directly on entire images.
RESULTS: The method achieves competitive results on the labelled subset of the BBBC021 dataset, with an accuracy of 97.09% for correctly classifying the MOA by nearest-neighbor matching. Importantly, our approach can also be trained on unannotated datasets, so it can discover novel MOAs and annotate unlabelled compounds. The ability to train end-to-end on full-resolution images makes the method easy to apply and allows it to further distinguish treatments by their effect on proliferation.
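As an illustration of how MOA classification by nearest-neighbor matching can be scored, the following is a minimal sketch and not the authors' implementation. It assumes one embedding vector per treatment, cosine similarity as the matching metric, and the exclusion of neighbors from the same compound; none of these details are stated in the abstract, and the function name and toy data are hypothetical.

```python
import numpy as np


def nearest_neighbor_moa_accuracy(embeddings, moa_labels, compound_ids):
    """Score 1-nearest-neighbor MOA matching on treatment embeddings.

    Assumptions (not taken from the abstract): cosine similarity is the
    metric, and a treatment may not be matched to itself or to another
    treatment of the same compound. Every compound is assumed to have at
    least one candidate neighbor from a different compound.
    """
    x = np.asarray(embeddings, dtype=float)
    moa_labels = np.asarray(moa_labels)
    compound_ids = np.asarray(compound_ids)

    # L2-normalise rows so the dot product equals cosine similarity.
    x = x / np.linalg.norm(x, axis=1, keepdims=True)
    similarity = x @ x.T

    # Mask out same-compound pairs (this also masks the diagonal).
    same_compound = compound_ids[:, None] == compound_ids[None, :]
    similarity[same_compound] = -np.inf

    # Each treatment inherits the MOA label of its most similar neighbor.
    nearest = similarity.argmax(axis=1)
    predicted = moa_labels[nearest]
    accuracy = float((predicted == moa_labels).mean())
    return accuracy, predicted


# Toy data: five treatments, four compounds, two MOAs (all hypothetical).
toy_embeddings = [
    [1.00, 0.00],  # compound c1, MOA A
    [0.95, 0.05],  # compound c1 at another concentration, MOA A
    [0.80, 0.20],  # compound c2, MOA A
    [0.00, 1.00],  # compound c3, MOA B
    [0.10, 0.90],  # compound c4, MOA B
]
toy_moas = ["A", "A", "A", "B", "B"]
toy_compounds = ["c1", "c1", "c2", "c3", "c4"]

accuracy, predicted = nearest_neighbor_moa_accuracy(
    toy_embeddings, toy_moas, toy_compounds
)
print(accuracy, list(predicted))
```

Excluding same-compound neighbors keeps a treatment from being matched to a different concentration of itself, so the score reflects similarity between distinct compounds sharing an MOA.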
AVAILABILITY: Our code is available at https://github.com/Novartis/UMM-Discovery.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Janssens Rens, Zhang Xian, Kauffmann Audrey, de Weck Antoine, Durand Eric Y