
In Bioinformatics (Oxford, England)

MOTIVATION : Deep learning has recently attained excellent results in digital pathology. A challenge with its use is that high-quality, representative training data sets are required to build robust models. Data annotation in this domain is labor intensive and demands a substantial time commitment from expert pathologists. Active Learning (AL) is a strategy to minimize annotation effort. The goal is to select, from the pool of unlabeled data, samples whose annotation improves model accuracy. However, AL is computationally demanding. Its benefits for model learning may vary according to the strategy used, and it may be hard for a domain specialist to fine-tune the solution without an integrated interface.

RESULTS : We developed a framework that combines a user-friendly interface with run-time optimizations to reduce annotation and execution time for AL in digital pathology. Our solution implements several AL strategies along with our Diversity-Aware Data Acquisition (DADA) function, an acquisition function that enforces data diversity to improve the prediction performance of a model. In this work, we employed a model simplification strategy, Network Auto-Reduction (NAR), that significantly improves AL execution time when coupled with DADA. NAR produces less compute-demanding models, which replace the target models during the AL process to reduce processing demands. An evaluation with a Tumor-Infiltrating Lymphocytes (TILs) classification application shows that: (i) DADA attains superior performance compared to state-of-the-art AL strategies for different Convolutional Neural Networks (CNNs), (ii) NAR improves the AL execution time by up to 4.3×, and (iii) target models trained with patches selected by the NAR-reduced versions achieve similar or superior classification quality compared to using the target CNNs for data selection.
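
To make the idea of a diversity-enforcing acquisition step concrete, the following is a minimal sketch and not the authors' DADA implementation, whose details are not given in this abstract. It assumes one common generic scheme: score each unlabeled sample by predictive entropy, cluster the pool in feature space, and pick the most uncertain samples round-robin across clusters. The function names (`predictive_entropy`, `kmeans_labels`, `select_batch`) and all parameters are hypothetical.

```python
import numpy as np

def predictive_entropy(probs):
    """Entropy of each row of class probabilities; higher means more uncertain."""
    p = np.clip(probs, 1e-12, 1.0)
    return -(p * np.log(p)).sum(axis=1)

def kmeans_labels(features, n_clusters, n_iter=20, seed=0):
    """Tiny Lloyd's k-means (illustrative only); returns a cluster index per sample."""
    rng = np.random.default_rng(seed)
    centers = features[rng.choice(len(features), n_clusters, replace=False)]
    for _ in range(n_iter):
        # Squared Euclidean distance from every sample to every center.
        dists = ((features[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for c in range(n_clusters):
            members = features[labels == c]
            if len(members):
                centers[c] = members.mean(axis=0)
    return labels

def select_batch(probs, features, k, n_clusters=4, seed=0):
    """Pick k pool indices: most uncertain first, taken round-robin across clusters.

    Assumption: this stands in for a diversity-aware acquisition function in
    general; it is not the DADA algorithm from the paper.
    """
    scores = predictive_entropy(np.asarray(probs, dtype=float))
    labels = kmeans_labels(np.asarray(features, dtype=float), n_clusters, seed=seed)
    # One queue per cluster, members sorted by decreasing uncertainty.
    queues = []
    for c in range(n_clusters):
        idx = np.flatnonzero(labels == c)
        queues.append(list(idx[np.argsort(-scores[idx])]))
    selected = []
    while len(selected) < k and any(queues):
        for q in queues:
            if q and len(selected) < k:
                selected.append(int(q.pop(0)))
    return selected

# Usage: model outputs (probs) and embeddings (features) for an unlabeled pool.
rng = np.random.default_rng(1)
pool_probs = rng.dirichlet([1.0, 1.0], size=50)
pool_features = rng.normal(size=(50, 8))
to_annotate = select_batch(pool_probs, pool_features, k=10, n_clusters=4)
```

In a setup like the one described above, the `probs` and `features` inputs could come from a cheaper reduced model rather than the target CNN, which is where a strategy such as NAR would cut execution time; the selection logic itself would be unchanged.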

AVAILABILITY : Source code:

SUPPLEMENTARY INFORMATION : Supplementary data are available at Bioinformatics online.

Meirelles André L S, Kurc Tahsin, Kong Jun, Ferreira Renato, Saltz Joel, Teodoro George