In Journal of Pathology Informatics; h5-index 23.0
In digital pathology, deep learning has been shown to have a wide range of applications, from cancer grading to segmenting structures like glomeruli. One of the main hurdles for digital pathology to be truly effective is the size of the dataset needed for a classifier to generalize across the spectrum of possible morphologies; small datasets limit that ability. Yet larger datasets of whole slide images (WSIs) of tissue may cause network bottlenecks, as each WSI at its original magnification can be upwards of 100 000 by 100 000 pixels and over a gigabyte in file size. Compounding this problem, high-quality pathologist annotations are difficult to obtain: the volume of annotations needed to create a classifier that can generalize would be extremely costly in terms of pathologist-hours. In this work, we use Active Learning (AL), a process for iterative interactive training, to create a modified U-net classifier on the region of interest (ROI) scale. We then compare this to Random Learning (RL), where images added to the dataset for retraining are randomly selected. Our hypothesis is that AL yields better segmentation results than randomly selecting images to annotate. We show that after 3 iterations, AL, with an average Dice coefficient of 0.461, outperforms RL, with an average Dice coefficient of 0.375, by 0.086.
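The comparison above depends on two things: how each round's ROIs are chosen for annotation, and how segmentation quality is scored. The sketch below illustrates both, using only the Python standard library. The Dice coefficient is the standard overlap measure 2|A∩B| / (|A| + |B|). The abstract does not state which criterion AL uses to choose images, so the mean-entropy ranking here is an assumption made for illustration, not the authors' method. All function names (`dice_coefficient`, `mean_entropy`, `select_active`, `select_random`) are hypothetical.

```python
import math
import random


def dice_coefficient(pred, truth):
    """Dice = 2|A∩B| / (|A| + |B|) for two flat binary masks of equal length."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of pixels")
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    if total == 0:
        # Convention chosen for this sketch: two empty masks agree perfectly.
        return 1.0
    return 2.0 * intersection / total


def mean_entropy(probs):
    """Average binary entropy of per-pixel foreground probabilities."""
    eps = 1e-7
    total = 0.0
    for p in probs:
        p = min(max(p, eps), 1.0 - eps)  # avoid log(0)
        total -= p * math.log(p) + (1.0 - p) * math.log(1.0 - p)
    return total / len(probs)


def select_active(pool_probs, k):
    """ASSUMED AL criterion: pick the k ROIs the model is least certain about."""
    scores = [mean_entropy(p) for p in pool_probs]
    ranked = sorted(range(len(pool_probs)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]


def select_random(pool_size, k, seed=0):
    """RL baseline: pick k ROIs uniformly at random, without replacement."""
    return random.Random(seed).sample(range(pool_size), k)
```

For example, `dice_coefficient([1, 1, 0, 0], [1, 0, 0, 0])` gives 2·1 / (2 + 1) ≈ 0.667. In each retraining round, the indices returned by `select_active` or `select_random` would be sent for pathologist annotation and added to the training set, so the two strategies differ only in which ROIs are chosen.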
Folmsbee Jonathan, Zhang Lei, Lu Xulei, Rahman Jawaria, Gentry John, Conn Brendan, Vered Marilena, Roy Paromita, Gupta Ruta, Lin Diana, Samankan Shabnam, Dhorajiva Pooja, Peter Anu, Wang Minhua, Israel Anna, Brandwein-Weber Margaret, Doyle Scott
2022
Active learning, Computational pathology, Digital pathology, Oral cavity cancer, Region of interest, Semantic segmentation, U-net, Whole slide imaging