In IEEE Transactions on Medical Imaging; h5-index 74.0
The acquisition of large-scale medical image data, necessary for training machine learning algorithms, is hampered by the associated expert-driven annotation costs. Mining hospital archives can address this problem, but the labels are often incomplete or noisy, e.g., 50% of the lesions in DeepLesion are left unlabeled. Thus, effective label harvesting methods are critical. This is the goal of our work, where we introduce Lesion-Harvester, a powerful system to harvest missing annotations from lesion datasets at high precision. Accepting the need for some degree of expert labor, we use a small fully-labeled image subset to intelligently mine annotations from the remainder. To do this, we chain together a highly sensitive lesion proposal generator (LPG) and a very selective lesion proposal classifier (LPC). The resulting harvested and hard-negative proposals are then used, together with a new hard negative suppression loss, to iteratively finetune our LPG. While our framework is generic, we optimize its performance by proposing a new 3D contextual LPG and by using a global-local multi-view LPC. Experiments on DeepLesion demonstrate that Lesion-Harvester can discover an additional 9,805 lesions at a precision of 90%. We publicly release the harvested lesions, along with a new test set of completely annotated DeepLesion volumes. We also present a pseudo 3D IoU evaluation metric that corresponds much better to the real 3D IoU than current DeepLesion evaluation metrics. To quantify the downstream benefits of Lesion-Harvester, we show that augmenting the DeepLesion annotations with our harvested lesions allows state-of-the-art detectors to boost their average precision by 7 to 10%.
Cai Jinzheng, Harrison Adam P, Zheng Youjing, Yan Ke, Huo Yuankai, Xiao Jing, Yang Lin, Lu Le
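The chaining of a sensitive LPG with a selective LPC described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Proposal` record, the `split_proposals` helper, the 2D box format, and all three thresholds are hypothetical, and the rule used here for hard negatives (confident LPG detections that the LPC rejects) is an assumption, since the abstract does not define it.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Proposal:
    """Hypothetical record for one candidate lesion."""
    box: Tuple[int, int, int, int]  # assumed 2D box (x1, y1, x2, y2)
    lpg_score: float                # confidence from the proposal generator
    lpc_score: float                # lesion probability from the classifier


def split_proposals(
    proposals: List[Proposal],
    accept: float = 0.9,   # assumed LPC threshold for harvesting
    reject: float = 0.1,   # assumed LPC threshold for rejection
    lpg_min: float = 0.5,  # assumed LPG confidence for a "hard" negative
) -> Tuple[List[Proposal], List[Proposal]]:
    """Split LPG proposals into harvested lesions and hard negatives."""
    # Harvest only what the selective classifier accepts with high confidence.
    harvested = [p for p in proposals if p.lpc_score >= accept]
    # Assumed definition: the generator is confident, the classifier rejects.
    hard_negatives = [
        p for p in proposals
        if p.lpc_score <= reject and p.lpg_score >= lpg_min
    ]
    return harvested, hard_negatives


props = [
    Proposal((0, 0, 10, 10), lpg_score=0.8, lpc_score=0.95),
    Proposal((5, 5, 20, 20), lpg_score=0.7, lpc_score=0.05),
    Proposal((1, 1, 4, 4), lpg_score=0.2, lpc_score=0.50),
]
harvested, hard_negatives = split_proposals(props)
```

In an iterative setup of the kind the abstract outlines, both lists would be fed back as extra supervision when finetuning the LPG. Proposals that fall between the two classifier thresholds are left unused in this sketch.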