ArXiv Preprint
While Multiple Instance Learning (MIL) has shown promising results in digital pathology Whole Slide Image (WSI) classification, the paradigm still faces performance and generalization problems stemming from the high computational cost of processing gigapixel WSIs and the limited sample sizes available for model training. To cope with the computational cost, most MIL methods first extract patch representations with a frozen model pretrained on ImageNet. This process may lose essential information owing to the large domain gap between natural and pathology images, and it hinders the generalization of the model because image-level training-time augmentation is no longer possible. Although Self-supervised Learning (SSL) offers viable representation learning schemes, how to convert the task-agnostic features of SSL into task-specific ones under weak, WSI-level label supervision, and thereby improve the downstream task, remains to be explored.
To alleviate this dilemma between computational cost and performance, we propose an efficient WSI fine-tuning framework motivated by the Information Bottleneck theory. The theory enables the framework to find the minimal sufficient statistics of a WSI, allowing us to fine-tune the backbone into a task-specific representation that depends only on WSI-level weak labels. We further analyze the WSI-MIL problem to theoretically derive our fine-tuning method. Our framework is evaluated on five pathology WSI datasets with various
WSI heads. The experimental results of our fine-tuned representations show
significant improvements in both accuracy and generalization compared with
previous works. Source code will be available at
https://github.com/invoker-LL/WSI-finetuning.
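For reference, the classical Information Bottleneck objective (Tishby et al.) that the abstract invokes can be written as below, where X is the input, Z the learned representation, Y the WSI-level label, and β a generic trade-off coefficient. This is the textbook form of the objective, not notation taken from the paper itself.

```latex
% Information Bottleneck Lagrangian: compress the input X into a
% representation Z while retaining the information Z carries about the
% label Y; the optimum approximates a minimal sufficient statistic.
\[
\min_{p(z \mid x)} \; I(X; Z) \;-\; \beta \, I(Z; Y)
\]
```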
Honglin Li, Chenglu Zhu, Yunlong Zhang, Yuxuan Sun, Zhongyi Shui, Wenwei Kuang, Sunyi Zheng, Lin Yang
2023-03-15