ArXiv Preprint
Recognizing the types of white blood cells (WBCs) in microscopic images of
human blood smears is a fundamental task in the fields of pathology and
hematology. Although previous studies have made significant contributions to
the development of methods and datasets, few papers have investigated
benchmarks or baselines that others can easily refer to. For instance, we
observed notable variations in the reported accuracies of the same
Convolutional Neural Network (CNN) model across different studies, yet no
public implementation exists to reproduce these results. In this paper, we
establish a benchmark for WBC recognition. Our results indicate that CNN-based
models achieve high accuracy when trained and tested under similar imaging
conditions. However, their performance drops significantly when tested under
different conditions. Moreover, the ResNet classifier, which has been widely
employed in previous work, exhibits unreasonably poor generalization under
domain shift, which we attribute to batch normalization. We investigate this issue and
suggest some alternative normalization techniques that can mitigate it. We make
fully reproducible code publicly available at
https://github.com/apple2373/wbc-benchmark.
Satoshi Tsutsui, Zhengyang Su, Bihan Wen
2023-03-03