ArXiv Preprint
The variation in histologic staining between different medical centers is one
of the most profound challenges in the field of computer-aided diagnosis. The
appearance disparity of pathological whole slide images causes algorithms to
become less reliable, which in turn impedes the widespread applicability of
downstream tasks like cancer diagnosis. Furthermore, differences in staining
introduce biases during training which, in the case of domain shifts, degrade
test performance. Therefore, in this paper we propose MultiStain-CycleGAN, a
multi-domain approach to stain normalization based on CycleGAN. Our
modifications to CycleGAN allow us to normalize images of different origins
without retraining or using different models. We perform an extensive
evaluation of our method using various metrics and compare it to commonly used
methods that are multi-domain capable. First, we evaluate how well our method
fools a domain classifier that tries to assign a medical center to an image.
Then, we test the effect of our normalization on the tumor classification
performance of a downstream classifier. Furthermore, we evaluate the image
quality of the normalized images using the structural similarity index and the
ability to reduce the domain shift using the Fréchet inception distance. We show that
our method proves to be multi-domain capable, provides the highest image
quality among the compared methods, and can most reliably fool the domain
classifier while keeping the tumor classifier performance high. By reducing the
domain influence, biases in the data can be removed on the one hand and the
origin of the whole slide image can be disguised on the other, thus enhancing
patient data privacy.
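
To make the image-quality metric mentioned above concrete, the following is a minimal sketch of the structural similarity index computed over a single global window. It is an illustration only and not the authors' evaluation code: the function name `global_ssim`, the flat-list image representation, and the default constants (`k1=0.01`, `k2=0.03`, 8-bit data range) are assumptions. Practical evaluations usually average the index over local sliding windows, which this sketch does not do.

```python
def global_ssim(x, y, data_range=255.0, k1=0.01, k2=0.03):
    """Single-window SSIM between two grayscale images.

    x, y: equally sized flat sequences of pixel intensities
          (hypothetical representation chosen for this sketch).
    data_range: span of possible intensities (255 for 8-bit images).
    k1, k2: small stabilising constants, assumed at their common defaults.
    """
    n = len(x)
    if n == 0 or n != len(y):
        raise ValueError("images must be non-empty and of equal size")

    # Constants that keep the ratio stable when means or variances are near zero.
    c1 = (k1 * data_range) ** 2
    c2 = (k2 * data_range) ** 2

    # Global means of both images.
    mu_x = sum(x) / n
    mu_y = sum(y) / n

    # Global (population) variances and covariance.
    var_x = sum((a - mu_x) ** 2 for a in x) / n
    var_y = sum((b - mu_y) ** 2 for b in y) / n
    cov_xy = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n

    # Product of the luminance term and the contrast/structure term.
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


if __name__ == "__main__":
    original = [10, 50, 200, 120]
    # Hypothetical "normalized" image: same structure, uniformly brighter.
    shifted = [v + 40 for v in original]
    print(global_ssim(original, original))
    print(global_ssim(original, shifted))
```

In a stain-normalization setting, a value close to 1 between an input tile and its normalized counterpart would suggest that tissue structure was preserved while the color appearance changed.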
Martin J. Hetz, Tabea-Clara Bucher, Titus J. Brinker
2023-01-23