NeurIPS 2022 Workshop: Self-Supervised Learning - Theory and
Practice
This paper presents a novel positive and negative set selection strategy for
contrastive learning of medical images based on labels that can be extracted
from clinical data. In the medical field, there exists a variety of labels for
data that serve different purposes at different stages of a diagnostic and
treatment process. Clinical labels and biomarker labels are two examples. In
general, clinical labels are easier to obtain in larger quantities because they
are regularly collected during routine clinical care, while biomarker labels
require expert analysis and interpretation to obtain. Within the field of
ophthalmology, previous work has shown that clinical values exhibit
correlations with biomarker structures that manifest within optical coherence
tomography (OCT) scans. We exploit this relationship between clinical and
biomarker data to improve performance for biomarker classification. We do so
by using the more abundant clinical data as pseudo-labels for the data that
lack biomarker labels, and we use these pseudo-labels to choose positive and
negative instances for training a backbone network with a supervised
contrastive loss.
In this way, a backbone network learns a representation space that aligns with
the distribution of the available clinical data. We then fine-tune this network
on the smaller set of biomarker-labeled data with a cross-entropy loss in order
to classify these key indicators of disease
directly from OCT scans. Our method is shown to outperform state-of-the-art
self-supervised methods by as much as 5% in terms of accuracy on individual
biomarker detection.
Kiran Kokilepersaud, Mohit Prabhushankar, Ghassan AlRegib
2022-11-09
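
The positive and negative set selection described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes the standard form of the supervised contrastive loss, with a clinical value standing in for the class label: two scans are treated as a positive pair when they share the same clinical label, and every other scan in the batch acts as a negative. The function name `clinical_supcon_loss`, the temperature of 0.07, and the use of discrete integer clinical labels are all assumptions made for illustration (continuous clinical values would first need to be binned).

```python
import numpy as np


def clinical_supcon_loss(embeddings, clinical_labels, temperature=0.07):
    """Supervised contrastive loss with clinical labels as pseudo-labels.

    Hypothetical helper: positives for an anchor are the other samples in
    the batch that share its clinical label; all remaining samples are
    negatives. The temperature value is an assumed default.
    """
    z = np.asarray(embeddings, dtype=np.float64)
    labels = np.asarray(clinical_labels)
    n = z.shape[0]

    # L2-normalize so that dot products are cosine similarities.
    z = z / np.linalg.norm(z, axis=1, keepdims=True)
    sim = (z @ z.T) / temperature

    # An anchor is never compared with itself.
    not_self = ~np.eye(n, dtype=bool)
    # Positive set: same clinical label, excluding the anchor itself.
    positives = (labels[:, None] == labels[None, :]) & not_self

    pos_count = positives.sum(axis=1)
    has_pos = pos_count > 0
    if not has_pos.any():
        raise ValueError("no anchor in the batch has a positive pair")

    # Log-softmax over all non-self samples (row shift for stability).
    sim = sim - sim.max(axis=1, keepdims=True)
    exp_sim = np.exp(sim) * not_self
    log_prob = sim - np.log(exp_sim.sum(axis=1, keepdims=True))

    # Average log-probability of the positives, per anchor that has any.
    mean_log_prob_pos = (log_prob * positives).sum(axis=1)[has_pos] / pos_count[has_pos]
    return float(-mean_log_prob_pos.mean())


# Toy batch: four 2-D embeddings forming two tight groups.
toy_embeddings = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
# Clinical labels that agree with the groups give a lower loss than
# clinical labels that cut across them.
loss_aligned = clinical_supcon_loss(toy_embeddings, [0, 0, 1, 1])
loss_misaligned = clinical_supcon_loss(toy_embeddings, [0, 1, 0, 1])
print(loss_aligned < loss_misaligned)
```

In the two-stage procedure the abstract describes, a loss of this kind would be used to train the backbone on the clinically labeled scans, after which the backbone would be fine-tuned with a cross-entropy loss on the smaller biomarker-labeled set.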